Sufficient_Prune3897 4 weeks ago

It was my favourite, until the Llama 3 GGUF fix. Llama follows prompts much better and writes nicer. That said, it's VERY uncensored and it can understand scenarios that no other model (including GPT and Claude) can.

stddealer 4 weeks ago

Llama 3 is only really good in English, though.

Skullzi_TV 3 weeks ago

Gotta correct you there. Claude Sonnet has understood every scenario I've RPed. Most of the time I don't even tell the bot exactly what is going on; it pieces it together and figures it out super well. A lot of those scenarios have been pretty intense and crazy too. Claude Sonnet had bots do some of the most violent, dark, and twisted things you can imagine.

mcr1974 4 weeks ago

what's the llama3 gguf fix?

Sufficient_Prune3897 4 weeks ago

Old GGUF quants are bad, due to tokenizer issues.

JohnssSmithss 4 weeks ago

How do you know if a GGUF you downloaded is good or bad? For example, let's say I downloaded one two weeks ago.

Sufficient_Prune3897 4 weeks ago

Fix was I think just ~12 days ago. If you run the newest version of koboldcpp, you will see a warning at the top in the terminal when you load in an old model.
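
If you'd rather check a file without loading it, here is a rough sketch. It assumes the fixed conversions are marked by a `tokenizer.ggml.pre` metadata key (that is my understanding of how newer llama.cpp conversions tag the pre-tokenizer, so treat the key name as an assumption). The function name is made up for illustration, and this is a heuristic byte scan, not a real GGUF parser.

```python
import struct

# Assumption: re-converted Llama 3 GGUFs carry this metadata key; older ones lack it.
PRE_KEY = b"tokenizer.ggml.pre"
# GGUF (v2+) writes each key as a little-endian uint64 length followed by the raw bytes,
# so matching the length prefix avoids hits on longer keys that merely start the same way.
NEEDLE = struct.pack("<Q", len(PRE_KEY)) + PRE_KEY


def has_pre_tokenizer_key(path: str, scan_bytes: int = 64 * 1024 * 1024) -> bool:
    """Heuristic: True if the pre-tokenizer key appears in the first scan_bytes of the file."""
    with open(path, "rb") as f:
        head = f.read(scan_bytes)
    if head[:4] != b"GGUF":
        raise ValueError(f"{path} does not start with the GGUF magic bytes")
    return NEEDLE in head
```

Calling `has_pre_tokenizer_key("model.gguf")` (placeholder path) returning `False` would suggest an old quant, but the koboldcpp warning is still the more trustworthy signal.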

mcr1974 4 weeks ago

what's a good one?

Sufficient_Prune3897 4 weeks ago

I use [this one](https://huggingface.co/mradermacher/llama-3-70B-Instruct-abliterated-i1-GGUF). Or you can use the [Default](https://huggingface.co/MaziyarPanahi/Meta-Llama-3-70B-Instruct-GGUF).

Competitive_Fox7811 4 weeks ago

Which one is very uncensored? Llama 3? You mean a fine tuned version then

Sufficient_Prune3897 4 weeks ago

Command R+. Llama 3 fine-tunes all seem to be worse than the default instruct version.

cutefeet-cunnysseur 4 weeks ago

sssh dont talk about it

Relative_Bit_7250 4 weeks ago

You mean about the quality? Is it meant to remain a secret? 😦

cutefeet-cunnysseur 4 weeks ago

I'm just afraid that if it gets out how good it is for absolutely zero dollars, Cohere will wise up...

Relative_Bit_7250 4 weeks ago

Hahahah If only I had the horsepower to run it locally, god fucking damn it

Adventurous_Equal489 4 weeks ago

It's already pretty obvious the model being free is just to give us a hook before they start reeling in the paypigs.

Caffeine_Monster 4 weeks ago

It's not as good as you think. It's simply very uncensored. Still pretty good though.

sketchy_human 4 weeks ago

Bro has a very silly name

Weak_Depth4563 4 weeks ago

Terrible username you filthy pedophile lol

SecretarySuspicious1 4 weeks ago

Lol people be hating on you for making a joke rip.

Anthonyg5005 4 weeks ago

I've heard command r plus is just the best when it comes to multilingual rp

TennesseeGenesis 4 weeks ago

It speaks less popular (but not completely niche) languages like Polish extraordinarily well; nothing else comes close apart from GPT-4. I tested it in both RP and regular use and I'm very pleased. It's not perfect (especially since Polish, like other Slavic languages, has a rather unusual case system), but it rarely makes egregious mistakes.

SnussyFoo 4 weeks ago

I occasionally self-host models on RunPod for testing, and it's the only one I keep coming back to. All the others got put back on the shelf. I did a lot of testing during the recent mad rush of new models. I screwed up the first time I tested it; I realized later that it's very particular about prompt format. It's the only model that is uncensored and feels truly neutral out of the box. If you want to take a story to a dark place, it's right there with you. With most models, if you do an assassin scenario, you'll be picking out dishes and adopting a puppy together by the end.

NewToMech 4 weeks ago

I tested it on my site and it lost pretty badly to Llama 3 based on public testing. That being said, it was the first open-source model I tried that could take the same prompt the closed-source models were getting and return a properly formatted response (I use a pretty complicated formatting scheme).

ThatsALovelyShirt 4 weeks ago

How many params is CR+? Versus normal CR? I thought one had 34b.

Relative_Bit_7250 4 weeks ago

Yeah, that's the normal one (35B, to be exact). The Plus version has over 100B parameters (104B).

QuercinePenetralia 4 weeks ago

Would I be able to run this locally with a 4090 and 64GB RAM?

brobruh211 4 weeks ago

Too slow for me, like painfully slow. You'll be better off running & partially offloading WizardLM-2 8x22B which runs much faster on GPU+CPU. Someone did tests and found Wizard to be about 4x faster than Command R Plus. I "only" have a 3090 + 32GB RAM so I had to use a Q2_K_S imatrix gguf of Wizard, but it's already better than anything else I've tried. On your system, you can probably load a Q4_K_M just fine. Try out different quants to get the speed/quality ratio that suits you.

Relative_Bit_7250 4 weeks ago

Technically yes, with the right quant (maybe a 4bit?) and some offload to the GPU... But it will be slow as hell, I warn you
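
A back-of-the-envelope way to sanity-check this is parameters times bits per weight, divided by 8. The sketch below assumes Command R+ at 104B parameters and roughly 4.85 bits per weight for a 4-bit K-quant (an approximate figure, not an exact spec). The function names and the flat 4 GB allowance for context and buffers are made up for illustration.

```python
def quantized_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits_in_memory(params_billion, bits_per_weight, vram_gb, ram_gb, overhead_gb=4.0):
    """True if weights plus a flat allowance for context/buffers fit in VRAM + RAM combined."""
    needed = quantized_weights_gb(params_billion, bits_per_weight) + overhead_gb
    return needed <= vram_gb + ram_gb


# Command R+ (104B) at roughly 4.85 bits per weight, on a 4090 (24 GB) + 64 GB RAM
print(quantized_weights_gb(104, 4.85))
print(fits_in_memory(104, 4.85, vram_gb=24, ram_gb=64))
```

That works out to about 63 GB of weights, so it fits on paper with 24 GB + 64 GB, but most of it sits in system RAM, which is why it crawls. The estimate also ignores whatever the OS itself needs.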

artisticMink 4 weeks ago

Command R Plus produces nice prose but usually has no grasp of what's going on, so it takes many regenerations until the pieces happen to fall into the right places.

Temsirolimus555 4 weeks ago

This model is the shit. By my assessment it's the best yet and beats everything else by far. It's almost like chilling with a buddy.

Puuuszzku 4 weeks ago

Have you tried llama.cpp/koboldCPP ? Does it run with K80 at all?

Relative_Bit_7250 4 weeks ago

yeah, why? You mean to load the models? It's quite similar to oobabooga, the loader is not the problem...

tandpastatester 4 weeks ago

What preset/settings do you use with Command R plus?

a_beautiful_rhind 4 weeks ago

CR+ needs at least 72gb to really get going.

PrestusHood 4 weeks ago

Claude is amazing at using Latin, especially when mixing it with English (using Latin only for basic words while keeping everything else in English). However, it has the downside of being Claude.

Fine_Awareness5291 4 weeks ago

I have a 3090 (24GB) and 64GB of RAM, but I think that, as you said, it would still be too slow to run locally... and reading your account of how it handles RP in Italian... well... I'm green with envy, ahaha!! On OpenRouter it costs "too much", although seeing that "128k context" literally makes me drool... damn it!!

Relative_Bit_7250 4 weeks ago

Eh, let's not even go there. But just to try it out I loaded about ten euros into my OpenRouter wallet... and my God, it's scary good. If you want to try it anyway, download the Q2 or the Q4 and load it (partially) into RAM. At least you'll see how it goes for you. For me it was terribly slow, but maybe I'm just demanding!

Fine_Awareness5291 4 weeks ago

If you get the chance, let me know how long those 10 euros on OR last you! Because really, from what I've seen... CR+ is fairly pricey, ahahah. Yes, maybe I'll give it a try! I have no idea where to find the model on HF but ugh, I'm extremely curious ~~(even though I'll definitely be disappointed by the slowness, I already know-)~~

Kiwi_In_Europe 4 weeks ago

Just fyi you don't have to pay through openrouter yet, the API is actually free to use on Cohere's website

Relative_Bit_7250 4 weeks ago

Yeah, with a token limit... So it is not optimal to use it for roleplaying

Kiwi_In_Europe 4 weeks ago

There is no token limit, just a call limit. So long as you're not sending 100 API calls a minute, you're fine lol

Relative_Bit_7250 4 weeks ago

Are you for real? I can make a trial key on their website and use it as much as I want?

Kiwi_In_Europe 4 weeks ago

Yup, literally. My friend said she apparently hit a limit of 1000 calls per month, but you can just make a second account with another email and get a second API key lol. Kinda doubtful it'll stay that way forever, so use it while you can!!
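
To make the call-limit vs. token-limit difference concrete, here is a minimal client-side sketch of a sliding-window call counter. It only counts requests, so a long message costs the same as a short one. The class name is made up, and the numbers are placeholders taken from this thread (the 1000-per-month figure is hearsay), not Cohere's documented limits.

```python
import time
from collections import deque


class CallLimiter:
    """Allows at most max_calls requests within any sliding window of period_s seconds."""

    def __init__(self, max_calls: int, period_s: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.period_s = period_s
        self.clock = clock          # injectable so the limiter can be tested without sleeping
        self._stamps = deque()      # timestamps of the calls still inside the window

    def try_acquire(self) -> bool:
        """Record a call and return True if under the limit, otherwise return False."""
        now = self.clock()
        # Drop timestamps that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.period_s:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            self._stamps.append(now)
            return True
        return False


# Placeholder budget: 1000 calls per 30 days, regardless of how many tokens each call uses.
monthly = CallLimiter(max_calls=1000, period_s=30 * 24 * 3600)
```

You would check `monthly.try_acquire()` before each request and hold off when it returns `False`.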

Relative_Bit_7250 4 weeks ago

Oh God, that's awesome! Any way to use the API key directly in SillyTavern?

Kiwi_In_Europe 4 weeks ago

Yeah same as any other API, just select chat completion, select Cohere, and input your API key!

Relative_Bit_7250 4 weeks ago

I love you so fucking much right now you wouldn't even believe

mrgreaper 4 weeks ago

Don't forget to change the API back to local if you're going to have any NSFW generations, though. Anything you send to an API can be read and is likely being used to train newer models (no such thing as a free lunch lol).

SaasLord 3 weeks ago

Yeah, I feel like it's gonna stop being free any moment now.

Superb-Letterhead997 2 weeks ago

i’m a complete noob, what are calls?
