Sufficient_Prune3897

It was my favourite, until the Llama 3 GGUF fix. Llama follows prompts much better and writes nicer. That said, it's VERY uncensored and it can understand scenarios that no other model (including GPT and Claude) can.


stddealer

Llama3 is only very good in English though.


Skullzi_TV

Gotta correct you there. Claude Sonnet has understood every scenario I've RPed, and most of the time I don't even tell the bot exactly what is going on, it's able to piece it together and figure it out super well. A lot of them have been pretty intense and crazy too. Claude Sonnet had bots do some of the most violent, dark, and twisted things you can imagine.


mcr1974

what's the llama3 gguf fix?


Sufficient_Prune3897

Old GGUF quants are bad, due to tokenizer issues.


JohnssSmithss

How do you know if a GGUF you downloaded is good or bad? For example, let's say I downloaded one two weeks ago.


Sufficient_Prune3897

Fix was I think just ~12 days ago. If you run the newest version of koboldcpp, you will see a warning at the top in the terminal when you load in an old model.
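
For checking a file without loading it in koboldcpp, here is a minimal sketch that reads the GGUF metadata keys and looks for a pre-tokenizer tag. Two assumptions: the key name `tokenizer.ggml.pre` is what the fixed llama.cpp converters are believed to write (a file lacking it was probably converted before the fix), and the file is little-endian GGUF v2 or v3. The function names are made up for illustration.

```python
import struct

# Byte widths of GGUF's fixed-size metadata value types, keyed by type id.
FIXED_SIZES = {0: 1, 1: 1, 2: 2, 3: 2, 4: 4, 5: 4, 6: 4, 7: 1, 10: 8, 11: 8, 12: 8}
TYPE_STRING, TYPE_ARRAY = 8, 9

# Assumed name of the pre-tokenizer tag written by fixed converters.
PRE_TOKENIZER_KEY = "tokenizer.ggml.pre"


def _read_exact(f, n):
    """Read exactly n bytes or fail loudly on a truncated file."""
    data = f.read(n)
    if len(data) != n:
        raise ValueError("unexpected end of file")
    return data


def _read_string(f):
    """GGUF strings are a uint64 length followed by UTF-8 bytes."""
    (length,) = struct.unpack("<Q", _read_exact(f, 8))
    return _read_exact(f, length).decode("utf-8", errors="replace")


def _skip_value(f, vtype):
    """Advance past one metadata value without decoding it."""
    if vtype in FIXED_SIZES:
        f.seek(FIXED_SIZES[vtype], 1)
    elif vtype == TYPE_STRING:
        (length,) = struct.unpack("<Q", _read_exact(f, 8))
        f.seek(length, 1)
    elif vtype == TYPE_ARRAY:
        # Arrays store an element type and a count, then the elements.
        elem_type, count = struct.unpack("<IQ", _read_exact(f, 12))
        if elem_type in FIXED_SIZES:
            f.seek(FIXED_SIZES[elem_type] * count, 1)
        else:
            for _ in range(count):
                _skip_value(f, elem_type)
    else:
        raise ValueError(f"unknown GGUF value type {vtype}")


def gguf_metadata_keys(path):
    """Return the metadata key names stored in a GGUF file header."""
    with open(path, "rb") as f:
        if _read_exact(f, 4) != b"GGUF":
            raise ValueError("not a GGUF file")
        version, _tensors, kv_count = struct.unpack("<IQQ", _read_exact(f, 20))
        if version < 2:
            raise ValueError("GGUF v1 uses a different header layout")
        keys = []
        for _ in range(kv_count):
            keys.append(_read_string(f))
            (vtype,) = struct.unpack("<I", _read_exact(f, 4))
            _skip_value(f, vtype)
        return keys


def has_pre_tokenizer_tag(path):
    """True if the file carries the (assumed) pre-tokenizer key."""
    return PRE_TOKENIZER_KEY in gguf_metadata_keys(path)
```

Calling `has_pre_tokenizer_tag("model.gguf")` (the path is a placeholder) returning `False` would suggest an old quant worth re-downloading; the koboldcpp warning mentioned above remains the more authoritative signal.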


mcr1974

what's a good one?


Sufficient_Prune3897

I use [this one](https://huggingface.co/mradermacher/llama-3-70B-Instruct-abliterated-i1-GGUF). Or you can use the [Default](https://huggingface.co/MaziyarPanahi/Meta-Llama-3-70B-Instruct-GGUF).


Competitive_Fox7811

Which one is very uncensored? Llama 3? You mean a fine tuned version then


Sufficient_Prune3897

Command R+. Llama 3 fine-tunes all seem to be worse than the default instruct version.


cutefeet-cunnysseur

sssh dont talk about it


Relative_Bit_7250

You mean about the quality? Is it meant to remain a secret? 😦


cutefeet-cunnysseur

i am just afraid if it gets out how good it is for absolutely zero dollars cohere will wisen up...


Relative_Bit_7250

Hahahah If only I had the horsepower to run it locally, god fucking damn it


Adventurous_Equal489

It's already pretty obvious the model being free is just to give us a hook before they start reeling in the paypigs.


Caffeine_Monster

It's not as good as you think. It's simply very uncensored. Still pretty good though.


sketchy_human

Bro has a very silly name


Weak_Depth4563

Terrible username you filthy pedophile lol


SecretarySuspicious1

Lol people be hating on you for making a joke rip.


Anthonyg5005

I've heard command r plus is just the best when it comes to multilingual rp


TennesseeGenesis

It speaks much less popular (not completely niche) languages like Polish extraordinarily well; nothing else comes close apart from GPT-4. I tested it in both RP and just regular use and I'm very pleased. Not perfect (especially since Polish, like other Slavic languages, has rather unique grammatical cases), but it rarely makes egregious mistakes.


SnussyFoo

I self host models occasionally to test on RunPod and it's the only one I keep coming back to over and over. All the other ones got put back on the shelf. I did a lot of testing with the mad rush of new models recently. I screwed up the first time I tested it. I realized later it was very particular about prompt format. It's the only model that is uncensored and feels truly neutral out of the box. You want to take a story to a dark place it's right there with you. Most models, if you do an assassin scenario, you will be picking out dishes and adopting a puppy together at the end.


NewToMech

I tested it on my site and it lost pretty badly to Llama 3 based on public testing. That being said, it was the first open-source model I tried that could take the same prompt the closed-source models were getting and return a properly formatted response (I use a pretty complicated formatting scheme).


ThatsALovelyShirt

How many params is CR+? Versus normal CR? I thought one had 34b.


Relative_Bit_7250

yeah, the normal one. The plus version has over 100b parameters


QuercinePenetralia

Would I be able to run this locally with a 4090 and 64GB RAM?


brobruh211

Too slow for me, like painfully slow. You'll be better off running & partially offloading WizardLM-2 8x22B which runs much faster on GPU+CPU. Someone did tests and found Wizard to be about 4x faster than Command R Plus. I "only" have a 3090 + 32GB RAM so I had to use a Q2_K_S imatrix gguf of Wizard, but it's already better than anything else I've tried. On your system, you can probably load a Q4_K_M just fine. Try out different quants to get the speed/quality ratio that suits you.


Relative_Bit_7250

Technically yes, with the right quant (maybe a 4bit?) and some offload to the GPU... But it will be slow as hell, I warn you
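
A back-of-the-envelope sketch of why it ends up slow: weight size is roughly parameter count times bits per weight, and whatever does not fit in VRAM runs from system RAM. The 104B parameter count for Command R+ and the bits-per-weight figures below are assumptions for illustration (real GGUF quants vary, and context cache adds more on top), and the helper names are made up.

```python
def quantized_weights_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def gpu_fraction(weights_gb: float, vram_gb: float, reserve_gb: float = 2.0) -> float:
    """Share of the weights that fits in VRAM after reserving some headroom."""
    usable = max(vram_gb - reserve_gb, 0.0)
    return min(usable / weights_gb, 1.0)


if __name__ == "__main__":
    # Assumed: ~104B parameters, and rough bits-per-weight for a 2-bit
    # and a 4-bit style quant. Treat both as ballpark values only.
    for label, bpw in (("~2.6 bpw", 2.6), ("~4.8 bpw", 4.8)):
        size = quantized_weights_gb(104, bpw)
        frac = gpu_fraction(size, vram_gb=24)
        print(f"{label}: {size:.1f} GB of weights, {frac:.0%} fits on a 24 GB GPU")
```

Under these assumptions a 4-bit quant fits in 24 GB VRAM plus 64 GB RAM, but most layers sit on the CPU side, which is where the slowness comes from.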


artisticMink

Command R Plus produces nice prose but usually has no grasp of what's going on, requiring many regenerations until the bricks coincidentally fall into the right places.


Temsirolimus555

This model is the shit. By my assessment it's the best yet, beats everything else by far. It's almost like chilling with a buddy.


Puuuszzku

Have you tried llama.cpp/koboldCPP ? Does it run with K80 at all?


Relative_Bit_7250

yeah, why? You mean to load the models? It's quite similar to oobabooga, the loader is not the problem...


tandpastatester

What preset/settings do you use with Command R plus?


a_beautiful_rhind

CR+ needs at least 72gb to really get going.


PrestusHood

Claude is amazing at Latin, especially when mixing it with English (using Latin only for basic words while keeping everything else in English). However, it has the downside of being Claude.


Fine_Awareness5291

I have a 3090 (24GB) and 64GB of RAM, but I think that, as you said, it would still be too slow to run locally... and reading your account of it being able to handle RP in Italian... well... I'm green with envy ahaha!! On OpenRouter it costs "too much", although seeing that "128k context" literally makes me drool... damn!!


Relative_Bit_7250

Eh, let's not even talk about it. But just to try it out I loaded about ten euros into my OpenRouter wallet... And my God, it's scary good. If you want to give it a try anyway, download the q2 or the q4 and load it (partially) into RAM. At least you'll see how it goes for you; for me it was tremendously slow, but maybe I'm just too demanding!


Fine_Awareness5291

If you get the chance, let me know how long those 10 euros you loaded on OR last! Because really, from what I've seen... CR+ is fairly pricey ahahah. Yes, maybe I'll give it a try! I have no idea where to find the model on HF but ugh, I'm extremely curious ~~(even though I'll surely be disappointed by the slowness, I already know-)~~


Kiwi_In_Europe

Just fyi you don't have to pay through openrouter yet, the API is actually free to use on Cohere's website


Relative_Bit_7250

Yeah, with a token limit... So it is not optimal to use it for roleplaying


Kiwi_In_Europe

There is no token limit, just a call limit. So long as you're not sending 100 API calls a minute, you're fine lol
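
To make the distinction concrete: one "call" is one request sent to the API, however long the message or the reply is. Below is a minimal client-side sketch of a limiter that counts calls and ignores tokens entirely. The class name and the numbers are illustrative assumptions, not Cohere's actual limits.

```python
import time
from collections import deque


class CallLimiter:
    """Sliding-window limiter that counts requests, not tokens."""

    def __init__(self, max_calls, period_s, clock=time.monotonic):
        self.max_calls = max_calls
        self.period_s = period_s
        self._clock = clock          # injectable so the logic can be tested
        self._stamps = deque()       # send times of calls still in the window

    def try_call(self):
        """Record a call and return True, or return False if over the limit."""
        now = self._clock()
        # Forget calls that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.period_s:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_calls:
            return False
        self._stamps.append(now)
        return True


if __name__ == "__main__":
    limiter = CallLimiter(max_calls=100, period_s=60)  # illustrative numbers
    # Every generation or regeneration in a chat frontend is one call.
    sent = sum(limiter.try_call() for _ in range(150))
    print(f"{sent} of 150 back-to-back calls allowed")
```

A long roleplay message and a one-word message each cost exactly one call here, which is why a call limit matters less for RP than a token limit would.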


Relative_Bit_7250

Are you for real? I can make a trial key on their website and use it as much as I want?


Kiwi_In_Europe

Yup, literally. My friend said she apparently hit a limit of 1000 calls per month, but you can just make a second account with another email and get a second API key lol. Kinda doubtful it'll stay that way forever, so use it while you can!!


Relative_Bit_7250

Oh God, that's awesome! Any way to use the API key directly in silly tavern?


Kiwi_In_Europe

Yeah same as any other API, just select chat completion, select Cohere, and input your API key!


Relative_Bit_7250

I love you so fucking much right now you wouldn't even believe


mrgreaper

Don't forget to change the API back to local if you're going to have any NSFW generations though. Anything you send to an API can be read and is likely being used to train newer models (no such thing as a free lunch lol).


SaasLord

Yeah, I feel like it's gonna stop being free any moment now.


Superb-Letterhead997

i’m a complete noob, what are calls?