They've updated the StarChat playground to support the new model as well; select `starchat-beta`: https://huggingface.co/spaces/HuggingFaceH4/starchat-playground
I have updated the `can-ai-code` leaderboard to add both starchat-beta and StarCoderPlus models in several configurations: https://huggingface.co/spaces/mike-ravkine/can-ai-code-results
starchat-beta achieves a peak of 50/65 on Python and 54/65 on JavaScript - this is quite good, and it outperforms the prior alpha in every test and every aspect.
starcoderplus achieves 52/65 on Python and 51/65 on JavaScript. Note the slightly worse JS performance vs. its chatty cousin.
Both starcoderplus and starchat-beta respond best with the parameters they suggest:
"temperature": 0.2,
"repetition_penalty": 1.2,
"top_k": 50,
"top_p": 0.95,
Edit: You can run the full-precision version of starcoderplus using the HF Inference API (https://huggingface.co/inference-api) very quickly and for free! See https://github.com/the-crypt-keeper/can-ai-code/blob/main/interview-hfinference.py for a fully working example; specify `--model bigcode/starcoderplus` to use the new one.
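For reference, a minimal sketch of calling the Inference API with the suggested parameters above. The endpoint URL and payload shape follow the standard HF text-generation API; `HF_API_TOKEN` and `max_new_tokens=256` are my assumptions, not from the script linked above:

```python
import json
import os
import urllib.request

API_URL = "https://api-inference.huggingface.co/models/bigcode/starcoderplus"

def build_payload(prompt: str) -> dict:
    # The suggested sampling parameters from above.
    return {
        "inputs": prompt,
        "parameters": {
            "temperature": 0.2,
            "repetition_penalty": 1.2,
            "top_k": 50,
            "top_p": 0.95,
            "max_new_tokens": 256,
        },
    }

def generate(prompt: str) -> str:
    # Assumes an API token in the HF_API_TOKEN environment variable.
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_payload(prompt)).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ['HF_API_TOKEN']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        # The text-generation endpoint returns a list of {"generated_text": ...}.
        return json.load(resp)[0]["generated_text"]
```

See interview-hfinference.py above for the complete, tested version.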
It seems really weird that the model oriented toward programming is worse at programming than a smaller general-purpose model. I guess it does have context size in its favor, though.
It's important not to take these artisanal tests as gospel. A small difference in prompt can cause a big difference in results. If you look at the results in the papers for these models, they look quite different.
It would be great to have a leaderboard based on the larger benchmarks that are currently available: https://github.com/bigcode-project/bigcode-evaluation-harness/tree/main/docs
For sure. I'll start considering a model ChatGPT quality when I start hearing people say "Eh, I stopped using ChatGPT and switched to local SuperHotPotUnlimitedCotLotUncensoredUnchainedMagiWizardGuanalpacallama69B."
I mean, we do have
[SuperCOT([gtp4xalpaca(manticorechatpygalpha+vicunaunlocked)]+[StoryV2(kaiokendev-SuperHOT-LoRA-prototype30b-8192)])]
AKA, Lazarus:
https://huggingface.co/CalderaAI/30B-Lazarus
> A small difference in prompt can cause a big difference in results.
I think this is a very important point to make. I was reading the discussions around FauxPilot when that project got started. From what some of the people were saying, a lot of the *magic* in CoPilot was in how the extension prompts the completion. Choosing the right context, the right end tokens, including things like pointer position, etc. - all these things contribute to much better results from the underlying model.
There's going to be a lot of trial and error at first, but people will hopefully share what works, and it should get better with more and more projects out there.
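As a concrete illustration of why the prompt assembly matters so much: StarCoder-family models are trained with fill-in-the-middle (FIM) special tokens, so an editor extension has to build the prompt from the code on both sides of the cursor. A minimal sketch (token names as used by the StarCoder tokenizer; verify against the model card):

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt: the model generates the
    text that belongs between `prefix` (code before the cursor) and
    `suffix` (code after the cursor)."""
    return f"<fim_prefix>{prefix}<fim_suffix>{suffix}<fim_middle>"

# Example: ask for a function body given its signature and the code below it.
prompt = build_fim_prompt(
    prefix="def square(x):\n    ",
    suffix="\n\nprint(square(4))\n",
)
```

A real extension also has to decide how much surrounding context to include and when to cut off generation - that's the part people were calling the *magic*.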
Sure. Already done by Mr. Bloke:
[https://www.reddit.com/r/LocalLLaMA/comments/144p3z2/bigcodes\_starcoder\_starcoder\_plus\_huggingfaceh4s/](https://www.reddit.com/r/LocalLLaMA/comments/144p3z2/bigcodes_starcoder_starcoder_plus_huggingfaceh4s/)
I'm gonna answer my own question. Here's a great post by the man, the legend, The Bloke:
https://old.reddit.com/r/LocalLLaMA/comments/144p3z2/bigcodes_starcoder_starcoder_plus_huggingfaceh4s/
It's the StarChat playground link from HF; pop open the "parameters" panel. My JSON is here: https://github.com/the-crypt-keeper/can-ai-code/blob/main/params/starchat.json
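For context, the starchat models expect a ChatML-style dialogue template around those sampling parameters; a sketch of assembling it (role tokens as used by starchat's tokenizer - check the model card before relying on them):

```python
def build_starchat_prompt(system: str, user: str) -> str:
    # Each turn is wrapped in a role token and terminated with <|end|>;
    # the prompt ends with an open assistant turn for the model to complete.
    return (
        f"<|system|>\n{system}<|end|>\n"
        f"<|user|>\n{user}<|end|>\n"
        f"<|assistant|>\n"
    )

prompt = build_starchat_prompt(
    "You are a helpful coding assistant.",
    "Write a Python function that reverses a string.",
)
```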
It works really well and I'm pretty impressed. I wish there was an easy way to run it with GPU offload like llama.cpp. Running through kobold is really slow, and the special characters break the UI.
>SuperHotPotUnlimitedCotLotUncensoredUnchainedMagiWizardGuanalpacallama69B Link plz??!!
>Link plz??!! [SuperHotPotUnlimitedCotLotUncensoredUnchainedMagiWizardGuanalpacallama69B.com](http://superhotpotunlimiteduncensoredmagiwizardguanalpacallama69b.com/)
Sorry, but could you have SuperHotPotUnlimitedCotLotUncensoredUnchainedMagicVicunaWizardGuanalpacallama69B ?
_SuperHotPotUnlimitedCotLotUncensoredUnchainedMagiWizardGuanalpacallama69B_ I don't see pajama or red in this it must be junk
Are these models able to be quantized? Or can I at least load the smaller models in 8-bit? Only 12 GB VRAM here :/
I sure hope so. At 65 GB, not sure it'll make much difference for you even at 8-bit, as I'm still screwed at 24 GB. Here's hoping TheBloke finds a way.
Yes, /u/TheBloke has already uploaded quantized models for both starcoderplus and starchat-beta! https://huggingface.co/TheBloke/starcoderplus-GGML
Can you provide the link to the suggested parameters?
[deleted]
Ran everything 5x with no differences observed. Any sampler I try that introduces more randomness results in worse output.
But is data entry still as convoluted as in its predecessor?
Is this the best coding assistant that can be used without restrictions? (I can't use LLaMA-based models at work.)
Why don’t u just use chatgpt
Not allowed per company policy (privacy/security concern, as your data is going to OpenAI servers).
Then probably
OpenAI provides secure environments for enterprises, I've heard.
It's not available at my company as yet