role playing
story writing


We introduce SOLAR-10.7B, an advanced large language model (LLM) with 10.7 billion parameters that demonstrates superior performance across various natural language processing (NLP) tasks. Despite its compact size, it achieves state-of-the-art performance among models with under 30B parameters.

We present a methodology for scaling LLMs called depth up-scaling (DUS), which combines architectural modification with continued pretraining. Concretely, we integrated Mistral 7B weights into the up-scaled layers and then continued pretraining the entire model.
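As a rough illustration of the depth up-scaling idea, the sketch below shows the layer arithmetic described in the SOLAR paper: duplicate a 32-layer base model, drop the last 8 layers of one copy and the first 8 of the other, then stack the two to get 48 layers (which are then continually pretrained). The `depth_up_scale` function and the use of integer indices as stand-ins for transformer blocks are illustrative assumptions, not the actual implementation.

```python
def depth_up_scale(layers, drop=8):
    """Sketch of DUS: build an up-scaled layer list from a base model's layers.

    `layers` stands in for a base model's transformer blocks (32 for Mistral 7B).
    """
    n = len(layers)
    top = layers[: n - drop]   # first copy, minus its final `drop` layers
    bottom = layers[drop:]     # second copy, minus its initial `drop` layers
    return top + bottom        # 2 * (n - drop) layers in total

# Placeholder layer indices instead of real weights:
base = list(range(32))
scaled = depth_up_scale(base)
print(len(scaled))  # 48
```

Note that the middle layers overlap between the two copies, so the up-scaled model starts from sensible weights everywhere; continued pretraining then smooths over the seam where the two copies meet.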

SOLAR-10.7B delivers remarkable performance, outperforming models with up to 30B parameters and even surpassing the recent Mixtral 8x7B model. For detailed information, please refer to the experimental table. SOLAR-10.7B is also an ideal choice for fine-tuning, offering robustness and adaptability: our simple instruction fine-tuning of the SOLAR-10.7B pre-trained model yields significant performance improvements.

For full details of this model, please read our paper.

2.5 Not good


quant: gguf q6

Although I just discovered this model, there's some truth to what it advertises. I wouldn't call it the smartest, cleverest, or even the most well-rounded model, but what it does, it does well. Please note there seems to be some censorship in this model, but as I will briefly describe, fine tuning will make this a hit within this particular parameter size IMO. The model adheres to character cards extremely well, and although it has a fairly low context, what is in that context is strongly portrayed. I often use long-form instructions to guide the AI to particular outcomes in how it interacts, using IF-THEN type instructions, and it has no problem following them. Further, compared to models of similar size, it feels like the model understands my cards a lot better than anything else out there. It can be a bit terse and use a bit too much purple prose, but it's starting to feel like the gap has been bridged where smaller models feel larger.

Overall, this feels like a nice leap forward for RP, with great adherence to cards, appropriate responses, fewer hallucinations, and far fewer rerolls. Fine tuning can eke this out further, giving it more personality and tighter prose. Prompting also seems to play a massive role here, and the model is quite compliant even in its current form. I highly recommend you give it a try, and check out the models being built from it.