BestERP: Find the best AI character and role-playing apps.

5.0

quant: gguf q5

This is a really good one. Better than MythoMax and my go-to as a well-rounded model for complicated scenarios (and still much faster than a heavier model).

4.5

quant: gguf q5

In my experience, way better at general knowledge than any 13B I've used. Characters' emotions also feel fairly natural, and it's still runnable at a decent speed.

5.0

quant: other

Used a custom 4.15bpw exl2_2 quant. By far the best roleplay model I've seen so far with fast generation. (Some 120Bs stay in character a bit better but honestly this one does as well if not better in just about every other regard while running at 34 tokens / second instead of 0.5)

2.0

full precision

I'd rather role-play with tinyllama than this piece of shit.

5.0

quant: gguf q8

tldr: model=good, medfet=great, storywriting=good, me=high, many_cheaper_gpus=awesome

huh, I wouldn't have left a comment if this required registering, good decision there imo site owners. Also, saw something about asking to use reviews on huggingface I think, feel free to use this one if you want to.

I'm checking out llama3 tomorrow to kinda "date" this review in context of what is out there. For what I've been trying to do, this is pretty much the best I've come up with after trying dozens of 13b/20b/70b models. I kinda have to re-roll or tweak a word or two fairly often, but it is one of the only models that has delivered gold very regularly.

My "use case" has mostly been writing some medical fetish story stuff. But uh, I guess I don't see the same appeal that a lot of people are looking for, the "role play" stuff. I mean, you do you if that is what you like, I just can't speak to that usage. I mostly describe the story I want it to write, set max_new_tokens to 4096 and always ban eos_token. Then use the Notebook feature in oobabooga, so I can stop it to fix the story a bit or reroll.
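For anyone wanting to reproduce those settings outside the UI, here is a minimal sketch that just collects them into one dict. The function name `build_generation_params` is hypothetical, and the key names simply mirror the settings named above; whether a given oobabooga version accepts exactly these keys over its API is an assumption, so check its docs before sending anything.

```python
import json


def build_generation_params(prompt: str,
                            max_new_tokens: int = 4096,
                            ban_eos_token: bool = True) -> dict:
    """Bundle the settings from the review into a request-style dict.

    Key names are assumptions that mirror the setting names in the review,
    not a confirmed API schema.
    """
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    return {
        "prompt": prompt,
        # Long output budget so the story is not cut short.
        "max_new_tokens": max_new_tokens,
        # Banning the end-of-sequence token keeps the model writing
        # until the token budget runs out or you stop it by hand.
        "ban_eos_token": ban_eos_token,
    }


if __name__ == "__main__":
    params = build_generation_params("Story outline goes here.")
    print(json.dumps(params, indent=2))
```

With the EOS token banned, the model never ends on its own, so stopping manually (as in the Notebook workflow) is part of the routine.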

So, if you are looking to write some medfet stories, this is probably the exact model you want. Of the current models that I've tried, and it has been dozens, this one is absolutely in first place. I really want to like nous-capybara-limarpv3-34b.Q8_0 because of the way better context length, or some of the many 70b models I can run, but this one still comes out ahead. (Those 70b models run super fast, seriously: just build a machine you can put enough RTX 4060ti 16gb cards in to fit the whole model in memory, and/or throw them in eGPU enclosures via thunderbolt OR NVMe slots. Seriously, you can run video cards through those, crazy. Check out the eGPU website, quite good info.)

I'm guessing a Llama-70b based model may replace this one for me, but it has beaten anything from before those came out. Also, WOW at the level of artistic tinkering that the recipe of this model implies by the creator. Guess the next task before my high self goes to sleep is to toss a couple bucks their way.

PsyMedRP v1 20B
