It’s not so much about English as it is about writing patterns. Like others said, it has a “stilted college essay prompt” feel because that’s what instruct-finetuned LLMs are trained to do.
Another quirk of LLMs is that they overuse specific phrases, which stems from technical issues (training on their own output, training on other LLMs’ output, training on human SEO junk, artifacts of whole-word tokenization, a model inheriting style from its own earlier output as it generates a response, just to start).
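To poke at the tokenization point yourself: stock phrases tend to compress into far fewer tokens than arbitrary text of similar length, which is one reason they’re cheap for a model to emit. A quick check with the tiktoken BPE tokenizer (the phrase choices here are mine, and cl100k_base is just one example vocabulary):

```python
# Compare token counts: stock phrases vs. an arbitrary string.
# Requires: pip install tiktoken
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")  # GPT-4-era BPE vocabulary

for text in ["shivers down the spine", "a landmark achievement", "zxqv gplm wrcht fnub"]:
    ids = enc.encode(text)
    print(f"{text!r}: {len(ids)} tokens")
```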
“Slop” is an overused term, but this is precisely what people in the LLM tinkerer/self-hosting community mean by it. It’s also what the “temperature” setting you may see in some UIs is supposed to combat, though that’s a crude and ineffective fix if you ask me.
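For context, temperature just rescales the output logits before sampling: dividing by a small number sharpens the distribution onto the already-favored tokens, and dividing by a large number flattens it. It acts on everything at once rather than targeting the overrepresented phrases, which is why I call it crude. A minimal sketch with made-up logits:

```python
# Minimal temperature-sampling sketch; the logits are made up.
import numpy as np

def sample_with_temperature(logits, temperature, rng):
    """Divide logits by temperature, softmax, then draw one token id."""
    scaled = np.asarray(logits) / temperature
    probs = np.exp(scaled - scaled.max())  # subtract max for numerical stability
    probs /= probs.sum()
    return rng.choice(len(probs), p=probs)

rng = np.random.default_rng(0)
logits = [2.0, 1.8, 0.5]  # pretend token 0 is the overused "slop" phrase

for t in (0.2, 1.0, 1.5):
    draws = [sample_with_temperature(logits, t, rng) for _ in range(2000)]
    print(f"T={t}: token shares {(np.bincount(draws, minlength=3) / 2000).round(2)}")
```

Crank it high enough to dislodge the pet phrases and you start dislodging grammar and coherence too, hence the tradeoff.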
Anyway, if you stare at these LLMs long enough, you learn to see a lot of individual models’ signatures. Some of it is… hard to convey in words. But “embodies,” “landmark achievement,” and such just set off alarm bells in my head, specifically for ChatGPT/Claude. If you ask an LLM to write a story, “shivers down the spine” is another phrase so common it’s a meme, as are the specific names they tend to choose for characters.
If you ask an LLM to write in your native language, you’d run into similar issues, though the translation should soften them some. Hence, when I use Chinese open-weight models, I get them to “think” in Chinese and answer in English, and get a MUCH better result.
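One way to set that up, sketched against a local OpenAI-compatible endpoint (llama.cpp server, Ollama, and most self-hosting stacks expose one); the base URL, model name, and exact instruction wording below are placeholders, not anything canonical:

```python
# Sketch: ask a Chinese open-weight model to reason in Chinese, answer in English.
# base_url, api_key, and model are placeholders for your own setup.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

system_prompt = (
    "Reason step by step in Chinese inside <think>...</think> tags, "
    "then give your final answer in English only."
)

resp = client.chat.completions.create(
    model="local-model",  # placeholder model name
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "Explain the difference between open-weight and open-source models."},
    ],
)
print(resp.choices[0].message.content)
```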
All this is quantifiable, by the way. Check out EQBench’s slop profiles for individual models:
https://eqbench.com/creative_writing_longform.html
https://eqbench.com/creative_writing.html
And its best guess at inbreeding “family trees” for models:
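If you want a crude DIY version of a slop profile, counting n-grams that recur across many independent generations from the same model captures the core idea. A toy sketch; the sample strings below are stand-ins for real model outputs:

```python
# Toy slop profiler: find trigrams that repeat across independent samples.
from collections import Counter
import re

def trigrams(text):
    words = re.findall(r"[a-z']+", text.lower())
    return zip(words, words[1:], words[2:])

samples = [  # stand-ins for many generations from one model
    "A shiver ran down her spine as she beheld the landmark achievement.",
    "It was, by any measure, a landmark achievement for the kingdom.",
    "The treaty embodies a landmark achievement for the whole realm.",
]

counts = Counter(t for s in samples for t in trigrams(s))
for ngram, n in counts.most_common(10):
    if n > 1:  # only phrases that recur across samples
        print(" ".join(ngram), "->", n)
```

EQBench’s version is far more careful (bigger corpora, normalization against human writing), but the principle is the same.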
Wow, thank you for such an elaborate answer!
By the way, how do you make models “think” in Chinese? By explicitly asking them to? Or by writing the prompt in Chinese?