Imagine having to train your LLM on Tumblr content, really scraping the bottom of the barrel in legibility (nothing against all those weird Tumblr communities with their weird slang and stuff, but it's probably going to make the LLMs just spout nonsense, as the LLM doesn't understand anything).
it still amazes me that nobody at OpenAI seemed to realize ChatGPT at release sounded exactly like a bottom-tier Reddit poster, because of how much of Reddit’s corpus they had ingested. part of me can’t wait for GPT’s shitty neoliberal Tumblr impression before they re-weight it to once again sound like the kind of English essay that gets “Apply Yourself!” written on it in red ink