Stack Overflow bans users en masse for rebelling against OpenAI partnership — users banned for deleting answers to prevent them being used to train ChatGPT
That's trivial to filter if you just look at how much time has passed between posting and editing. Reddit comments are only very rarely updated after more than a day.
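For what it's worth, here is a rough sketch of that kind of filter. The field names (`posted_at`, `edited_at`) and the one-day cutoff are assumptions made up for illustration, not any site's actual schema or policy.

```python
from datetime import datetime, timedelta

# Hypothetical cutoff: an edit made later than this after posting is treated as suspect.
LATE_EDIT_THRESHOLD = timedelta(days=1)


def is_late_edit(posted_at, edited_at, threshold=LATE_EDIT_THRESHOLD):
    """Return True if the comment was edited more than `threshold` after posting."""
    if edited_at is None:
        # Never edited, so there is nothing to flag.
        return False
    return edited_at - posted_at > threshold


def filter_late_edits(comments):
    """Keep comments that were never edited or were edited soon after posting."""
    return [
        c for c in comments
        if not is_late_edit(c["posted_at"], c.get("edited_at"))
    ]
```

This only works if the dataset carries an edit timestamp for each comment, and it drops the late-edited comment entirely rather than recovering the original text.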
Sure, but the more you fuck with the data, the more curating it requires and the less valuable it becomes. I'm not entirely sure places like Reddit even retain full edit history for posts over a year old.