You are polluting the data set. Do it a few times with different text sources and the scrubbers won't know what part of your comment history is good. Replace, don't delete.
I'm pretty sure they'll know that the first version of each comment is almost certainly the good one. People sometimes edit a comment to add new information or fix a typo, but they almost never replace nonsense with a good comment; when a whole comment gets swapped out, it's nearly always the other way around.
Edit: fixed typos, also replaced excerpt from Moby Dick with this post.
Edit 2: the comments you post here are totally available for machine learning, so I don't see much of a point in deleting my Reddit comments as long as I'm participating in Lemmy.
Not in a meaningful way. It’s easy to detect and revert a change like this. Instead of bulk changing all your comments, you should slowly change them over time.
Even then, users don’t usually edit most of their comments. Sure, Reddit might be naive and just take the current comments, but it’s pretty trivial to reverse this kind of thing.
It's probably still worth doing to make the process harder and more error-prone for Reddit, but I wouldn't be under the impression that it has any impact beyond being annoying.
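To show why reverting is so easy, here's a rough sketch. It assumes the site stores every revision of a comment with a timestamp. The `Revision` schema and both helper functions are made up for illustration; nothing here reflects how Reddit actually stores or processes edits.

```python
from dataclasses import dataclass


@dataclass
class Revision:
    """One stored version of a comment (hypothetical schema)."""
    comment_id: str
    timestamp: float  # seconds since epoch
    text: str


def original_texts(revisions):
    """Map each comment_id to the text of its earliest revision."""
    earliest = {}
    for rev in revisions:
        kept = earliest.get(rev.comment_id)
        if kept is None or rev.timestamp < kept.timestamp:
            earliest[rev.comment_id] = rev
    return {cid: rev.text for cid, rev in earliest.items()}


def edited_fraction(revisions):
    """Share of comments that have more than one stored revision."""
    counts = {}
    for rev in revisions:
        counts[rev.comment_id] = counts.get(rev.comment_id, 0) + 1
    if not counts:
        return 0.0
    return sum(1 for n in counts.values() if n > 1) / len(counts)
```

Taking the earliest revision undoes a bulk overwrite in one pass, and since users don't usually edit most of their comments, an unusually high `edited_fraction` for one account would be an easy signal that a scrubbing tool was used.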
are there copyrighted texts that have such distinctive patterns that they would be particularly easy to spot in an LLM's output? say, would replacing every comment with a page from moby dick or wuthering heights be more or less infringing than using harry potter? hypothetically.
Well, I'm pretty sure Moby Dick is in the public domain by now. If I were you I'd go for something from Disney, which is mathematically certain to get somebody sued, although I can't predict who.
i personally think the comments are valuable enough to leave for people to find later, even if Reddit does use them in an underhanded way.
i recognize this may not be popular.
Yup. Reddit gives zero fucks about any form of protest or the degradation of the quality of content. They already have the metrics that the traffic originally generated.
The only people negatively impacted are the ones search engines send there while they're trying to find information.
As someone else brilliantly pointed out, leaving the comments hurts Reddit more than delete/edit.
Deleting/editing comments only hides the posts from the public. Reddit has the original posts, is ignoring all edits made to posts, and selling that original data.
I got banned from /r/AFL because I used Redact to scrub my comments. My, how the turntables have turned, and they turned out to be the real thin-skinned pansy cunts. Not /r/sports during the kerfuffle.