PieFed Meta @piefed.social cabbage @piefed.social 9mo ago

User-sourced content moderation

Hi,

The CSAM scandal the other day got me thinking about the (often lacking) capability of the Threadiverse to deal quickly with content moderation, and since PieFed has already been a bit experimental in this regard, I figured maybe this is a place where I could ask if an idea is feasible. Sorry if it's a bad match!

The idea is to identify trusted users, in the same way that PieFed currently identifies potentially problematic users: long-term users with significantly more upvotes than downvotes. These trusted users could get an additional option to report a post, beyond "Report to moderator": something like "Mark as abuse".

The user would be informed that this is meant for content that clearly goes against the rules of the server, that any other type of issue should be reported to moderators, and that abuse of the function leads to the privilege being revoked and, if intentional, potentially a ban.

If the user accepts this and marks a post as abuse, every post by the OP of the marked post would be temporarily hidden on the instance and marked for review by a moderator. The moderator can then choose to either 1) ban the user posting abusive material, or 2) make the posts visible again and remove the "trusted" flag from the reporting user, thereby avoiding similar false positives in the future.
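To make the flow concrete, here is a rough sketch in Python. It is not PieFed code: the `User`, `Post` and `Instance` classes, the `trusted` flag and the method names are all made up for illustration, and how a user becomes trusted in the first place is left out.

```python
from dataclasses import dataclass, field


@dataclass
class User:
    name: str
    trusted: bool = False  # hypothetical flag for long-term, well-upvoted users
    banned: bool = False


@dataclass
class Post:
    author: User
    hidden: bool = False


@dataclass
class Instance:
    posts: list = field(default_factory=list)
    review_queue: list = field(default_factory=list)  # (reporter, author) pairs

    def mark_as_abuse(self, reporter: User, post: Post) -> bool:
        """Hide everything by the post's author until a moderator reviews it."""
        if not reporter.trusted:
            return False  # untrusted users fall back to "Report to moderator"
        for p in self.posts:
            if p.author is post.author:
                p.hidden = True
        self.review_queue.append((reporter, post.author))
        return True

    def review(self, reporter: User, author: User, abusive: bool) -> None:
        """Moderator decision: ban the author, or restore the posts and untrust the reporter."""
        self.review_queue.remove((reporter, author))
        if abusive:
            author.banned = True
        else:
            for p in self.posts:
                if p.author is author:
                    p.hidden = False
            reporter.trusted = False
```

In this sketch a false positive costs the reporter the trusted flag right away, so a second `mark_as_abuse` call from the same user does nothing.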

A problem I keep seeing on the threadiverse is that bad content tends to remain available too long: with many smaller instances, the whole moderating team might simply be asleep. So this seems like one possible way of mitigating that. Maybe it's not technically feasible, and maybe it's just not a particularly good idea; it might also not be a particularly original idea, I don't know. But I figured it might be worth discussing.

4 comments

I like the idea but with a more gradual approach.

Instead of trusted and not trusted users (1/0, yes/no), something like levels (0 to 100). A user's trust level can grow with use and with confirmed flags, and decrease with false positives or over time.

When a user marks a post as abuse, their trust level is added to the 'abuse level' of the post. As an example, when the post's 'abuse level' reaches some threshold, a warning can be shown to users; if it reaches a higher level, a moderator can be alerted. If it reaches an even higher threshold, the post is hidden.

If a moderator reviews the post and finds the mark was justified, the trust level of the reporters is increased. If it's a false positive, the trust level of the reporters is decreased.

This way, a single user's false positive won't get the post hidden or tagged, unless it comes from a user with a very high trust level. At the same time, a post can be marked as abuse if a few users with medium levels find it abusive.

High trust levels shouldn't be easy to reach, and I think users shouldn't know their exact level.
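Roughly what I have in mind, as a small Python sketch. The three threshold values, the `step` of 10 and all the names are assumptions picked for the example, not numbers I'd insist on, and the decay over time is left out.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds; real values would need tuning per instance.
WARN_USERS = 30
WARN_MODERATOR = 60
HIDE_POST = 90


@dataclass
class Reporter:
    trust: int = 0  # 0 to 100, never shown to the user


@dataclass
class Post:
    reporters: list = field(default_factory=list)

    @property
    def abuse_level(self) -> int:
        # Sum of the trust levels of everyone who marked the post.
        return sum(r.trust for r in self.reporters)

    def state(self) -> str:
        level = self.abuse_level
        if level >= HIDE_POST:
            return "hidden"
        if level >= WARN_MODERATOR:
            return "moderator_warned"
        if level >= WARN_USERS:
            return "user_warning"
        return "visible"


def mark_as_abuse(post: Post, reporter: Reporter) -> None:
    # Each user can only count once per post.
    if not any(r is reporter for r in post.reporters):
        post.reporters.append(reporter)


def moderator_review(post: Post, confirmed: bool, step: int = 10) -> None:
    # Reward reporters on a confirmed mark, penalize them on a false positive.
    delta = step if confirmed else -step
    for r in post.reporters:
        r.trust = max(0, min(100, r.trust + delta))
    post.reporters.clear()
```

With these numbers a single reporter at trust 95 hides a post alone, three reporters at 30 each do the same together, and one reporter at 20 changes nothing visible.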

Just an idea :)
- I like this spin on it!
  
  I guess it would have to tie in with the existing report function - it doesn't make much sense to have users report something as abuse if they don't think it's worth warning a moderator over. At that point it sounds like a downvote should be enough.
  
  It could also be a challenge if it is taken too lightly - say someone posts something wildly controversial but within the boundaries of free speech, such as a post about pineapple on pizza in a foodporn community. A large number of users might report it to moderators for more or less serious reasons, but it would be unfortunate if this caused the temporary removal of the post without moderator action.
  
  It should probably be established in the reporting procedure not only that the user is credible, but also that the user actually believes that it is necessary to remove the post as opposed to other moderator action.
  
  I wonder how this would work across instances, especially as what might be seen as abusive in one instance may not be in another. Also, could this be subject to poisoning, i.e. spinning up an instance to inflate account reputation on another instance, or to mass report abuse from users that instance claims are “reputable”? Making it configurable with per-instance granularity, aside from being tedious, might lead to situations where only the big instance users get a say. Smaller instances could get trashed by a user who becomes the most reputable solely by coming from a big instance, and/or who spams up their rep across instances while mods are asleep, possibly exploiting time zone differences (i.e. build fake rep from lemmy.world at night, to spam a European or Asian instance that is in daylight hours).