In what is hopefully my last child safety report for a while: a report on how our previous reports on CSAM issues intersect with the Fediverse.
https://cyber.fsi.stanford.edu/io/news/addressing-child-exploitation-federated-social-media
Similar to how we analyzed Twitter in our self-generated CSAM report, we did a brief analysis of the public timelines of prominent servers, processing media with PhotoDNA and SafeSearch. The results were legitimately jaw-dropping: our first pDNA alerts started rolling in within minutes. And the true scale of the problem is much larger than those hash matches alone suggest, as can be inferred by cross-referencing CSAM-related hashtags with SafeSearch level 5 nudity matches.
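Roughly, the approach was the following (a minimal sketch, not our actual tooling; the instance list, the hashtag set, and the photodna_match helper are placeholders, and real PhotoDNA matching requires vetted API access):

```python
# Sketch: scan public timelines, hash-match media, and cross-reference
# suspect hashtags with SafeSearch "adult" likelihood.
import requests
from google.cloud import vision

INSTANCES = ["example.social"]      # hypothetical server list
SUSPECT_TAGS = {"exampletag"}       # hypothetical set of CSAM-related hashtags

vision_client = vision.ImageAnnotatorClient()

def photodna_match(image_bytes: bytes) -> bool:
    # Stand-in that always returns False; real matching goes through the
    # PhotoDNA cloud service, which requires vetted access.
    return False

def safesearch_adult(image_bytes: bytes) -> int:
    """Cloud Vision SafeSearch 'adult' likelihood; 5 == VERY_LIKELY."""
    resp = vision_client.safe_search_detection(image=vision.Image(content=image_bytes))
    return int(resp.safe_search_annotation.adult)

for instance in INSTANCES:
    statuses = requests.get(
        f"https://{instance}/api/v1/timelines/public",
        params={"limit": 40}, timeout=30,
    ).json()
    for status in statuses:
        tags = {t["name"].lower() for t in status.get("tags", [])}
        for media in status.get("media_attachments", []):
            if media.get("type") != "image":
                continue
            image_bytes = requests.get(media["url"], timeout=30).content
            if photodna_match(image_bytes):
                print("pDNA alert:", status["url"])
            elif tags & SUSPECT_TAGS and safesearch_adult(image_bytes) >= 5:
                print("hashtag + SafeSearch level 5:", status["url"])
```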
Hits were primarily on a not-to-be-named Japanese instance, but a secondary test to see how far they propagated did show them getting federated to other servers. A number of matches were also detected in posts originating from the big mainstream servers. Some of the posts that triggered matches were eventually removed, but the origin servers did not seem to consistently send "delete" events when that happened, which I hope doesn't mean the receiving servers just kept storing the material.
The Japanese server problem is often thought to mean "lolicon" or CG-CSAM, but it appears that servers that allow computer-generated imagery of kids also attract users posting and trading "IRL" materials (their words, clear from post and match metadata), as well as grooming and swapping of CSAM chat group identifiers. This is not altogether surprising, but it is another knock against the excuses of lolicon apologists.
Traditionally the solution here has been to defederate from freezepeach servers and...well, all of Japan. This is commonly framed as a feature and not a bug, but it's a blunt instrument, and it allows the damage to continue. With the right tooling, it might be possible to get the large Japanese servers to at least crack down on material that's illegal there (which non-generated, non-illustrated CSAM is).
@det Did you study how defederated those problem servers were? I get that this still allows the damage to continue on those servers, but it would be good to know how much (or how little) the content spread to the larger Fediverse.
@tchambers Briefly, but not thoroughly enough to include in this report. It appears Japanese instances are fairly isolated, but around 5% of detections came from instances outside the Japanese/free speech bloc. Also, small servers can of course slip through pretty easily.
@det @tchambers It would be very useful to get the statistics for what percentage of this content is blocked by a well-known "minimal" blocklist like Oliphant's unified tier 0.
Can you perhaps filter your dataset with the list of servers here: https://codeberg.org/oliphant/blocklists/src/branch/main/blocklists/_unified_tier0_blocklist.csv and tell us how that modifies the statistics?
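For concreteness, something like this is what I mean (a sketch; it assumes the matches exist as one status URL per line in a file, here called match_urls.txt, and the raw-file URL is derived from the src link above):

```python
# Sketch: what share of match URLs come from domains on the unified tier-0 blocklist.
import csv
import io
from urllib.parse import urlparse

import requests

BLOCKLIST_CSV = ("https://codeberg.org/oliphant/blocklists/raw/branch/main/"
                 "blocklists/_unified_tier0_blocklist.csv")

rows = csv.reader(io.StringIO(requests.get(BLOCKLIST_CSV, timeout=30).text))
# First column is the blocked domain; skip any header row.
blocked = {r[0].strip().lower() for r in rows if r and "domain" not in r[0].lower()}

with open("match_urls.txt") as f:
    hosts = [urlparse(line.strip()).hostname for line in f if line.strip()]

def is_blocked(host: str) -> bool:
    # Count subdomains of a blocked domain as blocked too.
    return host in blocked or any(host.endswith("." + d) for d in blocked)

covered = sum(1 for h in hosts if h and is_blocked(h))
print(f"{covered}/{len(hosts)} hits ({covered / len(hosts):.0%}) from blocklisted domains")
```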
@ocdtrekkie @tchambers This is a very good question, i.e. measuring how porous various blocklists are. I'm hoping to do this with a larger sample of more servers.
@det @tchambers I think it's a really important number even against the data you have, because there is already an open PR to recommend a blocklist right in the Mastodon setup process.
If that handles over 95% of the problem out of the box... that's a good case for that PR, and we're not in nearly as bad a position as a network.
(Hashtag-based automoderation also seems like low-hanging fruit to me, though.)
@ocdtrekkie @det @tchambers For hashtag whack-a-mole we'd need some relatively fast-paced intelligence to act on. I'd like to see a trusted-flagger-type entity in this space that can get this sort of knowledge to participating service providers for action.
@jaz @det @tchambers Indeed, though the hashtags can only change as fast as the community looking for the content can learn the new tags.
And even if the end result of this is CSAM peddlers dropping the use of hashtags altogether, it would crater discoverability and reach.
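A minimal sketch of what that kind of automoderation could look like, assuming a shared denylist of tags published by some trusted-flagger feed (the feed URL is hypothetical, and the "hold for review" action would plug into a real server's moderation tooling):

```python
# Sketch: hold any post carrying a denylisted hashtag for moderator review.
import requests

DENYLIST_FEED = "https://flagger.example/csam-hashtags.json"  # hypothetical feed of tag strings

def load_denylist() -> set[str]:
    return {tag.lower() for tag in requests.get(DENYLIST_FEED, timeout=30).json()}

def should_hold(status: dict, denylist: set[str]) -> bool:
    """True if the status (in Mastodon API shape) uses any denylisted hashtag."""
    tags = {t["name"].lower() for t in status.get("tags", [])}
    return bool(tags & denylist)
```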
@ocdtrekkie @tchambers Will try to get to that shortly.
One thing I will say though is that it depends on what you consider "the network" with this kind of harm. My goal would be to have tooling easy enough to use that even the Japanese cluster uses it.
@det @tchambers Do you feel that they would want to? I suppose the question is whether you're envisioning blocking "real" CSAM only, or also simulated and drawn content, which they are unlikely to ever want to filter.
@ocdtrekkie @tchambers I think that as long as drawn / obviously generated content is not commingled with the other content (it varies by hash db) they might be open to it. They're already moderating some portion of it, just very slowly and badly.
That said, photorealistic generated content is only getting harder to distinguish from real imagery and can feature real people, so ultimately there needs to be at least some degree of policy change at the government level.
@ocdtrekkie @tchambers That list would have blocked 87% of hits in our dataset.
@det @tchambers Interesting! Based on your earlier ~5% comment, I was expecting a higher percentage to be covered by tier 0. That's pretty intriguing, and very useful.
@ocdtrekkie @tchambers My apologies, the 5% was an estimate; once I got all the match URLs (they're intentionally burdensome to access), the actual share was a bit higher.
The primary source after the large Japanese server is the flagship — which is not great for a flagship, so I hope this bumps up the engineering priority list somewhat.
@det @ocdtrekkie @tchambers This is why many blocklists (including thebad.space) include mastodon.social. And why I am utterly opposed to considering that wretched hive to be any sort of “flagship.”
@det @ocdtrekkie @tchambers Speaking of, I would love to see what percentage would have been blocked by thebad.space block list!