In what is hopefully my last child safety report for a while: a look at how the CSAM issues from our previous reports intersect with the Fediverse.
https://cyber.fsi.stanford.edu/io/news/addressing-child-exploitation-federated-social-media
Similar to how we analyzed Twitter in our self-generated CSAM report, we did a brief analysis of the public timelines of prominent servers, processing media with PhotoDNA and SafeSearch. The results were legitimately jaw-dropping: our first pDNA alerts started rolling in within minutes. The true scale of the problem is much larger, which we inferred by cross-referencing CSAM-related hashtags with SafeSearch level 5 nudity matches.
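(For context on what "SafeSearch level 5" means: the Vision API scores each category on a likelihood scale where 5 is VERY_LIKELY. A minimal check along those lines looks roughly like the sketch below; the threshold logic is illustrative, not our exact pipeline.)

```python
# Illustrative only: flag images the Vision API rates VERY_LIKELY (level 5)
# for adult content. Not the actual pipeline used for the report.
from google.cloud import vision

client = vision.ImageAnnotatorClient()

def is_level5_adult(image_bytes: bytes) -> bool:
    """Return True if SafeSearch rates the image VERY_LIKELY (enum value 5) for adult content."""
    response = client.safe_search_detection(image=vision.Image(content=image_bytes))
    annotation = response.safe_search_annotation
    return annotation.adult == vision.Likelihood.VERY_LIKELY
```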
Hits were primarily on a not-to-be-named Japanese instance, but a secondary test to see how far they propagated did show them getting federated to other servers. A number of matches were also detected in posts originating from the big mainstream servers. Some of the posts that triggered matches were removed eventually, but the origin servers did not seem to consistently send "delete" events when that happened, which I hope doesn't mean the other servers just continued to store it.
The Japanese server problem is often thought to mean "lolicon" or CG-CSAM, but it appears that servers that allow computer-generated imagery of kids also attract users posting and trading "IRL" materials (their words, clear from post and match metadata), as well as grooming and the swapping of CSAM chat group identifiers. This is not altogether surprising, but it is another knock against the excuses of lolicon apologists.
Traditionally the solution here has been to defederate from freezepeach servers and...well, all of Japan. This is commonly framed as a feature and not a bug, but it's a blunt instrument, and it allows the damage to continue. With the right tooling, it might be possible to get the large Japanese servers to at least crack down on material that is illegal there (which non-generated, non-illustrated CSAM is).
I have argued for a while that the Fediverse is way behind in this area; part of this is a lack of tooling and a reliance on user reports, but part is architectural. CSAM-scanning systems work one of two ways: as a hosted service like PhotoDNA, or as a privately distributed hash database. The former is a problem because every server hitting PhotoDNA at once for the same images doesn't scale. The latter is a problem because widely distributed hash databases make it easier to craft evasions or collisions.
I think for this particular issue to be resolved, a couple of things need to happen. First, an ActivityPub implementation of content-scanning attestation should be developed, allowing origin servers to perform scanning via a remote service and other servers to verify that it happened. Second, for the hash databases that are privately distributed (e.g. Take It Down, NCMEC's NCII database), someone should probably take on turning them into a hosted service.
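To make the attestation idea concrete, here is a rough sketch of what an origin server might attach to an outgoing object. Every property name here is hypothetical; nothing like this exists in ActivityPub today, and a real design would need the signing and verification details worked out.

```python
# Hypothetical sketch only: an extension property an origin server might attach
# to an outgoing ActivityPub object after running its media through a remote
# scanning service. None of these field names are part of any existing spec.
scan_attestation = {
    "type": "ContentScanAttestation",               # hypothetical extension type
    "mediaDigest": "sha256:3f5a...",                 # digest of the attached media
    "scanner": "https://scanner.example/photodna",   # remote scanning service used
    "result": "no-match",                            # e.g. no-match / match / error
    "scannedAt": "2023-07-24T12:00:00Z",
    "proof": {                                       # signature from the scanning
        "type": "JsonWebSignature2020",              # service, so receiving servers
        "jws": "eyJhbGciOi...",                      # can verify the scan happened
    },
}

# A receiving server would verify `proof` against the scanner's published key
# before deciding whether to trust (and skip re-scanning) the media.
```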
There are some other things that would be helpful in controlling proliferation: for example, easy UI for admins to do hashtag and keyword blocks, instead of relying on users to track a changing threat landscape. These could be distributed or subscription-based across servers, though how public those lists should be is up for debate. That subscription model could also be used for general "fediblock" lists.
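As a sketch of what a subscribable list could look like (the format is entirely hypothetical; the point is just that it is machine-readable and versioned, so admins subscribe instead of copy-pasting from a moving target):

```python
# Hypothetical format for a subscribable hashtag/keyword block list. A server
# would poll the publisher periodically and apply or expire entries automatically.
keyword_blocklist = {
    "version": "2023-07-24T00:00:00Z",
    "publisher": "https://trust.example/lists/csam-keywords",   # placeholder URL
    "entries": [
        {"type": "hashtag", "value": "exampletag", "action": "reject"},
        {"type": "keyword", "value": "example phrase", "action": "flag"},
    ],
}
```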
Integrated reporting to NCMEC's CyberTipline would make life easier for admins and increase the likelihood that those reports get filed at all. Even without attestation, the big instances should all be using PhotoDNA; it's unclear whether anyone on the Fediverse is even doing this, given that they'd have to manually hack it in. UI needs to be added to mainline Mastodon to allow for it; it's a very simple pair of REST calls that just need a couple of auth tokens.
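For the curious, the matching call is roughly the following. The endpoint and payload shape are from memory of the PhotoDNA Cloud Service docs, so treat the details as approximate, and the subscription key is only issued to vetted organizations.

```python
# Rough sketch of a PhotoDNA Cloud Service "Match" call. Endpoint and payload
# are approximate (check the current PhotoDNA Cloud Service docs); the
# subscription key comes from Microsoft after vetting.
import requests

PHOTODNA_MATCH_URL = "https://api.microsoftmoderator.com/photodna/v1.0/Match"
SUBSCRIPTION_KEY = "..."  # issued by the PhotoDNA Cloud Service

def photodna_match(image_url: str) -> dict:
    response = requests.post(
        PHOTODNA_MATCH_URL,
        headers={
            "Ocp-Apim-Subscription-Key": SUBSCRIPTION_KEY,
            "Content-Type": "application/json",
        },
        json={"DataRepresentation": "URL", "Value": image_url},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()  # indicates whether the image matched known hashes
```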
A pluggable system for content scanning and classifiers would also be useful. Right now Mastodon has webhooks, which aren't really a great match IMO. Something closer to Pleroma's MRF could be a starting point. Lastly, there's room for better tools for moderation: more specific child safety flows, escalation capabilities, and trauma prevention tools (e.g. default blurring of images in all user reports of CSAM or gore).
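To illustrate the shape of an MRF-style hook, here is a conceptual Python analogue (not Pleroma's actual Elixir interface, and not anything that exists in Mastodon today): each policy gets a chance to accept, rewrite, or reject an activity before it is persisted.

```python
# Conceptual sketch of an MRF-style pluggable filter pipeline. This mirrors the
# general shape of Pleroma's MRF but is not any project's real API.
from typing import Callable, Optional

Activity = dict
Policy = Callable[[Activity], Optional[Activity]]   # return None to reject

KNOWN_BAD_URLS: set[str] = set()   # stand-in for a real hash-database lookup

def run_policies(activity: Activity, policies: list[Policy]) -> Optional[Activity]:
    for policy in policies:
        result = policy(activity)
        if result is None:
            return None          # a policy rejected the activity outright
        activity = result        # a policy may also rewrite it (e.g. force a CW)
    return activity

def example_media_scan_policy(activity: Activity) -> Optional[Activity]:
    # Hypothetical policy: drop activities whose attachments match a shared list.
    for attachment in activity.get("object", {}).get("attachment", []):
        if attachment.get("url") in KNOWN_BAD_URLS:
            return None
    return activity
```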
By the way, now that we have big players like Meta entering the Fediverse, it would be great if they could sponsor some development on child safety tooling for Mastodon and other large ActivityPub implementations, as well as work with an outside organization to make a hosted hash database clearinghouse for the Fediverse. It would be quite cheap for them, and would make the ecosystem as a whole a lot nicer. /thread
@det I was talking about this in another thread: there really needs to be a pluggable "fediblock" API that would allow server admins to automatically block bad instances and propagate those blocks to whoever is listening. Pluggable (as in, you can point it at your own list source, so to speak) because of the fedi's distrust of centralized services. A content scanning API would be nice as well.
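A rough version of the consumption side can already be scripted against Mastodon's admin API (4.x has domain-block endpoints); something like the sketch below, where the list URL and its JSON shape are placeholders.

```python
# Rough sketch: pull a published domain blocklist and apply it via Mastodon's
# admin API (POST /api/v1/admin/domain_blocks, available in Mastodon 4.x).
# The list URL and its JSON shape are hypothetical; pagination is ignored.
import requests

LIST_URL = "https://blocklist.example/list.json"   # hypothetical published list
INSTANCE = "https://myinstance.example"
TOKEN = "..."                                       # admin-scoped access token

def sync_domain_blocks() -> None:
    domains = requests.get(LIST_URL, timeout=30).json()["domains"]
    headers = {"Authorization": f"Bearer {TOKEN}"}
    existing = {
        block["domain"]
        for block in requests.get(
            f"{INSTANCE}/api/v1/admin/domain_blocks", headers=headers, timeout=30
        ).json()
    }
    for domain in domains:
        if domain not in existing:
            requests.post(
                f"{INSTANCE}/api/v1/admin/domain_blocks",
                headers=headers,
                json={"domain": domain, "severity": "suspend"},
                timeout=30,
            )
```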
@me For the former, there are external tools that do exactly this, but they take some effort to set up and a lot of admins (particularly new ones) won't know to implement them. Having UI around it in the base install would be helpful.
@det is PhotoDNA AGPL? If not, there is zero reason to trust it does what it says it does, and isn't also searching for politically dangerous content, and alerting authorities in those countries, many of which are openly hostile to individual liberties.
@det I wrote about my vision for these topics, as a (new) Mastodon team member: https://renchap.com/blog/post/evolving_mastodon_trust_and_safety/
The big issue is: we don't have enough resources :( There is only one full-time dev on the project, I am working far too many hours per week as a volunteer, Eugen has many other duties, and we are struggling to keep up with maintenance work, so big features like this, which we do want, are very hard to develop.
Hopefully we will be able to get more money, but for now this is very hard.
@det I would be happy to discuss this further with you if you want :)
@renchap An unenviable position for sure. I've wanted to have some of our research assistants write T&S tooling for Mastodon, but since they need to learn Ruby/RoR first and they only have limited hours, it's not come to fruition as of yet.
We do have experience integrating with pDNA/NCMEC/etc though, so I will drop you a line and see if we can be of any help.
Someone should really diff the blocklists of mastodon.social and mastodon.online :-(.
I just found out two domains were missing from the latter. Which are sadly related to this discussion.
I reported them all as such, but it's a demoralizing oversight.
Also a third related domain that wasn't on either, but it was on the Oliphant/Seirdy blocklists. Sucks that we don't seem to have robust co-operation here.
@sourcejedi_ben @det thanks, relayed to our mod team
@det i was, at one point in time, writing an MRF policy which integrated with the PhotoDNA REST API (and the hashes). i got tired of playing in the same sandbox as alex gleason though.
@ariadne That jibes with my perception of the Pleroma ecosystem, yeah — nice tech and terrible servers
@det a lot of self-inflicted wounds, yes
I would never trust Meta to create or maintain the tooling for something as important and necessary as policing CSAM. It's an appalling shame that the ActivityPub specification did not account for moderation tools or CSAM blockers, but Meta would never give you those tools for free. They would rather use it as leverage to bend the entire ActivityPub spec to their whim, playing out the "Extend" phase of Embrace, Extend, Extinguish.
It's a Faustian bargain. Those tools need to be developed, absolutely, but by the open source community, not a profit driven amoral company.
@det I don't think Meta can be trusted to do anything positive...
@det
Meta do seem to have released a couple of open source fingerprinting projects:
• Hasher-Matcher-Actioner (HMA) around terror/mass-shooter content: https://about.fb.com/news/2022/12/meta-launches-new-content-moderation-tool/
• PDQ/TMK for photo-hashing/similarity-checking https://github.com/facebook/ThreatExchange/tree/main/pdq • https://about.fb.com/news/2019/08/open-source-photo-video-matching/
I've no idea if the projects are used outside of Meta or are well-regarded?
I'd be interested to see @mozilla active here too. Perhaps with a new fediverse-safety focus to their (closed) MOSS funding https://www.mozilla.org/en-US/moss/.
@nicol @mozilla PDQ is widely used outside of Meta, TMK and vPDQ less so but still in use. I have yet to take HMA for a spin.
Even apart from PhotoDNA, having something that feeds all actioned CSAM (or other noxious content) into a PDQ/vPDQ db that can be shared with trusted instances would be useful.
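For reference, the hashing side is pretty small with the open-source PDQ bindings; matching is just Hamming distance over 256-bit hashes, and the 31-bit threshold below is a commonly cited starting point rather than gospel.

```python
# Minimal sketch: hash actioned media with PDQ and check new uploads against a
# set of hashes shared by trusted instances. Uses the open-source `pdqhash`
# bindings; the 31-bit Hamming-distance threshold is a common starting point.
import cv2
import pdqhash

def pdq_bits(path: str) -> str:
    image = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2RGB)
    bits, quality = pdqhash.compute(image)      # 256-bit vector plus quality score
    return "".join(str(b) for b in bits)

def hamming(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))

def matches_shared_db(path: str, shared_hashes: set[str], threshold: int = 31) -> bool:
    h = pdq_bits(path)
    return any(hamming(h, known) <= threshold for known in shared_hashes)
```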
@det@hachyderm.io Cloudflare has an automated CSAM scanning tool that works in conjunction with NCMEC. All uploads are automatically scanned for CSAM and the service is free.
@gme Something else that would be nice to see: cloud storage providers offering scanning for things uploaded to their buckets.
I do think that reporting ultimately needs to be at the server level though, because you'll typically send IP and user data with those reports.
@det@hachyderm.io Cloudflare already does that for its R2 cloud storage. ;-) All of my instances use Cloudflare R2 and they all have CSAM detection enabled. A report is automatically generated to NCMEC on detection, and I get an email so I can delete the content and ban the user.
@gme Thank you for the info and for putting in the effort to do that!
@det those hash DBs and the accompanying scanning software should be licensed under the AGPL, and made secure against attacks to prevent evasion.
@det Collisions should NEVER happen. If they can happen it is because the hash function in use is weak. SHA-256, SHA-384, SHA-512, Blake2b, Blake2s, and SHA-3 are all currently invulnerable to collision attacks. A hash database that uses MD5 or SHA-1 should not exist in 2023.
@det Did you study how defederated those problem servers were? I get that this still allows the damage to continue on those, but it would be good to know how much or how little the content spread or propagated to the larger Fediverse.
@tchambers Briefly, but not thoroughly enough to include in this report. It appears Japanese instances are fairly isolated, but around 5% of the instances with detections were outside the Japanese/free-speech bloc. Also, small servers can of course slip through pretty easily.