In what is hopefully my last child safety report for a while: a report on how our previous reports on CSAM issues intersect with the Fediverse.
https://cyber.fsi.stanford.edu/io/news/addressing-child-exploitation-federated-social-media
Similar to how we analyzed Twitter in our self-generated CSAM report, we did a brief analysis of the public timelines of prominent servers, processing media with PhotoDNA and SafeSearch. The results were legitimately jaw-dropping: our first pDNA alerts started rolling in within minutes. The true scale of the problem is much larger, as we inferred by cross-referencing CSAM-related hashtags with SafeSearch level-5 nudity matches.
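For the curious, the approach is roughly the following. This is a minimal sketch rather than our actual pipeline: the instance name, the scanning endpoint, and the response shape are all placeholders, since real PhotoDNA access is gated and its API differs.

```python
# Minimal sketch: pull media from an instance's public timeline and submit it
# to a PhotoDNA-style matching service. INSTANCE, SCAN_URL, and the response
# shape ("match") are placeholders, not real endpoints.
import requests

INSTANCE = "https://example.social"
SCAN_URL = "https://scanner.example/match"

def public_media_urls(instance: str, limit: int = 40):
    """Yield image URLs from the instance's public timeline (Mastodon v1 API)."""
    resp = requests.get(f"{instance}/api/v1/timelines/public",
                        params={"limit": limit}, timeout=30)
    resp.raise_for_status()
    for status in resp.json():
        for attachment in status.get("media_attachments", []):
            if attachment.get("type") == "image":
                yield attachment["url"]

def scan(url: str) -> dict:
    """Download an image and submit it to the (hypothetical) matching service."""
    image = requests.get(url, timeout=30).content
    result = requests.post(SCAN_URL, files={"image": image}, timeout=30)
    result.raise_for_status()
    return result.json()

if __name__ == "__main__":
    for media_url in public_media_urls(INSTANCE):
        if scan(media_url).get("match"):
            print("ALERT:", media_url)
```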
Hits were primarily on a not-to-be-named Japanese instance, but a secondary test to see how far they propagated showed them getting federated to other servers. A number of matches were also detected in posts originating from the big mainstream servers. Some of the posts that triggered matches were eventually removed, but the origin servers did not seem to consistently send "delete" events when that happened, which I hope doesn't mean the other servers just continued to store the material.
The Japanese server problem is often thought to mean "lolicon" or CG-CSAM, but it appears that servers that allow computer-generated imagery of kids also attract users posting and trading "IRL" materials (their words, clear from post and match metadata), as well as grooming and the swapping of CSAM chat group identifiers. This is not altogether surprising, but it is another knock against the excuses of lolicon apologists.
Traditionally the solution here has been to defederate from freezepeach servers and...well, all of Japan. This is commonly framed as a feature and not a bug, but it's a blunt instrument, and it allows the damage to continue. With the right tooling, it might be possible to get the large Japanese servers to at least crack down on material that's illegal there (which non-generated, non-illustrated CSAM is).
I have argued for a while that the Fediverse is way behind in this area; part of this is a lack of tooling and a reliance on user reports, but part is architectural. CSAM-scanning systems work in one of two ways: hosted, like PhotoDNA, or via privately distributed hash databases. The former is a problem because every server hitting PhotoDNA separately for the same images doesn't scale. The latter is a problem because widely distributed hash databases allow for crafting evasions or collisions.
I think for this particular issue to be resolved, a couple of things need to happen. First, an ActivityPub implementation of content-scanning attestation should be developed, allowing origin servers to perform scanning via a remote service and other servers to verify that it happened. Second, for the hash databases that are privately distributed (e.g. Take It Down, NCMEC's NCII database), someone should probably take on turning these into a hosted service.
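To make the first idea concrete, here's what an attestation stapled to an ActivityPub object could look like. To be clear, no such vocabulary exists today; the scan: namespace and every property under it are invented placeholders for the proposal, not a spec.

```python
# Purely illustrative: a hypothetical "scanning attestation" embedded in an
# ActivityPub Note, expressed as a Python dict. Every scan:* property is made up.
attestation_example = {
    "@context": [
        "https://www.w3.org/ns/activitystreams",
        {"scan": "https://example.org/ns/content-scanning#"},  # hypothetical extension context
    ],
    "type": "Note",
    "id": "https://origin.example/notes/123",
    "attachment": [{"type": "Image", "url": "https://origin.example/media/abc.png"}],
    "scan:attestation": {
        "scan:service": "https://scanner.example/",  # remote scanning service the origin used
        "scan:database": "photodna",                 # which hash list was consulted
        "scan:result": "no-match",
        "scan:verifiedAt": "2023-07-24T00:00:00Z",
        "scan:signature": "...",                     # signed by the scanning service so receivers can check it
    },
}
```

The point is that receiving servers could verify the signature against the scanning service's key instead of either trusting the origin blindly or re-scanning everything themselves.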
@det@hachyderm.io Cloudflare has an automated CSAM scanning tool that works in conjunction with NCMEC. All uploads are automatically scanned for CSAM and the service is free.
@gme Something else that would be nice to see is cloud storage providers offering scanning for things uploaded to their buckets.
I do think that reporting ultimately needs to be at the server level though, because you'll typically send IP and user data with those reports.
@det@hachyderm.io Cloudflare already does that for its R2 cloud storage. ;-) All of my instances use Cloudflare R2 and they all have CSAM detection enabled. A report is automatically generated to NCMEC on detection, and I get an email so I can delete the content and ban the user.
@gme Thank you for the info and for putting in the effort to do that!
@by_caballero@mastodon.social @det@hachyderm.io I'm not using Wildebeest; I never could get it working. I'm using Firefish.
@by_caballero@mastodon.social @det@hachyderm.io I used the Ubuntu installer script (on 22.04).
Had to manually change all references to calckey to firefish when prompted.
Had to manually point it at the correct repo: https://git.joinfirefish.org/firefish/firefish.git
I use Cloudflare, so I said yes when prompted for Cloudflare credentials and the Global API key.
I have my Postgres database on a separate server on an RFC 1918 network, so I set up the database user and created the database ahead of time. Also had to make sure Postgres listens on the appropriate interface in postgresql.conf and that access is set up properly in pg_hba.conf (example below).
And I installed Redis locally.
And then it pretty much installed and worked.
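For anyone replicating the remote-Postgres part: besides creating the role and database ahead of time in psql, it comes down to two small config changes on the database host. The names and RFC 1918 addresses below are made-up examples, not my actual setup.

```
# postgresql.conf: listen on the internal interface, not just localhost
listen_addresses = 'localhost,10.0.0.5'        # 10.0.0.5 = example address of the DB host

# pg_hba.conf: let the Firefish host authenticate with a password
host  firefish  firefish  10.0.0.10/32  scram-sha-256
```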
Then it's a matter of setting up the bucket in Cloudflare, attaching the bucket to one of your subdomains (I used cdn.bofh.social, f/ex), and generating an API key. In the Firefish object storage settings:
Base URL is your subdomain: https://cdn.bofh.social for me.
Bucket is the name of your bucket: bofh-social for me.
Prefix is also the name of your bucket: bofh-social for me.
Endpoint is the endpoint that Cloudflare gives you, but without anything after the FQDN: https://--random string of letters and numbers--.r2.cloudflarestorage.com is what I use.
Region is auto.
You'll need to obtain your Access Key and Secret Key from Cloudflare for the bucket.
Use SSL is checked.
Set "public-read" is checked.
S3ForcePathStyle is checked.
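If you want to sanity-check the endpoint and keys before pointing Firefish at them, something like this works against any S3-compatible store. The bucket name is mine from above; the endpoint and credentials are placeholders, so substitute your own.

```python
# Quick sanity check of the R2 endpoint/credentials described above.
import boto3
from botocore.config import Config

s3 = boto3.client(
    "s3",
    endpoint_url="https://<account-id>.r2.cloudflarestorage.com",  # nothing after the FQDN
    aws_access_key_id="<access key>",
    aws_secret_access_key="<secret key>",
    region_name="auto",                              # matches the "Region is auto" setting
    config=Config(s3={"addressing_style": "path"}),  # the boto3 equivalent of S3ForcePathStyle
)

# List a few objects to confirm the bucket is reachable with these credentials.
resp = s3.list_objects_v2(Bucket="bofh-social", MaxKeys=5)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```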
And one last thing...
You need to go into the bucket settings in Cloudflare and DELETE the object lifecycle rule. By default Cloudflare inserts a rule into all buckets to delete objects after 7 days!
But that's about it.
@by_caballero@mastodon.social @det@hachyderm.io Knock yourself out. The Ubuntu script makes it super easy. Just run it as root and let it do its thing. :-) Good luck!