Over the years, I made a handful of maps of various things in Cambridge; I have collected some, but not all of them, on this page about housing things in Cambridge.

This includes things like maps of where you could legally build a fourplex (short answer: not many places!), the distribution of tax paid per parcel (Kendall Square pays a lot!), and more.

crschmidt.net/housing/cambridg

crschmidt.net · Housing Explorations in Cambridge · Housing-related explorations in Cambridge.

Fun fact: sharing this link on Mastodon caused my server to serve 112,772,802 bytes of data, in 430 requests, over the 60 seconds after I posted it (>7 r/s). Not because humans wanted them, but because of the LinkFetchWorker, which kicks off 1-60 seconds after Mastodon indexes a post (and possibly before it's ever seen by a human).

Every Mastodon instance fetches and stores its own local copy of my 750 KB preview image.
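Working those numbers through (straight division of the figures above, nothing extra assumed):

```python
# Figures from the 60-second window described above.
bytes_served = 112_772_802
requests = 430
window_s = 60

print(f"{requests / window_s:.1f} requests/second")            # -> 7.2
print(f"{bytes_served / requests / 1024:.0f} KiB per request")  # -> ~256
print(f"{bytes_served / (1024 * 1024):.0f} MiB in one minute")  # -> ~108
```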

(I was inspired to look by @jwz's post: mastodon.social/@jwz/109411593.)

jwz (@jwz@mastodon.social): Mastodon stampede. "Federation" now apparently means "DDoS yourself." Every time I do a new blog post, within a second I have over a thousand simultaneous hits of that URL on my web server from unique IPs. Load goes over 100, and mariadb stops... https://jwz.org/b/yj6w

@crschmidt well, this sounds like a p0 bug. Mastodon is going into robots.txt on many servers once this gets noticed widely.

@cshabsin Don't worry! I just confirmed that Mastodon doesn't respect robots.txt for any of these fetches, so even if it's added to robots.txt, it will have no effect!
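For contrast, here is roughly what honoring robots.txt before a preview fetch could look like, using Python's standard library. This is only a sketch of the general technique, not Mastodon's implementation, and the user-agent string is a placeholder:

```python
from urllib.parse import urlparse
from urllib.robotparser import RobotFileParser

def allowed_to_fetch(url: str, user_agent: str = "ExamplePreviewFetcher") -> bool:
    """Return True if the site's robots.txt permits fetching `url` (sketch only)."""
    parts = urlparse(url)
    parser = RobotFileParser(f"{parts.scheme}://{parts.netloc}/robots.txt")
    parser.read()  # fetch and parse the site's robots.txt
    return parser.can_fetch(user_agent, url)

# Only generate a preview card when the site allows it.
if allowed_to_fetch("https://example.com/some/post"):
    ...  # fetch the page and build the preview card
```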

@crschmidt @cshabsin that definitely seems ... inappropriate.

@tw @crschmidt @cshabsin link preview bots all ignore robots.txt, so mastodon is at least following precedent here.

Except that I think Mastodon's implementation is wrong: on a centralized network the preview is created at the 'request' of the person sharing, so robots.txt doesn't apply. But here it's created fully automatically, so it really should apply. The fix would be to capture the site at sharing time and send it along in the post, which is also more efficient (though prone to abuse?).
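To make that idea concrete: the sender's server would build the card once and ship it with the post, so receiving servers never need to hit the target site. The shape below is invented purely for illustration (there is no `preview_card` field in ActivityPub or Mastodon today); it only shows the general approach:

```python
# Hypothetical outgoing post payload: the author's server fetched the page
# once at share time and embedded the preview card, so federated servers can
# render the preview without re-fetching the linked site themselves.
# (The "preview_card" field is made up for this sketch.)
post = {
    "content": "Over the years, I made a handful of maps ... crschmidt.net/housing/cambridg",
    "preview_card": {
        "url": "https://crschmidt.net/housing/cambridge/",
        "title": "Housing Explorations in Cambridge",
        "description": "Housing-related explorations in Cambridge.",
        "image_url": "https://files.example-instance.social/cache/preview.jpg",
    },
}
```

The obvious trade-off, as noted above, is trust: receiving servers would render a card the sender claims is accurate rather than one they fetched themselves.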

@jefftk @tw @cshabsin yeah, "prone to abuse" and "hard to standardize across all implementations" are the reasons it was rejected in 2017 and has languished as an untouched feature request since 2020, respectively. Time to rethink that. (I don't love that a single implementation is 95% of the fediverse, but it is; standardization is frankly secondary to making sure the core implementation works well.)

@gme @crschmidt @cshabsin @jefftk That's a pretty dismissive take on software violating an agreed-upon Internet standard...

I read the blog post and at the very top OP even admits that Mastodon is not a crawler. So what "standard" is being broken?

@gme @crschmidt @tw @cshabsin where do you see that in the blog post? I agree that scraping a preview isn't crawling if you do it at send time, but doing it automatically at retrieve time is.

haliphax 👾

@gme @crschmidt @tw @cshabsin @jefftk
Lack of alt text in a text only image... 😭 I think many clients even offer to OCR it for you. This small thing goes a long way to making this place more open and accessible! 🙏