Hachyderm @hachyderm

**Christopher Schmidt** @crschmidt@better.boston · Nov 26, 2022

Over the years, I made a handful of maps of various things in Cambridge; I have collected some, but not all of them, on this page about housing things in Cambridge.

This includes things like maps of where you could legally build a fourplex (short answer: not many places!); the distribution of tax paid per parcel (Kendall Square pays a lot!) and more.

https://crschmidt.net/housing/cambridge/

crschmidt.net: Housing Explorations in Cambridge. Housing-related explorations in Cambridge.

#Projects #CambMA #Cambridge

**Christopher Schmidt** @crschmidt@better.boston · Nov 26, 2022

Fun fact: sharing this link on Mastodon caused my server to serve 112,772,802 bytes of data, in 430 requests, over the 60 seconds after I posted it (>7 r/s). Not because humans wanted them, but because of the LinkFetchWorker, which kicks off 1-60 seconds after Mastodon indexes a post (and possibly before it's ever seen by a human).

Every Mastodon instance fetches and stores its own local copy of my 750 kB preview image.

(I was inspired to look by @jwz's post: https://mastodon.social/@jwz/109411593248255294.)

Mastodon: jwz (@jwz@mastodon.social): Mastodon stampede. "Federation" now apparently means "DDoS yourself." Every time I do a new blog post, within a second I have over a thousand simultaneous hits of that URL on my web server from unique IPs. Load goes over 100, and mariadb stops... https://jwz.org/b/yj6w
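
To put the figures from the post above in perspective, here is a minimal sketch of the arithmetic behind the ">7 r/s" claim. It uses only the three numbers quoted in the post; the variable names are illustrative, and the derived averages are not figures the author reported.

```python
# Figures quoted in the post above.
total_bytes = 112_772_802
requests = 430
window_seconds = 60

# Derived rates (not stated in the post, computed here for illustration).
requests_per_second = requests / window_seconds
avg_bytes_per_request = total_bytes / requests
bytes_per_second = total_bytes / window_seconds

print(f"{requests_per_second:.2f} requests/s")                        # 7.17 requests/s
print(f"{avg_bytes_per_request:,.0f} bytes per request on average")   # 262,262 bytes
print(f"{bytes_per_second / 1e6:.2f} MB/s sustained")                 # 1.88 MB/s
```

The average of roughly 262 kB per request is well below the size of the preview image, which would be consistent with the 430 requests being a mix of page and image fetches; that reading is an assumption, not something the post states.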

**Chris Shabsin** @cshabsin · Nov 26, 2022

@crschmidt well, this sounds like a p0 bug. Mastodon is going into robots.txt on many servers once this gets noticed widely.

**Christopher Schmidt** @crschmidt@better.boston · Nov 26, 2022

@cshabsin Don't worry! I just confirmed that Mastodon doesn't respect robots.txt for any of these fetches, so even if it's added to robots.txt, it will have no effect!

**Tim W** @tw@cantos.social · Nov 26, 2022

@crschmidt @cshabsin that definitely seems ... inappropriate.

**Jeff Kaufman** @jefftk@mastodon.mit.edu · Nov 27, 2022

@tw @crschmidt @cshabsin link preview bots all ignore robots.txt, so mastodon is at least following precedent here.

Except that I think Mastodon's implementation is wrong: on a centralized network the preview is created at the 'request' of the person sharing, so robots.txt doesn't apply. But here it's created fully automatically, so it really should apply. The fix would be to capture the site at sharing time and send it along in the post, which is also more efficient (though prone to abuse?)
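
For context on what honoring robots.txt would involve on the fetching side, here is a minimal sketch using Python's standard `urllib.robotparser`. It is not Mastodon's code. The bot name `ExamplePreviewBot`, the robots.txt contents, the helper `may_fetch`, and the example.com URL are all hypothetical.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a site owner might publish; the bot name is made up.
ROBOTS_TXT = """\
User-agent: ExamplePreviewBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts the file's lines directly, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


url = "https://example.com/housing/"
print(may_fetch("ExamplePreviewBot", url))  # False: explicitly disallowed
print(may_fetch("SomeOtherBot", url))       # True: falls through to the * rule
```

A fetcher that respected robots.txt would run a check like this before requesting the page or its preview image, and skip the preview when the answer is False.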

**Christopher Schmidt** @crschmidt@better.boston · Nov 27, 2022

@jefftk @tw @cshabsin yeah, "prone to abuse" and "hard to standardize across all implementations" are the reasons it was rejected in 2017 and has languished as an untouched feature request since 2020, respectively. Time to rethink that. (I don't love that a single implementation is 95% of the fediverse, but it is; standardization is frankly secondary to making sure the core implementation works well.)

**Jeff Kaufman** @jefftk@mastodon.mit.edu · Nov 27, 2022

@crschmidt @tw @cshabsin Follow-up https://mastodon.mit.edu/@jefftk/109416209502343043 https://www.jefftk.com/p/mastodons-dubious-crawler-exemption https://github.com/mastodon/mastodon/issues/21738

mastodon.mit.edu: Jeff Kaufman (@jefftk@mastodon.mit.edu): Either Mastodon's link preview bot should obey robots.txt or Mastodon needs O(1) link previews: https://www.jefftk.com/p/mastodons-dubious-crawler-exemption

**George Ellenburg (he/him/his)** @gme@bofh.social · Nov 27, 2022

It's 2022. Use a CDN. Cloudflare is free.

**Tim W** @tw@cantos.social · Nov 27, 2022

@gme @crschmidt @cshabsin @jefftk That's a pretty dismissive take on software violating an agreed-upon Internet standard...

**George Ellenburg (he/him/his)** @gme@bofh.social · Nov 27, 2022

I read the blog post and at the very top OP even admits that Mastodon is not a crawler. So what "standard" is being broken?

**Jeff Kaufman** @jefftk@mastodon.mit.edu · Nov 27, 2022

@gme @crschmidt @tw @cshabsin where do you see that in the blog post? I agree that scraping a preview isn't crawling if you do it at send time, but doing it automatically at retrieve time is.

**George Ellenburg (he/him/his)** @gme@bofh.social · Nov 27, 2022

Also:

haliphax @haliphax@hachyderm.io

@gme @crschmidt @tw @cshabsin @jefftk
Lack of alt text in a text only image... I think many clients even offer to OCR it for you. This small thing goes a long way to making this place more open and accessible!

Nov 28, 2022, 01:42 PM · Web
