Hilarious: TIL cgit does unbuffered writes when generating its HTML output, several bytes at a time whenever it has to do entity escaping. And it's STILL orders of magnitude faster than any other web-based git browsing software. 🤦

@dalias Surely it doesn't have TCP_NODELAY, right? The kernel is still going to aggregate those writes into more sensibly shaped frames, I hope :(

@astraleureka The httpd already has to do that anyway because it needs to determine content-length, I think. It's just gratuitously having hundreds of times more syscall entry/exit overheads than it should. But that doesn't matter because that's still orders of magnitude smaller than all the other web stuff layers.

@dalias Content-length is optional (e.g. streamed payloads), CGI programs are responsible for supplying the headers and can omit everything but Content-type in the simplest case. The syscall noise is wasteful but yeah, probably not noticeable on anything remotely modern :D

I've found that a lot of crap-quality CGI programs do this, and it does become measurably bad in environments like low-powered home routers/APs.

@astraleureka @dalias Yes, but streaming without chunked and without Content-Length means you rely on EOF to determine the end of your payload, and that's crappy for several reasons:
1. You cannot reuse the connection for further HTTP exchanges, which negates the second-best benefit of HTTP/1.1.
2. You force the SSL layer to use close_notify at the end (else you get the truncation security issue that deprecated SSL 3). That prevents full duplex, which causes a shitload of annoyances with detecting when the TCP session ends (the Linux socket layer hates that and will never exit TCP_WAIT, so I've had to put timeouts in the application layer to prevent hanging processes).

It's basically HTTP/0.9, which we did away with for a reason. I don't want to support that as a first-class use case, so if there's no Content-Length, tipidee slurps the whole CGI output and computes the length itself before sending it to the client.

Hmmm. Maybe I should make it chunk the data itself instead.

@ska @astraleureka Chunking would be much better behavior.

@dalias @ska @astraleureka one case where buffering the whole response and computing the content-length wouldn't work is Internet radio streams (icecast, shoutcast, etc). They just stream data forever.

Granted, I'm not sure anyone out there is going to implement such streams as CGI/NPH scripts, so it's probably a moot point.


@jprjr @ska @astraleureka I think the best behavior would be chunking in some reasonable max unit size, but also ending a chunk early if there's data buffered and no new data seen for a few hundred ms.