In the old days, you could change a letter between upper and lower case by XORing its character code with 0x20. Of course, if you tried this with anything that wasn't a letter, you'd get nonsense results.
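For anyone who hasn't seen the old trick, here is a minimal sketch of it in Python. The helper name `xor_case` is made up for illustration; it isn't from any library.

```python
def xor_case(ch: str) -> str:
    """Flip bit 0x20 of a single character's code (illustrative helper)."""
    return chr(ord(ch) ^ 0x20)

# Letters: 'A' is 0x41 and 'a' is 0x61, so the two cases differ only in bit 0x20.
print(xor_case('a'), xor_case('Q'))   # A q

# Non-letters: the same flip just lands on some other character.
print(repr(xor_case('[')))            # '{'
print(repr(xor_case('1')))            # '\x11', a control character
```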

If you try that with code points, it sometimes works, and sometimes doesn't. But Unicode can deliver much more impressive nonsense when it doesn't.

A fun example I just found: the "lower-case" version of CAR is NO PEDESTRIANS.

>>> chr(ord('🚗') ^ 0x20)
'🚷'
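To see which code points are involved, the standard `unicodedata` module can print their formal names. This is only a quick check of the example above; note that Unicode's formal name for the car emoji is AUTOMOBILE.

```python
import unicodedata

car = '🚗'
flipped = chr(ord(car) ^ 0x20)   # same XOR as in the example above

for ch in (car, flipped):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
# U+1F697 AUTOMOBILE
# U+1F6B7 NO PEDESTRIANS
```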

@simontatham Pardon my straying from the joke, but FYI: libc has a super compact representation for finding the rule for case mapping for a particular character. Under 5k for tables plus code. We don't have the car emoji mapping tho.

Simon Tatham

@dalias indeed, Unicode does carefully specify its own rule for translating between upper and lower case, which will tell you when there's no such mapping available, and also work correctly when there is one but it doesn't follow the 'xor with 0x20' rule.
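As a rough illustration of that difference, Python's built-in `str.lower()` and `str.upper()` apply Unicode case mappings rather than bit flipping. This is a stand-in for the idea, not the libc implementation mentioned above.

```python
# Characters that have a case mapping are converted,
# even though Σ and σ don't follow the 'xor with 0x20' pattern.
print('Σ'.lower())   # σ
print('σ'.upper())   # Σ

# Characters with no case mapping come back unchanged.
print('🚗'.lower() == '🚗')   # True
print('🚗'.upper() == '🚗')   # True
```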

(Fun fact: the 'xor with 0x20' rule works for half the Greek alphabet but not the other half, because the two cases of Greek are separated by 0x20 but each one starts about 0x10 into a 0x20-aligned block, so it straddles a boundary where bit 0x20 changes. E.g. the xor rule maps Γ to γ as you'd like, but Σ and σ each map to an unrelated code point.)
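A quick way to check the Greek claim, sketched in Python: compare the XOR result with `str.lower()` for each basic Greek capital. The variable names are illustrative only.

```python
# Basic Greek capitals run from U+0391 (Α) to U+03A9 (Ω); U+03A2 is unassigned.
capitals = [chr(c) for c in range(0x0391, 0x03AA) if c != 0x03A2]

# A capital "works" if flipping bit 0x20 gives the same result as lower().
works = [u for u in capitals if chr(ord(u) ^ 0x20) == u.lower()]
fails = [u for u in capitals if u not in works]

print(''.join(works))   # ΑΒΓΔΕΖΗΘΙΚΛΜΝΞΟ (15 letters)
print(''.join(fails))   # ΠΡΣΤΥΦΧΨΩ (9 letters)

# Where Σ and σ end up instead of each other:
print(f"U+{ord('Σ') ^ 0x20:04X}", f"U+{ord('σ') ^ 0x20:04X}")   # U+0383 U+03E3
```

The split falls between Ο (U+039F) and Π (U+03A0), which is where bit 0x20 of the code point changes from clear to set.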

But if I'd used the proper Unicode case mapping rules then the joke wouldn't have worked :-)