Why are QR Codes with capital letters smaller than QR codes with lower-case letters?

shkspr.mobi/blog/2025/02/why-a

Take a look at these two QR codes. Scan them if you like, I promise there's nothing dodgy in them.

   

Left is upper-case HTTPS://EDENT.TEL/ and right is lower-case https://edent.tel/

You can clearly see that the one on the left is a "smaller" QR as it has fewer bits of data in it. Both go to the same URL; the only difference is the casing.

What's going on?

Your first thought might be that there's a different level of error-correction. QR codes can have increasing levels of redundancy in order to make sure they can be scanned when damaged. But, in this case, they both have Low error correction.

The smaller code is "Version 1" - it is 21 * 21 modules. The larger is "Version 2" with 25 * 25 modules.

The official specification describes the versions in more detail. The smaller code should be able to hold 25 alphanumeric characters. But https://edent.tel/ is only 18 characters long. So why is it bumped into a larger code?

Using a decoder like ZXING it is possible to see the raw bytes of each code.

UPPER:

20 93 1a a6 54 63 dd 28   35 1b 50 e9 3b dc 00 ec   11 ec 11

lower:

41 26 87 47 47 07 33 a2   f2 f6 56 46 56 e7 42 e7   46 56 c2 f0 ec 11 ec 11   ec 11 ec 11 ec 11 ec 11   ec 11

You might have noticed that they both end with the same sequence: ec 11. Those are "padding bytes" because the data needs to completely fill the QR code. But - hang on! - not only does the UPPER one safely contain the text, it also has some spare padding?

The answer lies in the first couple of bytes.

Once the raw bytes have been read, a QR scanner needs to know exactly what sort of code it is dealing with. The first four bits tell it the mode. Let's convert the hex to binary and then split after the first four bits:

Type   HEX    BIN                Split
UPPER  20 93  00100000 10010011  0010 000010010011
lower  41 26  01000001 00100110  0100 000100100110

The UPPER code is 0010 which indicates it is Alphanumeric - the standard says the next 9 bits show the length of data.

The lower code is 0100 which indicates it is Byte mode - the standard says the next 8 bits show the length of data.

Type   HEX    BIN                Split
UPPER  20 93  00100000 10010011  0010 0000 10010
lower  41 26  01000001 00100110  0100 000 10010

Look at that! They both have a length of 10010 which, converted to decimal, is 18 - the exact length of the text.
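If you want to poke at the header yourself, here is a minimal Python sketch of that split. The function name read_header and the MODES table are my own inventions for illustration; the table only covers the two modes discussed here, with the length-field widths quoted above.

```python
# Mode indicator -> (name, width of the length field in bits).
# Only the two modes discussed in this post are listed.
MODES = {
    0b0010: ("alphanumeric", 9),
    0b0100: ("byte", 8),
}

def read_header(hex_bytes: str) -> tuple[str, int]:
    """Return (mode name, character count) from the first bytes of a QR data stream."""
    # Turn "20 93" into one long string of bits: "0010000010010011"
    bits = "".join(f"{int(b, 16):08b}" for b in hex_bytes.split())
    mode_name, count_width = MODES[int(bits[:4], 2)]  # first four bits are the mode
    length = int(bits[4:4 + count_width], 2)          # next 9 or 8 bits are the length
    return mode_name, length

print(read_header("20 93"))  # ('alphanumeric', 18)
print(read_header("41 26"))  # ('byte', 18)
```

Feeding it the first two bytes of each dump is enough, because the mode and length fields together take at most 13 bits.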

Alphanumeric uses 11 bits for every two characters, Byte mode uses (you guessed it!) 8 bits per single character.
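The arithmetic can be sketched in a few lines of Python. This is a rough budget rather than a full encoder: the function names are hypothetical, the capacities are simply the byte counts of the two dumps above (19 and 34 bytes), and the 6 bits for a leftover odd character is how I understand the standard rather than something shown in this post.

```python
# Data capacity in bits, taken from the lengths of the two raw dumps above.
CAPACITY_BITS = {1: 19 * 8, 2: 34 * 8}

def alphanumeric_bits(n_chars: int) -> int:
    # 4-bit mode + 9-bit length + 11 bits per pair (+ 6 bits for an odd character out)
    return 4 + 9 + 11 * (n_chars // 2) + 6 * (n_chars % 2)

def byte_bits(n_chars: int) -> int:
    # 4-bit mode + 8-bit length + 8 bits per character
    return 4 + 8 + 8 * n_chars

def smallest_version(bits: int):
    """Return the first listed version with enough room, or None if neither fits."""
    for version in sorted(CAPACITY_BITS):
        if bits <= CAPACITY_BITS[version]:
            return version
    return None

print(alphanumeric_bits(18), smallest_version(alphanumeric_bits(18)))  # 112 1
print(byte_bits(18), smallest_version(byte_bits(18)))                  # 156 2
```

So the 18 characters need 112 bits in Alphanumeric mode, which fits comfortably in the 152 bits of the smaller code, but 156 bits in Byte mode, which is just over the limit.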

But why is the lower-case code pushed into Byte mode? Isn't it using letters and numbers?

Well, yes. But in order to store data efficiently, Alphanumeric mode only has a limited subset of characters available. Digits, upper-case letters, and a handful of punctuation symbols: space $ % * + - . / :

Luckily, that's enough for a protocol, domain, and path. Sadly, no GET parameters, because ? = and & are not in the set.
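A quick way to check whether a given URL will stay in Alphanumeric mode is to test it against that character set. A small Python sketch follows; the helper names are made up for illustration, and the URL with a query string is a hypothetical example rather than a real page.

```python
# The 45 characters of Alphanumeric mode: digits, upper-case letters, and nine symbols.
ALPHANUMERIC = set("0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:")

def fits_alphanumeric(text: str) -> bool:
    """True if every character can be stored in Alphanumeric mode."""
    return all(ch in ALPHANUMERIC for ch in text)

def offending(text: str) -> list[str]:
    """The characters that would force the code into Byte mode."""
    return sorted(set(text) - ALPHANUMERIC)

print(fits_alphanumeric("HTTPS://EDENT.TEL/"))  # True
print(fits_alphanumeric("https://edent.tel/"))  # False
print(offending("HTTPS://EDENT.TEL/?Q=1"))      # ['=', '?'] (hypothetical URL)
```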

So, there you have it. If you want the smallest possible physical size for a QR code which contains a URL, make sure the text is all in capital letters.
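To convince yourself that this really is what is inside the smaller code, here is a minimal Python sketch that rebuilds the UPPER bytes from the text. It only covers the data bytes, not error correction or drawing the code. The function name is hypothetical, and a few details are my reading of the standard rather than anything shown above: each pair is stored as 45 * first + second, a leftover character takes 6 bits, and up to four zero bits terminate the data before the ec 11 padding.

```python
CHARSET = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"

def encode_alphanumeric(text: str, data_bytes: int = 19) -> str:
    """Encode text in Alphanumeric mode, padded to data_bytes (19 for the smaller code)."""
    values = [CHARSET.index(ch) for ch in text]   # ValueError on lower-case input
    bits = "0010" + f"{len(text):09b}"            # mode indicator + 9-bit length
    for i in range(0, len(values) - 1, 2):
        bits += f"{values[i] * 45 + values[i + 1]:011b}"  # two characters in 11 bits
    if len(values) % 2:
        bits += f"{values[-1]:06b}"               # odd character out in 6 bits
    bits += "0" * min(4, data_bytes * 8 - len(bits))      # terminator
    bits += "0" * (-len(bits) % 8)                # align to a whole byte
    out = [int(bits[i:i + 8], 2) for i in range(0, len(bits), 8)]
    pad = (0xEC, 0x11)                            # alternating padding bytes
    out += [pad[i % 2] for i in range(data_bytes - len(out))]
    return " ".join(f"{b:02x}" for b in out)

print(encode_alphanumeric("HTTPS://EDENT.TEL/"))
# 20 93 1a a6 54 63 dd 28 35 1b 50 e9 3b dc 00 ec 11 ec 11
```

The output matches the UPPER dump byte for byte, including the four spare padding bytes at the end.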

This blog post was exhibited at QR Show, NYC

Terence Eden’s Blog · Why are QR Codes with capital letters smaller than QR codes with lower-case letters?
Tanguy ⧓ Herrmann

@blog that's amazing. I didn't know about that.
As I like my QR codes tiny so they can be scanned from further away, I already created a URL shortener to redirect to it. But now, I'm gonna make it uppercase for this reason.

Thank you very much for this detailed explanation!