QR Code Demystified – Part 3

Now we’ll cover how the data is encoded. There are several steps involved. First, the encoding method is chosen, then the raw data is converted to binary based on the encoding method, then the error correction algorithm is applied, and then the data is placed in the symbol. Finally a mask is selected and applied.

For now, I’ll cover the encoding methods and the conversion to binary, and save the rest for later.

Before I go any further, I want to point out that you’ll need to be comfortable with the binary number system. Because explanations are easy to find and many programmers are already familiar with it, I’m going to assume this understanding. If you need an explanation or a refresher, try Wikipedia or a Google search. Further, I’m going to refer to adding bits to your data. Think of the data as a stream of bits that will later be placed in the correct order on the symbol. I’ll explain exactly how that’s done later, but for now, just imagine that you’re creating a chain of 1s and 0s. I might add spaces between some of the bits to make them more readable, but the spaces are not part of the binary data.
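If it helps to see the "chain of 1s and 0s" idea in code, here is a minimal sketch in Python. It keeps the stream as a plain string of "0" and "1" characters, which is slow but easy to read. The helper name to_bits is my own invention for illustration, not part of any QR library.

```python
def to_bits(value: int, width: int) -> str:
    """Return value as a zero-padded binary string exactly `width` bits long."""
    if value < 0 or value >= 1 << width:
        raise ValueError(f"{value} does not fit in {width} bits")
    return format(value, f"0{width}b")


# Build a stream by appending fixed-width fields one after another.
stream = ""
stream += to_bits(5, 8)  # 00000101
stream += to_bits(3, 4)  # 0011
print(stream)  # 000001010011
```

A real encoder would probably pack the bits into bytes, but a string makes it easy to compare your output with the examples in this tutorial.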

The encoding methods are Numeric, Alphanumeric, Binary, and Kanji. Numeric only supports the digits 0-9, but can store 3 of them in only 10 bits. Alphanumeric supports the letters A-Z (upper-case only), the digits 0-9, and the special characters $%*+-./: and space. It’s good for encoding upper-case URLs and simple text, and it takes 11 bits to store 2 alphanumeric characters. Binary data is stored at 8 bits per character, and supports the 256 characters in the extended ASCII table. Kanji takes 13 bits for a single character. I won’t go into detail on Kanji, because I’m guessing that very few people reading this tutorial will need to encode Japanese. So if you do, you’ll have to settle for Romaji or find the details elsewhere.
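One way to choose the method is to pick the most compact one that supports every character in your data. The sketch below does that for the three methods covered here. The function name choose_mode and the constant names are hypothetical, and falling back to Binary for everything else is my simplifying assumption (it does not check that each character actually fits in 8 bits).

```python
DIGITS = "0123456789"
# The 45 characters of alphanumeric mode: digits, upper-case letters,
# space, and the special characters $%*+-./:
ALPHANUMERIC_CHARS = DIGITS + "ABCDEFGHIJKLMNOPQRSTUVWXYZ" + " $%*+-./:"


def choose_mode(text: str) -> str:
    """Pick the most compact method that can represent every character."""
    if all(ch in DIGITS for ch in text):
        return "numeric"
    if all(ch in ALPHANUMERIC_CHARS for ch in text):
        return "alphanumeric"
    # Assumption: anything else is stored as 8-bit binary data.
    return "binary"


print(choose_mode("123456"))       # numeric
print(choose_mode("HELLO WORLD"))  # alphanumeric
print(choose_mode("Hello"))        # binary (lower-case letters)
```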

In order to encode data with one of these methods, we first indicate which method we’re using and how much data we’re storing. We indicate the method with four bits: Numeric is 0001, Alphanumeric is 0010, Binary is 0100, and Kanji is 1000. Next comes the data length, which is the number of characters being stored. The encoding method and the symbol version together determine how many bits we use to indicate the data length. Details are in Table 1 below. As an example, if we’re encoding 5 binary characters to a Version 1 symbol, the binary data we start with will be 0100 00000101. (0100 indicates binary, and 00000101 is the 8-bit representation of 5, the data length.) It is also possible to mix different methods in one symbol by appending a new method/size indicator after the previous data, followed by the new data. (In other words: Method1, Size1, Data1, Method2, Size2, Data2.)

Table 1 – Bits used to indicate data length

Method	Versions 1-9	Versions 10-26	Versions 27-40
Binary	8 bits	16 bits	16 bits
Alphanumeric	9 bits	11 bits	13 bits
Numeric	10 bits	12 bits	14 bits
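Here is a rough sketch of how the method indicator and the length field could be built from Table 1. The names MODE_INDICATOR, LENGTH_BITS, length_field_width, and segment_header are my own for illustration. Kanji is left out because it isn’t covered in this tutorial.

```python
MODE_INDICATOR = {"numeric": "0001", "alphanumeric": "0010", "binary": "0100"}

# Table 1: length field width for versions 1-9, 10-26, and 27-40.
LENGTH_BITS = {
    "binary": (8, 16, 16),
    "alphanumeric": (9, 11, 13),
    "numeric": (10, 12, 14),
}


def length_field_width(mode: str, version: int) -> int:
    """Look up how many bits the data length takes for this method and version."""
    if not 1 <= version <= 40:
        raise ValueError("version must be between 1 and 40")
    column = 0 if version <= 9 else 1 if version <= 26 else 2
    return LENGTH_BITS[mode][column]


def segment_header(mode: str, count: int, version: int) -> str:
    """Return the 4-bit method indicator followed by the character count."""
    width = length_field_width(mode, version)
    if count < 0 or count >= 1 << width:
        raise ValueError("character count does not fit in the length field")
    return MODE_INDICATOR[mode] + format(count, f"0{width}b")


# 5 binary characters in a Version 1 symbol, as in the example above.
print(segment_header("binary", 5, 1))  # 010000000101
```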

Once we’ve indicated what we’re storing, it’s time to actually add the data. Binary is the easiest: simply take the 8-bit representation of each character and add it to your data.

For numeric data, you take sets of three digits and encode each set directly to its 10-bit binary representation, so you encode “123456” as one hundred twenty-three followed by four hundred fifty-six (0001111011 0111001000). If you have 1 digit left at the end of your data, encode it in 4 bits, and if you have 2 digits left, encode them in 7 bits.

Alphanumeric data is a little trickier. You have to convert each character to its numerical value (see Table 2). Then you take pairs of characters, multiply the first value by 45, and add the second value. Then convert the result to 11-bit binary. If you end up with one character left over at the end of your data, encode its value in 6 bits. For example, in ABC the pair AB becomes 10 × 45 + 11 = 461, and the leftover C is 12. 461 in 11-bit binary is 00111001101 and 12 in 6-bit binary is 001100, so the encoded data is 00111001101001100.
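The binary and numeric rules translate to code fairly directly. This is just a sketch: the function names encode_binary and encode_numeric are made up for this example, and encode_binary assumes you have already turned your text into bytes.

```python
def encode_binary(data: bytes) -> str:
    """Each byte becomes its 8-bit representation."""
    return "".join(format(byte, "08b") for byte in data)


def encode_numeric(digits: str) -> str:
    """Groups of 3 digits take 10 bits; a leftover 2 take 7 bits, 1 takes 4."""
    if not all(ch in "0123456789" for ch in digits):
        raise ValueError("numeric data may only contain the digits 0-9")
    widths = {3: 10, 2: 7, 1: 4}
    chunks = []
    for i in range(0, len(digits), 3):
        group = digits[i:i + 3]
        chunks.append(format(int(group), f"0{widths[len(group)]}b"))
    return "".join(chunks)


print(encode_numeric("123456"))  # 00011110110111001000
print(encode_binary(b"A"))       # 01000001
```

Note that these functions only produce the data bits. The method indicator and length still have to go in front.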

Table 2 – Character values in alphanumeric mode

0	1	2	3	4	5	6	7	8	9	A	B	C	D	E
0	1	2	3	4	5	6	7	8	9	10	11	12	13	14

F	G	H	I	J	K	L	M	N	O	P	Q	R	S	T
15	16	17	18	19	20	21	22	23	24	25	26	27	28	29

U	V	W	X	Y	Z	(sp)	$	%	*	+	-	.	/	:
30	31	32	33	34	35	36	37	38	39	40	41	42	43	44
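Since the values in Table 2 are just consecutive numbers, one simple approach is to put all 45 characters in a string in table order and use each character’s position as its value. The sketch below uses that trick to apply the alphanumeric pairing rule; the names ALPHANUMERIC_CHARS and encode_alphanumeric are my own.

```python
# Table 2 in order: a character's position in this string is its value.
ALPHANUMERIC_CHARS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ $%*+-./:"


def encode_alphanumeric(text: str) -> str:
    """Pairs become first * 45 + second in 11 bits; a leftover takes 6 bits."""
    # str.index raises ValueError for characters outside the table.
    values = [ALPHANUMERIC_CHARS.index(ch) for ch in text]
    chunks = []
    for i in range(0, len(values) - 1, 2):
        chunks.append(format(values[i] * 45 + values[i + 1], "011b"))
    if len(values) % 2 == 1:
        chunks.append(format(values[-1], "06b"))
    return "".join(chunks)


print(encode_alphanumeric("ABC"))  # 00111001101001100
```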

After you’ve encoded all your data, if there’s any space left over, you need to add some padding. First, add 0000 to mark the end of the data. (If fewer than four bits of space remain, add only as many 0s as will fit.) Then, if the number of bits in your data isn’t divisible by 8, add 0s until it is. Then alternately add “11101100” and “00010001” until you’ve reached the limit for your version and error correction level (Table 3). If you’ve reached your limit just from your actual data, none of this is necessary.

Table 3 – Maximum bits for data, by version (columns) and error correction level (rows)

	1	2	3	4	5	6	7	8	9	10
L	152	272	440	640	864	1088	1248	1552	1856	2192
M	128	224	352	512	688	864	992	1232	1456	1728
Q	104	176	272	384	496	608	704	880	1056	1232
H	72	128	208	288	368	480	528	688	800	976

	11	12	13	14	15	16	17	18	19	20
L	2592	2960	3424	3688	4184	4712	5176	5768	6360	6888
M	2032	2320	2672	2920	3320	3624	4056	4504	5016	5352
Q	1440	1648	1952	2088	2360	2600	2936	3176	3560	3880
H	1120	1264	1440	1576	1784	2024	2264	2504	2728	3080

	21	22	23	24	25	26	27	28	29	30
L	7456	8048	8752	9392	10208	10960	11744	12248	13048	13880
M	5712	6256	6880	7312	8000	8496	9024	9544	10136	10984
Q	4096	4544	4912	5312	5744	6032	6464	6968	7288	7880
H	3248	3536	3712	4112	4304	4768	5024	5288	5608	5960

	31	32	33	34	35	36	37	38	39	40
L	14744	15640	16568	17528	18448	19472	20528	21616	22496	23648
M	11640	12328	13048	13800	14496	15312	15936	16816	17728	18672
Q	8264	8920	9368	9848	10288	10832	11408	12016	12656	13328
H	6344	6760	7208	7688	7888	8432	8768	9136	9776	10208
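The padding steps can be sketched as a single function. The name pad_to_capacity is hypothetical, and the capacity argument is meant to be a value you look up in Table 3 yourself. The sketch assumes the capacity is a multiple of 8, which is true for every entry in the table.

```python
def pad_to_capacity(bits: str, capacity: int) -> str:
    """Pad a bit string up to `capacity` bits (a Table 3 value)."""
    if len(bits) > capacity:
        raise ValueError("data does not fit in this version and level")
    # Step 1: add up to four 0s, fewer if less space remains.
    bits += "0" * min(4, capacity - len(bits))
    # Step 2: add 0s until the length is divisible by 8.
    bits += "0" * (-len(bits) % 8)
    # Step 3: alternate the two pad bytes until the limit is reached.
    pad_bytes = ("11101100", "00010001")
    index = 0
    while len(bits) < capacity:
        bits += pad_bytes[index % 2]
        index += 1
    return bits


# Version 1 at level L holds 152 bits (Table 3).
padded = pad_to_capacity("010000000101", 152)
print(len(padded))  # 152
```

If the data already fills the symbol exactly, the function returns it unchanged, matching the rule that no padding is needed in that case.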

The error correction is quite complicated, so we’ll cover that next time. For now, I’ll say that QR Codes use the Reed-Solomon error correction algorithm. You can read about it at Wikipedia. After that, I’ll cover the placement order of the modules and the masking system. If you haven’t already, you’ll definitely want to check out the previous parts of this tutorial – Part 1 and Part 2.

About Matcha Design

Matcha Design is a full-service creative B2B agency with decades of experience executing its clients’ visions. The award-winning company specializes in web design, logo design, branding, marketing campaigns, print, UX/UI, video production, commercial photography, advertising, and more. Matcha Design upholds the highest personal standards for excellence and can see things from a unique perspective due to its multicultural background. The company consistently delivers custom, high-quality, innovative solutions to its clients using technical savvy and endless creativity. For more information, visit MatchaDesign.com.
