
How QR Codes Store Data: Binary, Modules, and Error Correction Explained

March 25, 2026 · 8 min read

You scan a QR code and your phone opens a URL. Obvious enough. But what is actually happening inside those black and white squares? This article traces the full path — from how computers represent any information as binary, through how that binary gets laid out in a physical grid, to how a damaged or partially obscured code still scans correctly.

Everything is binary — why?

Every piece of information a computer handles — text, images, video, a URL — is ultimately stored and transmitted as a sequence of zeros and ones. This is not an arbitrary choice. It is a consequence of how electronics work.

A transistor, the basic building block of all modern computing hardware, has two stable states: on or off. A voltage is present, or it is not. An electrical signal is high or low. Magnetic polarity points one way or the other. Flash memory cells hold a charge or they do not. In every case: two states. Two states map perfectly to two digits — 0 and 1.

This is the same principle as Morse code — dots and dashes, two symbols, enough to encode the entire alphabet. Binary is Morse code for machines. With enough digits, you can represent any number, any character, any colour, any sound. Two states, infinite expressiveness.

How different technologies represent binary:

| Technology | 0 | 1 |
|---|---|---|
| CPU / RAM | Low voltage (~0 V) | High voltage (~3.3–5 V) |
| Hard disk (HDD) | Magnetic domain facing left | Magnetic domain facing right |
| SSD / Flash | Charge stored in floating gate (programmed) | No charge in floating gate (erased) |
| CD / DVD | No transition (continuous pit or land) | Pit edge (transition) |
| QR code | White module | Black module |

QR codes are on that list. A black square (module) represents a 1. A white square represents a 0. The entire code is nothing more than a two-dimensional array of bits, physically printed.

From text to binary: ASCII and encoding

To convert a number to binary, keep dividing it by 2 and collect the remainders. Read those remainders from bottom to top — that is your binary number. Take 72 (the ASCII code for the letter "H"):

72 → binary, one step at a time:

| Division | Result | Remainder |
|---|---|---|
| 72 ÷ 2 | 36 | 0 |
| 36 ÷ 2 | 18 | 0 |
| 18 ÷ 2 | 9 | 0 |
| 9 ÷ 2 | 4 | 1 |
| 4 ÷ 2 | 2 | 0 |
| 2 ÷ 2 | 1 | 0 |
| 1 ÷ 2 | 0 | 1 ← read from here |

Remainders bottom to top: 1 0 0 1 0 0 0 = 01001000 (padded to 8 bits)
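The repeated-division method is short enough to write out in code. This is a minimal sketch; the function name `to_binary` and its `width` parameter are made up for illustration.

```python
def to_binary(n: int, width: int = 0) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)          # quotient carries on, remainder is one bit
        remainders.append(str(r))
    # Remainders come out least-significant first, so read them in reverse.
    bits = "".join(reversed(remainders)) or "0"
    return bits.rjust(width, "0")    # left-pad with zeros if a width is given


print(to_binary(72))       # 1001000
print(to_binary(72, 8))    # 01001000
```

In everyday Python you would reach for `format(72, "08b")` instead; the loop is only there to mirror the table.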

Every number converts to binary this way. But text requires a further step: agreeing which number each character maps to.

The most influential such convention is ASCII (American Standard Code for Information Interchange), first published in 1963 and extended with lowercase letters in 1967. ASCII assigns a number from 0 to 127 to every character that matters for English: uppercase and lowercase letters, digits, punctuation, and 33 control characters. That number is then stored as a 7-bit or 8-bit binary sequence.

"Hello" encoded in ASCII → binary:

| Character | ASCII (decimal) | Binary (8 bits) |
|---|---|---|
| H | 72 | 01001000 |
| e | 101 | 01100101 |
| l | 108 | 01101100 |
| l | 108 | 01101100 |
| o | 111 | 01101111 |

Full string as a binary stream: 01001000 01100101 01101100 01101100 01101111
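The same conversion can be reproduced in a few lines. A small sketch, with `text_to_bits` as an illustrative helper name rather than anything from a library:

```python
def text_to_bits(text: str) -> str:
    """Return the ASCII bytes of `text` as space-separated 8-bit groups."""
    codes = text.encode("ascii")     # raises UnicodeEncodeError for non-ASCII input
    return " ".join(format(code, "08b") for code in codes)


print(text_to_bits("Hello"))
# 01001000 01100101 01101100 01101100 01101111
```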

ASCII covers English well but not the rest of the world. Modern systems use UTF-8, which extends ASCII: the first 128 code points are identical (so ASCII is valid UTF-8), but UTF-8 can represent over a million characters — every script, emoji, and symbol in active use. A URL encoded in a QR code is almost always UTF-8.
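To see the ASCII compatibility in practice, compare how many bytes a few characters take in UTF-8. This is only an illustration; `utf8_hex` is a hypothetical helper.

```python
def utf8_hex(ch: str) -> str:
    """Return the UTF-8 bytes of a string as space-separated hex pairs."""
    return ch.encode("utf-8").hex(" ")


for ch in ["H", "é", "€", "\U0001F389"]:   # the last one is the party popper emoji
    print(repr(ch), len(ch.encode("utf-8")), "byte(s):", utf8_hex(ch))

# ASCII characters keep their single-byte ASCII value in UTF-8.
print("H".encode("utf-8") == "H".encode("ascii"))   # True
```

An ASCII letter stays at one byte, while accented letters, currency symbols and emoji take two, three and four bytes. This is why non-ASCII text fills a QR code faster than plain English.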

For numeric-only content (a phone number, a product ID) QR codes use a compact "numeric mode" that packs 3 decimal digits into 10 bits instead of the 24 bits they would need as three 8-bit characters, significantly reducing the size of the resulting code. For URLs and general text, "byte mode" is standard, and most generators and scanners treat those bytes as UTF-8.
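Here is a rough sketch of how numeric mode packs digits. It covers only the digit payload: a real QR encoder also prepends a mode indicator and a character count, which are left out here. The function name `encode_numeric` is an assumption for illustration. Leftover groups of two digits take 7 bits and a single leftover digit takes 4 bits.

```python
# Bits used per group, keyed by the number of digits in the group.
GROUP_BITS = {3: 10, 2: 7, 1: 4}


def encode_numeric(digits: str) -> list[str]:
    """Split a digit string into groups of three and encode each group in binary."""
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError("numeric mode only accepts the digits 0-9")
    groups = []
    for i in range(0, len(digits), 3):
        chunk = digits[i:i + 3]
        groups.append(format(int(chunk), f"0{GROUP_BITS[len(chunk)]}b"))
    return groups


payload = encode_numeric("01234567")
print(payload)                              # ['0000001100', '0101011001', '1000011']
print(sum(len(g) for g in payload), "bits vs", 8 * len("01234567"), "bits in byte mode")
```

Eight digits fit in 27 bits here, against 64 bits as one byte per character.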

A QR code is a physical binary storage format

Once your data exists as a stream of bits, it needs to be stored somewhere. A hard drive stores bits as magnetic polarity. A CD stores them as microscopic pits. A QR code stores them as a printed grid of dark and light squares — modules.

This was the insight behind Denso Wave's 1994 design. Standard barcodes are one-dimensional: they encode bits in a single row of lines, which limits capacity. A two-dimensional grid uses both width and height, so the same physical area holds dramatically more data. A version 1 QR code (21×21 modules) stores up to 41 digits or 25 alphanumeric characters. A version 40 QR code (177×177 modules) stores up to 7,089 digits or 4,296 alphanumeric characters. Both sets of figures assume the lowest error correction level.

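Maximum capacity by mode (version 40, lowest error correction level):

| Mode | Maximum capacity | Typical content |
|---|---|---|
| Numeric only | 7,089 digits | Phone numbers, IDs |
| Alphanumeric | 4,296 characters | URLs, codes (uppercase) |
| Binary / UTF-8 | 2,953 bytes | Full URLs, text, emojis |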

In practice, most QR codes hold a URL of 20–80 characters. A version 3 or 4 code (29–33 modules per side) is usually sufficient, producing the compact, easy-to-scan pattern you see on business cards and restaurant tables.
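The module counts quoted above follow a simple rule: each version adds 4 modules per side, starting from 21 at version 1. A minimal sketch (the function name is illustrative):

```python
def modules_per_side(version: int) -> int:
    """Side length, in modules, of a QR code of the given version (1-40)."""
    if not 1 <= version <= 40:
        raise ValueError("QR code versions run from 1 to 40")
    return 17 + 4 * version


for v in (1, 3, 4, 40):
    side = modules_per_side(v)
    print(f"version {v}: {side}x{side} modules")
```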

The seven zones of a QR code

Not all modules in a QR code carry your data. The grid is divided into functional regions, each with a specific role. Here is what each zone does:

1. Finder patterns

The three large squares in the top-left, top-right, and bottom-left corners. Their job is purely positional: they tell the scanner where the code starts, which way is up, and the overall tilt/skew of the image. No data is stored here.

2. Separator

A one-module-wide white border around each finder pattern. Creates a clear boundary so the scanner does not accidentally read the finder as data.

3. Timing patterns

Alternating black–white–black–white stripes running horizontally and vertically between the finder patterns. They establish the module grid coordinates: the scanner uses them to calculate exactly where each module sits, even on a distorted or curved surface.

4. Alignment patterns

Smaller square patterns that appear in larger QR codes (version 2 and above). They help correct for perspective distortion: if the code is photographed at an angle, alignment patterns let the decoder remap the grid accurately.

5. Format information

Two strips of 15 bits each adjacent to the finder patterns. They encode the error correction level (L / M / Q / H) and the mask pattern used on the data. Stored twice for redundancy.

6. Version information

Present in version 7 codes and above. Encodes the version number (7–40) as an 18-bit sequence so the scanner knows the grid dimensions before it starts reading data.

7. Data and error correction codewords

Everything else: the rest of the grid. This region holds your actual encoded content plus the Reed-Solomon error correction codewords. The data is written in a zigzag pattern from the bottom-right corner upward, skipping all the reserved zones above.

Simplified zone map (version 1, 21×21):

[Figure: colour-coded zone map of a version 1 QR code. Legend: Finder, Timing, Format, Data + ECC. The actual pattern varies by content.]
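The version 1 layout can be rebuilt as a text grid, which makes it easy to count how many modules are left for data. This is a sketch for version 1 only: it ignores alignment patterns and version information, which version 1 does not have. The letter tags and the function name are chosen for illustration, and the single always-dark module next to the bottom-left finder is marked separately as `X`.

```python
from collections import Counter

SIZE = 21  # version 1


def build_zone_map() -> list[list[str]]:
    """Label every module of a version 1 QR code with its zone."""
    grid = [["D"] * SIZE for _ in range(SIZE)]   # D = data + error correction

    def fill(top, left, height, width, tag):
        for r in range(top, top + height):
            for c in range(left, left + width):
                grid[r][c] = tag

    # Each corner: an 8x8 separator block (S) with a 7x7 finder (F) on top.
    corners = [((0, 0), (0, 0)),
               ((0, SIZE - 8), (0, SIZE - 7)),
               ((SIZE - 8, 0), (SIZE - 7, 0))]
    for block, finder in corners:
        fill(*block, 8, 8, "S")
        fill(*finder, 7, 7, "F")

    # Timing patterns (T): row 6 and column 6, between the finder blocks.
    for i in range(8, SIZE - 8):
        grid[6][i] = "T"
        grid[i][6] = "T"

    # Format information (M): two 15-module copies beside the finder blocks.
    cells = [(8, c) for c in range(9)] + [(r, 8) for r in range(8)]
    cells += [(8, c) for c in range(SIZE - 8, SIZE)]
    cells += [(r, 8) for r in range(SIZE - 7, SIZE)]
    for r, c in cells:
        if grid[r][c] == "D":        # do not overwrite timing modules
            grid[r][c] = "M"

    grid[SIZE - 8][8] = "X"          # the single always-dark module
    return grid


zone_map = build_zone_map()
for row in zone_map:
    print("".join(row))
print(Counter(cell for row in zone_map for cell in row))
```

Of the 441 modules, 208 end up tagged `D`. That is 26 codewords of 8 bits each, shared between your content and the error correction.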

Error correction: how damaged codes still scan

QR codes use Reed-Solomon error correction — the same algorithm used in CDs, DVDs, and deep-space communications. The algorithm adds redundant codewords alongside the data. If some modules are missing or unreadable, the decoder uses the redundant data to reconstruct the original.

There are four error correction levels, each trading capacity for resilience:

| Level | Name | Data recovery | Best use case |
|---|---|---|---|
| L | Low | Up to 7% | Clean environments, digital displays |
| M | Medium | Up to 15% | General purpose (default in most tools) |
| Q | Quartile | Up to 25% | Printed materials, light wear expected |
| H | High | Up to 30% | Logos, outdoor use, heavy wear |

Higher error correction = more redundant codewords = less room for actual data = more modules needed = bigger, denser code for the same content. This is the core trade-off.
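The trade-off is easy to see on a version 1 code, which always has 26 codewords in total. The sketch below uses the per-level figures for version 1 as listed in the QR code specification (ISO/IEC 18004): the number of data codewords and the number of damaged codewords Reed-Solomon can repair. The dictionary and function names are illustrative.

```python
TOTAL_CODEWORDS = 26  # version 1, regardless of error correction level

# level: (data codewords, codewords that can be corrected)
VERSION_1_LEVELS = {
    "L": (19, 2),
    "M": (16, 4),
    "Q": (13, 6),
    "H": (9, 8),
}


def summarize(level: str) -> dict:
    """Show how one level splits the 26 codewords of a version 1 code."""
    data, correctable = VERSION_1_LEVELS[level]
    return {
        "data_codewords": data,
        "ec_codewords": TOTAL_CODEWORDS - data,
        "recoverable_pct": round(100 * correctable / TOTAL_CODEWORDS),
    }


for level in VERSION_1_LEVELS:
    print(level, summarize(level))
```

Moving from L to H cuts the room for data from 19 codewords to 9. The computed recovery percentages land close to the nominal 7 / 15 / 25 / 30% figures, which are rounded guidelines rather than exact values for every version.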

When you add a logo to a QR code, you are intentionally obscuring a portion of the data region. This works because the error correction fills in the missing bits. Level H — 30% recovery — is recommended when adding a logo. If the logo covers more than ~30% of the data region, the code will fail to scan no matter how good the error correction is.

What this means when you create a QR code

Understanding the structure has direct practical consequences:

Shorter URLs scan more reliably

Less data = fewer modules needed = lower version = larger, easier-to-read modules at any given print size. A URL shortener or dynamic QR code (which encodes a short redirect) produces a cleaner code than encoding a long URL directly.

Adding a logo requires high error correction

Set error correction to H when adding a logo. In AddToQR's designer, this is always the default when a logo is present. Keep the logo under 30% of the total code area and centred — the finder patterns in the corners must stay fully visible.

Colour choices affect contrast — not encoding

The scanner reads contrast, not colour. Black on white is easiest. Dark modules on a light background work fine. Light modules on a dark background often fail, because many scanners expect dark-on-light. The finder patterns especially need strong contrast to be detected.

Dynamic codes are smaller for the same destination

A dynamic QR code encodes a short redirect URL (e.g. addtoqr.com/r/abc123) rather than the full destination. That short URL needs far fewer modules, resulting in a simpler pattern that scans faster and tolerates more damage.
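To put rough numbers on the short-versus-long URL point, the sketch below picks the smallest version that fits a URL in byte mode at error correction level M, then works out how wide each module is at a given print size. The capacity table covers versions 1 to 6 only. The helper names, the 30 mm print width and the long example URL are assumptions for illustration, and a real generator may pick a different mode or level. The 4-module quiet zone on each side is the margin the specification asks for.

```python
# Byte-mode capacity at error correction level M, versions 1-6.
BYTE_CAPACITY_M = {1: 14, 2: 26, 3: 42, 4: 62, 5: 84, 6: 106}
QUIET_ZONE = 4  # blank modules required on each side of the code


def smallest_version(text: str) -> int:
    """Smallest version (1-6) whose level-M byte capacity fits `text`."""
    size = len(text.encode("utf-8"))
    for version, capacity in BYTE_CAPACITY_M.items():
        if size <= capacity:
            return version
    raise ValueError("text needs a version above 6; extend the table")


def module_size_mm(text: str, print_width_mm: float) -> float:
    """Width of one module when the code plus quiet zone fills the print width."""
    modules = 17 + 4 * smallest_version(text) + 2 * QUIET_ZONE
    return print_width_mm / modules


short_url = "addtoqr.com/r/abc123"
long_url = "https://example.com/" + "x" * 60   # hypothetical 80-character URL

for url in (short_url, long_url):
    print(len(url), "chars -> version", smallest_version(url),
          f"-> {module_size_mm(url, 30):.2f} mm per module at 30 mm wide")
```

On a 30 mm print, the short redirect gets modules roughly a third wider than the 80-character URL does, which gives a phone camera more to work with.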

Try it yourself

Create a QR code in AddToQR's designer — choose your error correction level, add a logo, and preview scanability before downloading.