
How Base64 Encoding Works

A step-by-step breakdown of the Base64 algorithm: how binary data becomes printable text, why padding exists, and what those trailing equals signs actually mean.

Micru 5 min read

Base64 is everywhere in web development, but most developers use it without ever stopping to understand how it works. This article walks through the algorithm from first principles. By the end, the output of a Base64 encoder will make complete sense.

If you want to follow along interactively, our Base64 encoder at base64.micru.org lets you encode and decode text in real time.

The Problem Base64 Solves

Computers store everything as binary data: sequences of bytes, each with a value from 0 to 255. Only 95 of those values (32 through 126) map to printable ASCII characters. The rest are either control codes that have special meaning in text protocols, or values above 127 that ASCII does not define at all. A null byte (value 0) terminates a string in C. A line feed (value 10) ends a line on Unix-like systems. A carriage return (value 13) followed by a line feed forms the line ending on Windows and in protocols such as HTTP and SMTP.

When you need to send arbitrary binary data through a system that was designed for text, you have a problem. The system might misinterpret your data as control characters and corrupt it.

Base64 solves this by converting binary data into a restricted set of 64 characters that are safe in nearly every text-based system: letters (A-Z, a-z), digits (0-9), plus (+), and slash (/). No control characters, no ambiguity.

The Alphabet

Every Base64 character represents a number from 0 to 63. The mapping is:

  • 0-25: uppercase letters A through Z
  • 26-51: lowercase letters a through z
  • 52-61: digits 0 through 9
  • 62: plus sign (+)
  • 63: forward slash (/)

A 65th character, the equals sign (=), is not part of the alphabet and carries no value. It is used only for padding. More on that shortly.
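The mapping above can be written down as a single string whose index positions are the 6-bit values. This is a minimal sketch in Python; the name ALPHABET is just an illustrative choice, not something a particular library exposes.

```python
import string

# Standard Base64 alphabet: a character's position in this string
# is the 6-bit value it represents (0-63).
ALPHABET = (
    string.ascii_uppercase   # values 0-25
    + string.ascii_lowercase # values 26-51
    + string.digits          # values 52-61
    + "+/"                   # values 62 and 63
)

print(len(ALPHABET))  # 64
print(ALPHABET[0], ALPHABET[26], ALPHABET[52], ALPHABET[62], ALPHABET[63])  # A a 0 + /
```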

Encoding Step by Step

The core idea is to regroup bits. A standard byte is 8 bits. Base64 uses 6-bit groups instead, because 2^6 = 64, which is exactly the size of the alphabet.

The algorithm works in chunks of 3 bytes (24 bits) at a time, because 3 bytes and 4 Base64 characters both consume exactly 24 bits:

  • 3 bytes = 3 x 8 bits = 24 bits
  • 4 Base64 characters = 4 x 6 bits = 24 bits

A concrete example: encoding the word “Man”

The three characters M, a, and n have ASCII values 77, 97, and 110. In binary:

M = 01001101
a = 01100001
n = 01101110

Concatenated into 24 bits:

010011010110000101101110

Split into four 6-bit groups:

010011  010110  000101  101110

Convert each group to its decimal value:

19  22  5  46

Look up each value in the Base64 alphabet:

T  W  F  u

“Man” encodes to TWFu. You can verify this yourself at base64.micru.org.
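The same steps can be sketched in a few lines of Python. The function name encode_chunk and the ALPHABET constant are illustrative assumptions, and the sketch only handles a complete 3-byte chunk, so padding is deliberately left out here.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode_chunk(chunk: bytes) -> str:
    """Encode exactly 3 bytes as 4 Base64 characters (no padding handled)."""
    if len(chunk) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Concatenate the three bytes into one 24-bit integer.
    bits = (chunk[0] << 16) | (chunk[1] << 8) | chunk[2]
    # Split into four 6-bit groups, most significant group first.
    indices = [(bits >> shift) & 0b111111 for shift in (18, 12, 6, 0)]
    # Look up each 6-bit value in the alphabet.
    return "".join(ALPHABET[i] for i in indices)

print(encode_chunk(b"Man"))  # TWFu
```

Packing the bytes into one integer and shifting is only one way to regroup the bits. Working on a string of "0" and "1" characters, as in the walkthrough, gives the same result and may be easier to follow.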

Decoding

Decoding is the exact reverse. Each Base64 character is looked up in the alphabet to get its 6-bit value. Four characters produce four 6-bit groups, which are concatenated into 24 bits, then split back into three 8-bit bytes.
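As a rough illustration, the reverse direction might look like the following. The names decode_chunk and VALUES are hypothetical, and the sketch assumes a complete 4-character group with no = padding.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"
# Reverse lookup table: character -> 6-bit value.
VALUES = {char: value for value, char in enumerate(ALPHABET)}

def decode_chunk(chars: str) -> bytes:
    """Decode exactly 4 Base64 characters into 3 bytes (no padding handled)."""
    if len(chars) != 4:
        raise ValueError("expected exactly 4 characters")
    bits = 0
    for char in chars:
        # Shift the accumulated bits left and append the next 6-bit group.
        bits = (bits << 6) | VALUES[char]
    # 24 bits split back into three 8-bit bytes.
    return bits.to_bytes(3, "big")

print(decode_chunk("TWFu"))  # b'Man'
```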

Why the Equals Signs?

Input data is not always a multiple of 3 bytes. When it isn’t, the algorithm needs padding to keep the output aligned to 4-character blocks.

If the input has one leftover byte (1 byte remaining):

One byte is 8 bits. To make it divisible into 6-bit groups, four zero bits are appended, producing 12 bits. These split into two 6-bit groups: the first holds the top 6 bits of the byte, and the second holds its remaining 2 bits followed by the 4 zero bits. The result is two Base64 characters followed by two = padding characters. For example, “M” encodes to TQ==.

If the input has two leftover bytes (2 bytes remaining):

Two bytes are 16 bits. Appending two zero bits gives 18 bits, which divides into three 6-bit groups. The result is three Base64 characters followed by one = padding character.

If the input is an exact multiple of 3 bytes:

No padding is needed.

The equals signs are not data. They are a signal to the decoder about how many bytes are in the final chunk.
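The three cases can be folded into one small encoder. This is a sketch for illustration only (real code would normally call a library routine), and the names encode and ALPHABET are assumptions of the example.

```python
import string

ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"

def encode(data: bytes) -> str:
    """Base64-encode arbitrary bytes, padding the final block with '='."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short of a full chunk
        # Fill the short chunk with zero bytes so it is 24 bits wide.
        bits = int.from_bytes(chunk + b"\x00" * missing, "big")
        chars = [ALPHABET[(bits >> shift) & 0b111111] for shift in (18, 12, 6, 0)]
        if missing:
            # Characters built purely from filler bits carry no data,
            # so they are replaced with the padding character.
            chars[-missing:] = "=" * missing
        out.append("".join(chars))
    return "".join(out)

print(encode(b"M"))    # TQ==
print(encode(b"Ma"))   # TWE=
print(encode(b"Man"))  # TWFu
```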

The Size Trade-off

Base64 encoding increases the size of your data. Every 3 bytes of input become 4 characters of output, which is roughly a 33% overhead (slightly more for inputs that need padding). For large binary files this is significant. For short strings, configuration values, or small images embedded in a stylesheet, it is usually acceptable.
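The output length follows directly from the 3-to-4 ratio: with padding, n input bytes produce 4 characters for every started group of 3 bytes. A quick sketch, where encoded_length is a hypothetical helper name:

```python
def encoded_length(n: int) -> int:
    """Length of padded Base64 output for n input bytes: 4 * ceil(n / 3)."""
    return 4 * ((n + 2) // 3)

for n in (1, 2, 3, 4, 1_000_000):
    print(n, "->", encoded_length(n))
# 1 -> 4
# 2 -> 4
# 3 -> 4
# 4 -> 8
# 1000000 -> 1333336
```

This ignores any line breaks that some formats, such as MIME, insert into the encoded text.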

URL-Safe Base64

Standard Base64 uses + and /, both of which have special meaning in URLs. A URL-safe variant replaces + with - and / with _, producing output that can be safely included in a URL or filename without percent-encoding. Because = can also need escaping in a URL, this variant is often written without the trailing padding. You will often see it in JWT tokens, which omit the padding, and in OAuth credentials.
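Python's standard base64 module provides both variants, which makes the difference easy to see. The three input bytes below are chosen only because they happen to map to the values 62 and 63.

```python
import base64

# 0xFB 0xFF 0xBF splits into the 6-bit values 62, 63, 62, 63.
data = bytes([0xFB, 0xFF, 0xBF])

standard = base64.b64encode(data)
url_safe = base64.urlsafe_b64encode(data)

print(standard)  # b'+/+/'
print(url_safe)  # b'-_-_'

# Both forms decode back to the same bytes.
print(base64.urlsafe_b64decode(url_safe) == data)  # True
```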

Where You Will See Base64

  • JWT tokens - the header and payload sections are Base64url-encoded JSON objects
  • Data URIs - <img src="data:image/png;base64,..."> embeds images directly in HTML
  • Email attachments - MIME encoding uses Base64 to send binary files through email servers
  • HTTP Basic Auth - credentials are Base64-encoded before being placed in the Authorization header
  • Encoded config values - Kubernetes secrets, environment variables, and CI/CD pipelines often store binary values as Base64 strings
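To make one of these concrete, here is a sketch of how an HTTP Basic Auth header value could be assembled. The function name basic_auth_header is hypothetical, and the credentials are the familiar example pair used in the HTTP Basic Auth specification.

```python
import base64

def basic_auth_header(user: str, password: str) -> str:
    """Build the value of an Authorization header for HTTP Basic Auth."""
    # The credentials are joined with a colon, then Base64-encoded.
    raw = f"{user}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return f"Basic {token}"

print(basic_auth_header("Aladdin", "open sesame"))
# Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
```

Anyone who sees the header can reverse it with a Base64 decoder, so the encoding here is about transport safety, not secrecy.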

Try It Yourself

Now that you understand the algorithm, paste some text into base64.micru.org and watch the output. Try single characters, multi-byte UTF-8 strings, or upload an image and inspect the resulting data URI. The patterns described above will be visible in every result.