Encoding 2026-06-09 6 min read

Base64, Properly Explained

Base64 is older than HTTP. It first appears in 1989 as the heart of Privacy-Enhanced Mail (RFC 1113), a defunct attempt to add cryptography to email, and was canonized in MIME (RFC 1341) in 1992. The reason it exists is much narrower than what people use it for today: SMTP was designed for 7-bit ASCII text, and anything else (binary, 8-bit characters, attachments) had to be smuggled through that 7-bit pipe somehow. Base64 was the smuggling technique that won.

Most modern uses of Base64 inherit that 1989 lineage. Embedding a PNG in a CSS file, a public key in a JWT header, a webhook payload in a JSON string — all of them are doing the same thing 1989's email was doing: forcing binary through a transport that only understands text.

What it actually is

Base64 takes 3 bytes (24 bits) and re-expresses them as 4 characters from a 64-symbol alphabet: A–Z, a–z, 0–9, plus two extras. Twenty-four bits split into four groups of six bits, and six bits can take 2^6 = 64 values, which is where the 64 in the name comes from. The output is about 33% larger than the input (4 characters for every 3 bytes), which is the price of using only printable ASCII to represent arbitrary bytes.

If your input length isn't a multiple of 3 bytes, the encoder pads the last group with one or two = characters so output length is always a multiple of 4. That's it. Nothing more clever is happening.
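
To make the grouping and padding concrete, here is a minimal sketch of an encoder written by hand. The names ALPHABET and encode_base64 are made up for this illustration; in real code you would call the standard library's base64.b64encode, which the sketch is checked against.

```python
import base64

# Standard alphabet from RFC 4648 §4: indices 0..63.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_base64(data: bytes) -> str:
    """Toy encoder (hypothetical helper), for illustration only."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        missing = 3 - len(chunk)  # 0, 1, or 2 bytes short in the last group
        # Pack the group into a 24-bit integer, zero-filling a short tail.
        n = int.from_bytes(chunk + b"\x00" * missing, "big")
        # Slice it into four 6-bit indices, most significant first.
        chars = [ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # Characters that carry no input bits become '=' padding.
        if missing:
            chars[4 - missing:] = "=" * missing
        out.append("".join(chars))
    return "".join(out)


print(encode_base64(b"hello"))  # aGVsbG8=
assert encode_base64(b"hello") == base64.b64encode(b"hello").decode("ascii")
```

Five input bytes become one full group plus a two-byte tail, so the output is eight characters ending in a single =.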

This is why people calling Base64 "encryption" are wrong in a way that should be embarrassing. Base64 is a syntactic transformation with a public, fixed alphabet. Anyone reading aGVsbG8= can decode it in their head with practice, and any decoder does it in microseconds. It hides nothing.

The two alphabets

Standard Base64 (RFC 4648 §4) uses + and / as the two non-alphanumeric symbols and = for padding. This is fine for email and most data formats, but +, /, and = all have special meaning in URLs — + is sometimes interpreted as a space, / is a path separator, = is reserved in query strings. Putting standard Base64 in a URL works only if you're disciplined about percent-encoding it first, and most people aren't.

The same RFC defines Base64url (§5), which swaps + → - and / → _, and lets you omit the padding. JWTs, WebAuthn, OAuth state tokens, and most modern APIs use Base64url for exactly this reason. The two alphabets are not interoperable. Decoding a Base64url string with a standard Base64 decoder produces a hard error at best and silent garbage at worst, depending on how strict the implementation is about its alphabet.

If you can't decode something that looks like Base64, the first thing to try is swapping alphabets.
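
A small sketch of the difference, using Python's standard base64 module. The two input bytes are picked only because they land on alphabet positions 62 and 63; decodes_as_standard is a hypothetical helper name.

```python
import base64
import binascii

raw = b"\xfb\xff"  # chosen so the output uses symbols 62 and 63

standard = base64.b64encode(raw).decode("ascii")
url_safe = base64.urlsafe_b64encode(raw).decode("ascii")
print(standard)  # +/8=
print(url_safe)  # -_8=


def decodes_as_standard(text: str) -> bool:
    """Return True if a strict standard-alphabet decoder accepts the text."""
    try:
        base64.b64decode(text, validate=True)
        return True
    except binascii.Error:
        return False


print(decodes_as_standard(standard))  # True
print(decodes_as_standard(url_safe))  # False: '-' and '_' are not in the alphabet

# The matching decoder recovers the original bytes.
assert base64.urlsafe_b64decode(url_safe) == raw
```

Note that validate=True matters here: without it, Python's decoder discards characters outside the alphabet instead of rejecting them, which is the lenient behaviour that produces garbage.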

Where it's misused

The pattern I see most often: someone wants to "send a JSON object" inside another JSON object, or inside a header, or inside a URL parameter. They Base64-encode the JSON to dodge escaping issues. This works, but it's almost always the wrong choice — the receiver now has to Base64-decode and JSON-parse separately, errors are buried one layer deeper, and the payload is 33% larger for no good reason. JSON inside JSON should be a JSON string with proper escaping; a header should be a structured field; a URL parameter should be percent-encoded.

The exception is when the data really is binary — public keys, certificates, image bytes, encrypted blobs. There Base64 is the correct tool. For genuinely textual data inside textual transports, it's overhead masquerading as a fix.
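
As a rough illustration of the two cases, assuming Python's json and base64 modules: the field names and values below are invented for the example.

```python
import base64
import json

inner = {"event": "invoice.paid", "note": 'customer said "thanks"'}

# Textual data: embed it as a JSON string and let the encoder do the escaping.
outer_text = json.dumps({"payload": json.dumps(inner)})
assert json.loads(json.loads(outer_text)["payload"]) == inner

# Genuinely binary data: this is where Base64 is the right tool.
key_bytes = bytes(range(16))  # stand-in for a real key, certificate, or image
outer_binary = json.dumps({"key": base64.b64encode(key_bytes).decode("ascii")})
assert base64.b64decode(json.loads(outer_binary)["key"]) == key_bytes
```

In the first case the quotes inside the note survive because the JSON encoder escapes them; no Base64 layer is needed to get them through.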

The other misuse is data URIs. Inlining a 200KB image as data:image/png;base64,... in HTML or CSS turns 200KB into roughly 267KB, and the image can no longer be cached on its own: it is downloaded again with every copy of the HTML or CSS that embeds it. There are reasons to do it for very small icons where saving the HTTP request beats the bloat, but the threshold is around 1–4KB, not 200KB.
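
The size penalty is easy to compute. A minimal sketch, assuming padded output with no line breaks and taking 200KB as 200,000 bytes; base64_length is a hypothetical helper name.

```python
import base64


def base64_length(n_bytes: int) -> int:
    """Length of padded Base64 output: 4 characters per 3-byte group, rounded up."""
    return 4 * ((n_bytes + 2) // 3)


print(base64_length(200_000))  # 266668
print(base64_length(2_000))    # 2668

# Cross-check against the real encoder for a range of sizes.
for n in range(0, 50):
    assert base64_length(n) == len(base64.b64encode(bytes(n)))
```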

How decoders fail

Real-world Base64 strings contain a surprising amount of off-spec garbage:

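Whitespace. RFC 4648 says decoders must reject characters outside the alphabet unless the specification that references it says otherwise; MIME (RFC 2045) does say otherwise and tells decoders to ignore line breaks. In practice many decoders skip whitespace, but some don't. If you've ever seen a Base64 string break decoding when copy-pasted from an email, this is why: MIME wraps encoded lines at 76 characters and inserts CRLF. PEM files wrap on purpose too, at 64 characters.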
Wrong alphabet. Standard vs. URL-safe, as above. The two are silently incompatible.
Missing or extra padding. Strict decoders reject aGVsbG8 (no padding). Lenient ones accept it. Base64url often omits padding entirely, so a Base64url decoder that requires padding is broken.
Non-canonical encoding. When the input length isn't a multiple of 3, the trailing group has unused bits. RFC 4648 says encoders must set those bits to zero; not all do, and most decoders ignore them. Two different Base64 strings can therefore decode to the same bytes. Security-sensitive code (signed JWTs, deduplication keys) should reject non-canonical encodings to prevent malleability attacks.
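
One simple way to enforce canonical form is to decode, re-encode, and compare. The sketch below assumes padded standard-alphabet input; decode_canonical is a hypothetical helper name, not a library function.

```python
import base64


def decode_canonical(text: str) -> bytes:
    """Decode standard Base64, rejecting anything a conforming encoder would not emit."""
    raw = base64.b64decode(text, validate=True)
    if base64.b64encode(raw).decode("ascii") != text:
        raise ValueError("non-canonical Base64")
    return raw


# 'aGVsbG9=' differs from 'aGVsbG8=' only in the unused trailing bits.
assert base64.b64decode("aGVsbG8=") == b"hello"
assert base64.b64decode("aGVsbG9=") == b"hello"  # lenient decoder: same bytes

assert decode_canonical("aGVsbG8=") == b"hello"
try:
    decode_canonical("aGVsbG9=")
except ValueError as exc:
    print("rejected:", exc)
```

The round-trip costs one extra encode, which is usually negligible next to whatever signature check or lookup follows.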

If a Base64 string won't decode, run it through tr -d '[:space:]', swap -_ for +/, and add = padding to the next multiple of 4. That fixes the majority of real-world breakage.
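
The same three repairs as a Python sketch, for when the string is already inside a program. lenient_b64decode is a hypothetical helper name, and the function deliberately accepts both alphabets, so it is for debugging rather than for anything security-sensitive.

```python
import base64


def lenient_b64decode(text: str) -> bytes:
    """Best-effort decode: strip whitespace, normalize the alphabet, fix padding."""
    cleaned = "".join(text.split())                          # drop spaces, CR, LF, tabs
    cleaned = cleaned.translate(str.maketrans("-_", "+/"))   # Base64url -> standard
    cleaned = cleaned.rstrip("=")                            # discard whatever padding came in
    cleaned += "=" * (-len(cleaned) % 4)                     # pad to a multiple of 4
    # Still raises binascii.Error if what remains is not valid Base64.
    return base64.b64decode(cleaned, validate=True)


print(lenient_b64decode("aGVs\r\nbG8"))  # b'hello'
print(lenient_b64decode("-_8"))          # b'\xfb\xff'
```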

What it doesn't do

It doesn't encrypt. (Worth saying twice.)
It doesn't compress. The output is always about 33% larger, plus any padding and line breaks.
It doesn't fix character-encoding problems. If you have UTF-8 bytes and you Base64-encode them, you have a Base64 string of UTF-8 bytes; the consumer still has to UTF-8-decode after Base64-decoding.
It doesn't make binary "safe to print." A Base64 string is safe to put in 7-bit ASCII transports and most text fields, but +, /, and = will still trip up shell quoting, SQL identifiers, and URL parsers.
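
The character-encoding point is the one that bites most often, so here is a short sketch of the two layers. The sample text is arbitrary.

```python
import base64

text = "naïve ☕"

# Sender: characters -> UTF-8 bytes -> Base64 characters.
encoded = base64.b64encode(text.encode("utf-8")).decode("ascii")

# Receiver: Base64 gives back bytes, not text.
raw = base64.b64decode(encoded)
assert isinstance(raw, bytes)

# The bytes only become the original text under the right charset.
assert raw.decode("utf-8") == text
assert raw.decode("latin-1") != text  # wrong charset: mojibake, no error raised
```

Base64 round-trips the bytes perfectly in both cases; the garbling in the last line comes entirely from guessing the wrong character encoding afterwards.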

Practical rules

For URLs, headers, filenames, and anything path-shaped: Base64url.
For email, MIME, certificates, and PEM files: standard Base64.
Don't Base64-wrap textual data that's already going through a textual transport. Solve the escaping problem at the right layer.
Don't inline anything bigger than a few KB as a data URI.
If a decode fails, try the other alphabet before assuming the data is corrupt.
Never trust Base64 to hide anything. It's a transport convention, not a secret.
