In 2003, the conventional wisdom was clear: to efficiently compress differences between compiled binaries, you needed a complete decompiler. You had to understand the executable format, parse the instruction set, track relocations, analyze the symbol table. Only then could you identify what actually changed between versions.
Then a PhD student at Oxford published a 4-page paper with a radically different approach. Instead of trying to understand the binary structure, he treated executables as opaque byte streams and applied a surprisingly simple technique: bytewise subtraction of almost matching regions.
The results spoke for themselves. On a FreeBSD security update—97 modified binaries, 36MB of changes—traditional tools produced 3.3MB patches. The commercial tool .RTPatch generated 750KB. This new algorithm? Just 621KB.
That's a 58x compression ratio achieved without understanding a single line of assembly.
The algorithm was called bsdiff, and it fundamentally changed how the industry thought about binary patching. Within a few years it became the foundation for software updates everywhere—from macOS to Chrome to mobile app stores. Today, every time your phone downloads a small security patch instead of re-downloading gigabytes of applications, you're benefiting from this counterintuitive insight.
What makes this algorithm so effective? And why does a relatively simple technique from 2003 still power billions of software updates today? Let's explore how bsdiff works.
The problem with binary patching
Traditional binary patch tools rely on simple COPY and ADD operations—they find matching regions and copy them from the old file, adding new bytes for everything else. This works fine for large changes, but completely falls apart for small security patches. You can avoid some of these by understanding the binary structure and decompiling, but that's not easy.
Here's the problem: when you fix a one-line bug and recompile, that tiny change cascades through the entire binary. Every function after the change shifts to a new address. Every pointer, function call, and jump instruction that references those addresses must update. A one-line source change can balloon into thousands of individual changes in the binary.
Bsdiff's insight
Instead of treating address changes as completely new data, bsdiff does something clever: it computes the bytewise difference between matching regions. Here's what that looks like:
This is the magic of binary subtraction. After computing the difference, you're left with a file that's mostly zeros, with the only non-zero data being the address offset—which is constant throughout the file (in this case, +4 for every address).
Any compression algorithm—whether bzip2 (used in the original paper), zstd, brotli, or lzma—is exceptionally good at compressing files full of zeros and repeated patterns. A traditional COPY/ADD tool would encode all those shifted addresses as new data—dozens of bytes per address change. Bsdiff encodes them as a stream of mostly zeros with a repeating +4 pattern—which compresses down to almost nothing.
That's why a 50KB traditional patch can become a 1KB bsdiff patch. You're not storing the addresses themselves—you're storing the predictable, highly compressible differences.
The algorithm produces three outputs:
- Control file: ADD and INSERT instructions for reconstruction
- Diff file: Bytewise differences in matched regions
- Extra file: Completely new bytes not in the old version
When compressed with bzip2, these three files become extraordinarily small. For security updates—the most common type of software patch—Colin's paper showed bsdiff achieved 58.3x compression. That means a 58MB binary could be updated with just a 1MB patch.
Google's Courgette, which powers Chrome updates, is essentially bsdiff with a preprocessing step: it disassembles the executable to normalize relative addresses before running the core bsdiff algorithm. Same elegant idea, just with a hat on.
Try it yourself
See bsdiff in action with this interactive demo. Enter two versions of text to see how the algorithm breaks down the patch into DIFF sections (matched regions with bytewise differences) and EXTRA sections (completely new data). In DIFF sections, bytes with diff value 0 are unchanged (shown in green), while non-zero values indicate changed bytes (shown in blue).
A bsdiff implementation in C is available at GitHub, and the original paper "Naive differences of executable code" by Colin Percival provides excellent technical details.
bsdiff is a great reminder that sometimes complicated problems warrant simpler solutions.
"I would have written a shorter letter, but I did not have the time." — Blaise Pascal (though often misattributed to Mark Twain—both of whom would appreciate simpler software tools)