LZ Compression And The One-Bit Catastrophe
Written by Mike James   
Monday, 17 July 2017

We use compression algorithms extensively to save both storage and time. The most popular algorithms are based on LZ dictionary compression and are used in GIF, Deflate, Zip, PNG. It was even named an IEEE Milestone. But we know little about them. One long standing question is, can adding a single bit to a file dramatically change the compression achieved? Surely not!


LZ compression, specifically LZ 78, is of both practical and theoretical interest. It is one of a class of dictionary compression methods invented in 1978 by Abraham Lempel and Jacob Ziv.

A dictionary compression method is lossless and works by building up a dictionary of short sequences that can then be used to code the file in question. For example, if you had a file that was composed of nothing but 10,000  "A"s then the single dictionary entry "A" and the coding 10,000 would provide a very effective compression. In the real world, of course, files are made up of complex bit sequences but unless they are random some sequences repeat. The difficulty is finding them.

An optimal dictionary encoding gives an upper bound to the Kolmogorov complexity of a string. The Kolmogorov complexity is just the shortest program that will generate the string and clearly a compression algorithm (plus the dictionary) qualifies as a program and hence any other program that does the job has to be shorter.

With all this in mind, the question arises "can adding a single bit significantly change the compression ratio?". This is sometimes referred to as the "one-bit catastrophe".

"Suppose you compressed a file using your favorite compression algorithm, but you realize there were a typo that makes you add a single bit to the original file. Compress it again and you get a much larger compressed file, for a one-bit difference only between the original files. Most compression algorithms fortunately do not have this strange behaviour; but if your favorite compression algorithm is called LZ’78, one of the most famous and studied of them, then this surprising scenario might well happen. . ."

The question is as much about the nature of order and entropy as it is compression methods. You could rephrase this as "can a single bit significantly change the order in a bit sequence or string?"

The answer, revealed by Guillaume Lagarde and Sylvain Perifel at the Sorbonne, is that for infinite-bit sequences adding a single 0 to the start turns a compressible sequence into an incompressible one.

I guess you could say that a single bit can render order into disorder but that might be a bit theatrical.

For finite words the situation isn't quite as bad as the compressibility of a word with a bit added has to be close to the original word. But there are words that are easily compressible that are converted into words that are much more difficult to compress by adding a single bit.

This theoretical musing does have a practical consequence:

"This “catastrophe” shows that LZ’78 is not robust with respect to the addition or deletion of bits. Since a usual good behaviour of functions used in data representation is a kind of “continuity”, our results show that, in this respect, LZ’78 is not a good choice, as two words that differ in a single bit can have images very far apart."



More Information

Lempel-Ziv: a "one-bit catastrophe" but not a tragedy

Related Articles

Data Compression The Dictionary Way

Kolmogorov Complexity

Information Theory

Non-computable numbers


Axiom Of Choice - The Programmer's Guide 

The Programmer's Guide To The Transfinite

Claude Shannon - Information Theory And More

Fractal Image Compression


To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on Twitter, Facebook or Linkedin.




Mitchell Baker Relinquishes CEO Role Before Mozilaa Announces Layoffs

In an announcement last week Mitchell Baker publicised her decision to step down from the role of CEO of Mozilla Corporation in order devote herself to overseeing Mozilla's broader vision. Now comes n [ ... ]

Love Learning With Udacity

To celebrate Valentine's Day, Udacity has a one-week promotion for new subscribers with 40% of the first month's payment. This seems like a good incentive for trying it out. Here we look at four cours [ ... ]

More News


raspberry pi books



or email your comment to: comments@i-programmer.info


Last Updated ( Monday, 17 July 2017 )