|Zodiac Killer code cracked? Cryptography at its worst|
|Written by Alex Armstrong|
|Sunday, 31 July 2011|
This is a story about the misuse of cryptography, and it stands as a warning not to believe everything that seems clear-cut from an algorithmic point of view.
Consider the following facts:
In the late 1960s and early 1970s a serial killer code-named Zodiac sent encrypted letters to local newspapers, taking credit for the killings and warning of more to come. The first letter's code was cracked in 1969 by an amateur, and now the most famous of the set, a 340-character letter, has been cracked by another amateur.
The main proof that the message has been cracked is simply that the result is meaningful:
One of the main suspects was called Leigh Allen so it really does all fit.
The proof of decoding is that you have something intelligible, and this is a long-standing principle of cryptography.
So, end of story, the code has been cracked.
Well, no, not really.
The principle used to detect a successful decode depends on the use of a regular algorithm to decode the message. The algorithm can be complex but it has to have a regularity that is controlled by a small number of parameters. If you allow too many degrees of freedom in the decoding algorithm then it can be "arranged" to produce a result that is meaningful but has no connection with the original plain text.
If you look more closely at the decryption process used on the Zodiac letter, it begins to reveal rather too many degrees of freedom. First, the non-alphabetic symbols are converted to the alphabetic characters they most resemble. This part is reasonable. Next, it is assumed that the code is a modified Caesar cipher, i.e. one where each character is mapped to another in the alphabet with a constant shift. For example, a Caesar cipher with a shift of 3 maps A into D.
As most programmers know, the Caesar cipher is easy to crack because it preserves the usual n-gram frequencies of the characters used in a language. If the Zodiac letter were a simple Caesar cipher it would have been decrypted long ago, and this argument has been used to suggest that the current cracking of the code by an amateur must be a hoax.
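As a minimal sketch of the constant-shift idea, the following Python function applies a Caesar shift to the letters A to Z. The name `caesar_shift` and the choice to leave non-letters untouched are assumptions made for illustration; nothing here is taken from the Zodiac analysis itself.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABC...Z"

def caesar_shift(text: str, shift: int) -> str:
    """Shift every letter by the same amount, wrapping round from Z to A.

    Characters outside A-Z are passed through unchanged (an illustrative choice).
    """
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            out.append(ch)
    return "".join(out)

print(caesar_shift("A", 3))    # D
print(caesar_shift("XYZ", 3))  # ABC
```

Decoding is the same operation with the shift negated, so the whole cipher is controlled by a single parameter.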
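To see why a constant shift is so weak, the sketch below checks two things on a made-up message: the sorted letter counts are identical before and after encryption, and there are only 26 candidate decryptions to look through. The helper names and the sample text are hypothetical.

```python
from collections import Counter
import string

ALPHABET = string.ascii_uppercase

def shift_letters(text: str, shift: int) -> str:
    """Shift every A-Z letter by `shift` places; leave other characters alone."""
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + shift) % 26] if ch in ALPHABET else ch
        for ch in text.upper()
    )

def frequency_profile(text: str) -> list:
    """Letter counts sorted from most to least common, ignoring which letter is which."""
    counts = Counter(ch for ch in text.upper() if ch in ALPHABET)
    return sorted(counts.values(), reverse=True)

def all_candidates(ciphertext: str) -> list:
    """Every possible Caesar decryption: one string per shift, 26 in total."""
    return [shift_letters(ciphertext, -shift) for shift in range(26)]

plain = "THIS IS A TEST MESSAGE"   # made-up sample text
cipher = shift_letters(plain, 3)

# The shape of the frequency distribution survives the encryption...
print(frequency_profile(plain) == frequency_profile(cipher))  # True
# ...and the plain text is always one of just 26 candidates.
print(plain in all_candidates(cipher))                        # True
```

A real attack would rank the 26 candidates by how closely they match typical English letter frequencies, but even reading them by eye is feasible.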
However, this reasoning is wrong because the code used in the cracking isn't a simple Caesar cipher but a form of Vigenère cipher. In this case the shift is varied in a regular pattern as specified by a key. For example, if the key is 1234 you would shift the first letter by 1, the second by 2 and so on. When the key is used up you start at the beginning again. This too is easy to crack if the message is long enough, because the key is used repeatedly and this introduces statistical regularities which can be used to find the key. However, if you use a truly random key as long as the message and only use the key once - a so-called one-time pad - then the code is uncrackable. So it all depends on how many degrees of freedom the key is given.
Now consider the way you could decrypt a supposed Vigenère cipher: you can basically pick a shift at each point of the encrypted text to give you any decoded letter you like. You have as many degrees of freedom as there are letters in the message. To be reasonably plausible you have to provide either a short repeating key or a key that has an understandable regularity.
This is what Corey Starliper has done. He reasoned that the key might be related to the number 340, the number of characters in the message, and from this he deduced relationships with telephone area codes. Anyway, for whatever reason, he starts the key with 3 4 and gets KI, so the next two shifts have to be 6 and 3 to give KILL. From there you can repeat 3 4 6 3 4 6 to get KILL SL and so on. Only there is an error in the computation (see An analysis of the crack as a hoax) and the actual key used is 3 4 6 10 3 0 5 and so on. The error destroys what little regularity there is in the proposed key, and after the initial attempt at regularity the key values are changed to complete a word without much additional justification. For example, 3 4 gives HE, so to get an L the next key value is changed to 3 not 6, and to get a P the next value is 4 not 3.
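The repeating-key scheme with the key 1234 can be sketched as follows. The function name `vigenere`, the numeric-list form of the key, and the rule that only letters consume a key position are assumptions made for this example.

```python
import string
from itertools import cycle

ALPHABET = string.ascii_uppercase

def vigenere(text: str, key: list, decrypt: bool = False) -> str:
    """Shift each letter by the next key value, restarting the key when it runs out.

    Only A-Z letters consume a key position; other characters pass through
    unchanged (an illustrative choice).
    """
    sign = -1 if decrypt else 1
    shifts = cycle(key)  # 1, 2, 3, 4, 1, 2, 3, 4, ...
    out = []
    for ch in text.upper():
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + sign * next(shifts)) % 26])
        else:
            out.append(ch)
    return "".join(out)

key = [1, 2, 3, 4]
cipher = vigenere("ATTACK", key)
print(cipher)                               # BVWEDM
print(vigenere(cipher, key, decrypt=True))  # ATTACK
```

With a key of length 4, every fourth letter is a plain Caesar cipher with the same shift, which is the statistical regularity an attacker exploits.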
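The point can be demonstrated directly. In the sketch below, `forced_key` works out the per-letter shifts that turn an arbitrary cipher text into whatever message you want. The cipher text is made up, the helper names are hypothetical, and decoding is assumed to mean shifting forward through the alphabet; the argument is the same for either direction.

```python
import string

ALPHABET = string.ascii_uppercase

def apply_shifts(ciphertext: str, shifts: list) -> str:
    """Decode by shifting each letter forward by its own key value (assumed convention)."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) + s) % 26] for c, s in zip(ciphertext, shifts)
    )

def forced_key(ciphertext: str, wanted: str) -> list:
    """Return the shifts that make `ciphertext` decode to `wanted`, whatever it is."""
    if len(ciphertext) != len(wanted):
        raise ValueError("cipher text and wanted text must be the same length")
    return [
        (ALPHABET.index(w) - ALPHABET.index(c)) % 26
        for c, w in zip(ciphertext, wanted)
    ]

cipher = "QXZJ"  # arbitrary made-up cipher text, letters A-Z only
for wanted in ("KILL", "HELP", "CAKE"):
    key = forced_key(cipher, wanted)
    # Every target "decodes" successfully, so intelligibility alone proves nothing.
    print(wanted, key, apply_shifts(cipher, key) == wanted)
```

A key always exists for any target of the right length, which is why the key itself, not the output, has to carry the burden of proof.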
So the bottom line is not "does the reconstructed plain text make sense?", but "does the key used have a regularity that makes it less than arbitrary?"
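One crude way to put a number on "regularity" is to ask for the shortest repeating period of the key. This is only one possible measure, chosen here for illustration, and `shortest_period` is a hypothetical helper. The second key below is the opening of the sequence quoted above as the one actually used.

```python
def shortest_period(key: list) -> int:
    """Smallest p such that the key is just its first p values repeated.

    Returns len(key) when the key never repeats, and 0 for an empty key.
    """
    for period in range(1, len(key) + 1):
        if all(key[i] == key[i % period] for i in range(len(key))):
            return period
    return 0

print(shortest_period([3, 4, 6, 3, 4, 6]))      # 3
print(shortest_period([3, 4, 6, 10, 3, 0, 5]))  # 7
```

A period as long as the key itself means the key has no more structure, by this measure, than a sequence of shifts picked one letter at a time.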
In this case, despite the number of media reports that say "ZODIAC CODE CRACKED", I think it is very doubtful that there is any meaningful regularity, especially if you take into consideration the errors in reporting the key sequence.
So we can only conclude with:
ZODIAC CODE NOT CRACKED
|Last Updated ( Tuesday, 16 August 2011 )|