|Knuth's 21st Not Christmas Tree Lecture|
|Thursday, 24 December 2015|
This year's annual lecture by Donald Knuth lecture is a keeper. It is on a topic that will be understandable even if you don't know a lot about it - commafree codes.
Donald Knuth should be known to you but just in case - he invented the TeX computer typesetting system, is the author of the monumental work The Art of Computer Programming and of many algorithms. He also has a strange sense of humour so look out for it when you watch the video.
The tradition for the Christmas Tree Lecture that is given at Stanford University in December is that is related to something new about trees learned during the year. For the first time, in the 21st of the series, the talk is nothing to do with trees and instead is about commafree codes:
"A commafree code is a set of codewords that can be read easily without spaces or other delimiters between words.
In 1965, Willard Eastman discovered a beautiful but underappreciated way to construct commafree block codes of all odd lengths, over an infinite alphabet. Professor Knuth will explain his construction and its interesting connection to questions of iteration versus recursion."
Sounds easy or obvious even but a few attempts at creating such a thing will convince you that it is tricky. A more modern term is self-synchronizing code. Basically the idea is that you don't have to send markers within a code to divide up the code words because they can be separated without such markers. That is if you take a set of words and run them together without spaces or punctuation then it should be possible to separate them again if they form a comma free code. The key idea is that it shouldn't be possible to find one of the words as the result of putting two of the words together or within another word.
For example, bear,ring, earring, ear is a very non-comma free code. Putting all the words together
makes it very clear that you cannot uniquely separate them.
picks out a code word that shouldn't be picked out.
You can also add the condition that code words are all of the same length. A binary example of this is the two bit code 00,11. In any sequence of these code words you never get a spurious code word due to the concatenation of 00 or 11 e.g. 00111100000011 can be broken into code words in only one way no matter where you start from.
If you are interested in this topic then the good news is that there is a pdf of the next pre-publication chunk of The Art of Computer Programming on Backtracking and it covers algorithms that apply to comma free codes:
or email your comment to: firstname.lastname@example.org
|Last Updated ( Thursday, 24 December 2015 )|