Knuth's 21st Not Christmas Tree Lecture
Thursday, 24 December 2015

This year's annual lecture by Donald Knuth lecture is a keeper. It is on a topic that will be understandable even if you don't know a lot about it - commafree codes. 

Donald Knuth should be known to you but just in case - he invented the TeX computer typesetting system, is the author of the monumental work The Art of Computer Programming and of many algorithms. He also has a strange sense of humour so look out for it when you watch the video.  

The tradition for the Christmas Tree Lecture that is given at Stanford University in December is that is related to something new about trees learned during the year. For the first time, in the 21st of the series, the talk is nothing to do with trees and instead is about commafree codes:

"A commafree code is a set of codewords that can be read easily without spaces or other delimiters between words.

In 1965, Willard Eastman discovered a beautiful but underappreciated way to construct commafree block codes of all odd lengths, over an infinite alphabet. Professor Knuth will explain his construction and its interesting connection to questions of iteration versus recursion."

Sounds easy or obvious even but a few attempts at creating such a thing will convince you that it is tricky. A more modern term is self-synchronizing code. Basically the idea is that you don't have to send markers within a code to divide up the code words because they can be separated without such markers. That is if you take a set of words and run them together without spaces or punctuation then it should be possible to separate them again if they form a comma free code. The key idea is that it shouldn't be possible to find one of the words as the result of putting two of the words together or within another word. 

For example, bear,ring, earring, ear is a very non-comma free code. Putting all the words together 


makes it very clear that you cannot uniquely separate them.

For example 


picks out a code word that shouldn't be picked out.

You can also add the condition that code words are all of the same length. A binary example of this is the two bit code 00,11. In any sequence of these code words you never get a spurious code word due to the concatenation of 00 or 11 e.g.  00111100000011 can be broken into code words in only one way no matter where you start from.



 If you are interested in this topic then the good news is that there is a pdf of the next pre-publication chunk of The Art of Computer Programming on Backtracking and it covers algorithms that apply to comma free codes:

The Art of Computer Programming: Volume 4, Pre-fascicle 5B: Introduction to Backtracking




More Information

Computer Musings 

The Art of Computer Programming: Volume 4, Pre-fascicle 5B: Introduction to Backtracking (pdf)

Related Articles

Donald Knuth's Christmas Tree Lecture 

Donald Knuth and the Art of Programming     

Another Chunk of The Art of Computer Programming 

How not to shuffle - the Knuth Fisher-Yates algorithm 


To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on, Twitter, FacebookGoogle+ or Linkedin



Apache Updates Geronimo Arthur

Apache Geronimo Arthur has been updated with support for Common-compress, XBean, and ensures the default options are compatible with last GraalVM release.

Can C++ Be As Safe As Rust?

Herb Sutter is a well known and respected C++ champion and he thinks that the language only needs a few tweaks to make it as safe as Rust. Can this be true?

More News


raspberry pi books



or email your comment to:




Last Updated ( Thursday, 24 December 2015 )