Knuth's 21st Not Christmas Tree Lecture
Thursday, 24 December 2015

This year's annual lecture by Donald Knuth lecture is a keeper. It is on a topic that will be understandable even if you don't know a lot about it - commafree codes.

Donald Knuth should be known to you but just in case - he invented the TeX computer typesetting system, is the author of the monumental work The Art of Computer Programming and of many algorithms. He also has a strange sense of humour so look out for it when you watch the video.

The tradition for the Christmas Tree Lecture that is given at Stanford University in December is that is related to something new about trees learned during the year. For the first time, in the 21st of the series, the talk is nothing to do with trees and instead is about commafree codes:

"A commafree code is a set of codewords that can be read easily without spaces or other delimiters between words.

In 1965, Willard Eastman discovered a beautiful but underappreciated way to construct commafree block codes of all odd lengths, over an infinite alphabet. Professor Knuth will explain his construction and its interesting connection to questions of iteration versus recursion."

Sounds easy or obvious even but a few attempts at creating such a thing will convince you that it is tricky. A more modern term is self-synchronizing code. Basically the idea is that you don't have to send markers within a code to divide up the code words because they can be separated without such markers. That is if you take a set of words and run them together without spaces or punctuation then it should be possible to separate them again if they form a comma free code. The key idea is that it shouldn't be possible to find one of the words as the result of putting two of the words together or within another word.

For example, bear,ring, earring, ear is a very non-comma free code. Putting all the words together

bearringearringear

makes it very clear that you cannot uniquely separate them.

For example

bearringearringear

picks out a code word that shouldn't be picked out.

You can also add the condition that code words are all of the same length. A binary example of this is the two bit code 00,11. In any sequence of these code words you never get a spurious code word due to the concatenation of 00 or 11 e.g.  00111100000011 can be broken into code words in only one way no matter where you start from.

If you are interested in this topic then the good news is that there is a pdf of the next pre-publication chunk of The Art of Computer Programming on Backtracking and it covers algorithms that apply to comma free codes:

The Art of Computer Programming: Volume 4, Pre-fascicle 5B: Introduction to Backtracking

Computer Musings

#### Related Articles

Donald Knuth's Christmas Tree Lecture

Donald Knuth and the Art of Programming

Another Chunk of The Art of Computer Programming

How not to shuffle - the Knuth Fisher-Yates algorithm

 GitHub Announces AI-Powered Changes09/11/2023GitHub has announced changes to its platform that will use AI "in every step of the developer lifecycle". The intention is to make natural language become the universal programming language. The annou [ ... ] + Full Story GameMaker Free For Non-Commercial Use30/11/2023GameMaker, for creating 2D platform games and now part of the Opera family, has made a change to its prices and terms and it is good news. GameMaker is now free for non-commercial purposes on all [ ... ] + Full Story More News