Knuth's 21st Not Christmas Tree Lecture
Thursday, 24 December 2015

This year's annual lecture by Donald Knuth lecture is a keeper. It is on a topic that will be understandable even if you don't know a lot about it - commafree codes. 

Donald Knuth should be known to you but just in case - he invented the TeX computer typesetting system, is the author of the monumental work The Art of Computer Programming and of many algorithms. He also has a strange sense of humour so look out for it when you watch the video.  

The tradition for the Christmas Tree Lecture that is given at Stanford University in December is that is related to something new about trees learned during the year. For the first time, in the 21st of the series, the talk is nothing to do with trees and instead is about commafree codes:

"A commafree code is a set of codewords that can be read easily without spaces or other delimiters between words.

In 1965, Willard Eastman discovered a beautiful but underappreciated way to construct commafree block codes of all odd lengths, over an infinite alphabet. Professor Knuth will explain his construction and its interesting connection to questions of iteration versus recursion."

Sounds easy or obvious even but a few attempts at creating such a thing will convince you that it is tricky. A more modern term is self-synchronizing code. Basically the idea is that you don't have to send markers within a code to divide up the code words because they can be separated without such markers. That is if you take a set of words and run them together without spaces or punctuation then it should be possible to separate them again if they form a comma free code. The key idea is that it shouldn't be possible to find one of the words as the result of putting two of the words together or within another word. 

For example, bear,ring, earring, ear is a very non-comma free code. Putting all the words together 

bearringearringear 

makes it very clear that you cannot uniquely separate them.

For example 

bearringearringear 

picks out a code word that shouldn't be picked out.

You can also add the condition that code words are all of the same length. A binary example of this is the two bit code 00,11. In any sequence of these code words you never get a spurious code word due to the concatenation of 00 or 11 e.g.  00111100000011 can be broken into code words in only one way no matter where you start from.

 

 

 If you are interested in this topic then the good news is that there is a pdf of the next pre-publication chunk of The Art of Computer Programming on Backtracking and it covers algorithms that apply to comma free codes:

The Art of Computer Programming: Volume 4, Pre-fascicle 5B: Introduction to Backtracking

 

 

stanfordlogoforknuth

More Information

Computer Musings 

The Art of Computer Programming: Volume 4, Pre-fascicle 5B: Introduction to Backtracking (pdf)

Related Articles

Donald Knuth's Christmas Tree Lecture 

Donald Knuth and the Art of Programming     

Another Chunk of The Art of Computer Programming 

How not to shuffle - the Knuth Fisher-Yates algorithm 

 

To be informed about new articles on I Programmer, sign up for our weekly newsletter, subscribe to the RSS feed and follow us on, Twitter, FacebookGoogle+ or Linkedin

 

Banner


TIOBE Says C Is Declining!
11/09/2024

The TIOBE index is always fun and often provides a talking point. This month sees C falling by one place and Java going up by one. What can this mean?



OSI Updates Open Source AI Definition
06/09/2024

A new version of the Open Source AI Definition has been released by the Open Source Initiative. The update is part of the drive to have a stable release of the definition ready by the end of October 2 [ ... ]


More News

 

kotlin book

 

Comments




or email your comment to: comments@i-programmer.info

 

<ASIN:0201558025>

<ASIN:0321751043>

Last Updated ( Thursday, 24 December 2015 )