Research at MIT has resulted in the implementation of a system that can take a description of a task in standard English and churn out the code to do the job. Is this the end of programming?
Of course not.
Every now and again a research team manages to analyze some small sub-domain of English and devise a way to translate it into a formal language. Recently there has been a tendency to create more specialized formal languages, i.e. Domain Specific Languages, which apply to areas where they read almost like natural language, at least for the task in question. The point is that there is a lot of regularity in natural language that you can exploit to translate it into a more formal language, as long as the area of discourse is small enough.
This is all interesting and valuable work, but the problem is that people overestimate its importance and hail it as a breakthrough with headlines like, recursion intended, "Who Needs Programming Languages" and "Bye bye Codecademy".
Recent papers from MIT's Computer Science and Artificial Intelligence Laboratory demonstrate two general approaches.
The first was presented in June at the annual Conference of the North American Chapter of the Association for Computational Linguistics. In it, Regina Barzilay, associate professor of computer science and electrical engineering at MIT CSAIL, and her graduate student Nate Kushman used examples harvested from the Web to train a computer system to convert natural-language descriptions into regular expressions.
As you probably know, regular expressions have a reputation for difficult syntax and sometimes hard-to-comprehend semantics. The point here is that in many cases the idea behind the match is easy to express in natural language. For example, "find a five-letter word starting with 'B' and ending in 'ic'" can be converted into a much more complex-looking regular expression. That this can be done automatically is less of a shock than the innocent might expect. The conversion is achieved by generating a suitable regular expression corresponding to the English and then reducing it to its most compact form. At the paper's presentation, the audience was asked to implement regular expressions for a fairly simple text search; only a small number of people got the "best" answer, the one proposed by the algorithm.
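To see the gap between the English and the formal notation, here is a hand-written sketch (not the algorithm's output) of a regular expression for that example, in Python:

```python
import re

# "A five-letter word starting with 'B' and ending in 'ic'":
# \b marks a word boundary, and [A-Za-z]{2} supplies the two
# middle letters between the literal "B" and the literal "ic".
pattern = re.compile(r"\bB[A-Za-z]{2}ic\b")

text = "Basic arithmetic is Baltic in its logic."
print(pattern.findall(text))  # ['Basic']
```

Note that "Baltic" and "logic" are rejected: the first has six letters, the second doesn't start with "B". Even this simple description needs three distinct pieces of regex machinery, which is exactly why expressing the intent in English is attractive.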
The second paper was presented at the Association for Computational Linguistics’ annual conference in August. Barzilay and another of her graduate students, Tao Lei, teamed up with professor of electrical engineering and computer science Martin Rinard and his graduate student, Fan Long, to describe a system that automatically learns how to handle data stored in different file formats, based on specifications prepared for a popular programming competition.
The system can create an input parser from a specification for the file format written in natural, but precise, language. They tested it on 100 file formats specified as part of the Association for Computing Machinery’s International Collegiate Programming Contest. The system was able to produce input parsers for about 80 percent of the specifications. In the remaining cases, changing just a word or two of the specification usually yielded a working parser.
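To make the task concrete, here is a hypothetical example of the kind of parser such a system would have to produce. The specification text and the function are purely illustrative, not taken from the paper; they mimic the style of a typical programming-contest input description such as "the first line contains an integer n; each of the next n lines contains two space-separated integers":

```python
# Illustrative hand-written parser for the hypothetical spec above --
# the MIT system's goal is to generate code like this automatically
# from the natural-language specification.
def parse_input(text):
    lines = text.strip().splitlines()
    n = int(lines[0])                       # "the first line contains an integer n"
    pairs = [tuple(int(x) for x in line.split())
             for line in lines[1:1 + n]]    # "each of the next n lines ..."
    return pairs

sample = "2\n3 4\n10 20\n"
print(parse_input(sample))  # [(3, 4), (10, 20)]
```

The point is that the structure of the parser follows the structure of the English sentence almost line for line, which is what makes the translation tractable.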
So is this important?
To quote from the MIT press release:
“This is a big first step toward allowing everyday users to program their computers without requiring any knowledge of programming language,” says Luke Zettlemoyer, an assistant professor of computer science and engineering at the University of Washington. “The techniques they have developed should definitely generalize to other related programming tasks.”
However, I wouldn't give up learning to program just yet, because the great step forward really hasn't happened. Both studies are focused on a sub-class of grammar called a regular grammar. As you might guess, a regular grammar is what specifies a regular expression; it can also be used to specify a file format. Indeed, a file input parser is just another example of a regular expression. However, natural language needs more than a regular grammar - a context-sensitive grammar or better - to express even a part of its complexity.
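A minimal sketch (my own illustration, not from the papers) shows both the power and the limit. A regular language such as "some a's followed by some b's" is recognized by a finite state machine with a handful of states, but anything that requires counting, such as equal numbers of a's and b's, is provably beyond any regular grammar:

```python
# Two states suffice for the regular language a*b*:
# state "A" while reading a's, state "B" once the first b appears.
def matches_a_star_b_star(s):
    state = "A"
    for ch in s:
        if state == "A":
            if ch == "b":
                state = "B"          # switch: no a's allowed after this
            elif ch != "a":
                return False
        else:                        # state "B": only b's from here on
            if ch != "b":
                return False
    return True

print(matches_a_star_b_star("aaabb"))  # True
print(matches_a_star_b_star("aabba"))  # False: an 'a' after a 'b'
```

No amount of extra states lets a machine like this check that the counts of a's and b's are equal; for that you need at least a context-free grammar, and general programming demands far more still.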
What both of these excellent pieces of research seem to prove is that if you restrict natural language to expressing things that can be specified by a regular grammar then it is possible to convert the natural language to a more formal language governed by another regular grammar.
So we can all breathe a sigh of relief because general programming needs a Chomsky Type-0 grammar and not a Type-3 regular grammar.
So the headline for this news should have read:
"A natural language specification of something that is governed by a regular grammar can be converted into a formal language governed by a regular grammar."
which is still interesting, but would you have read it?
After all, a finite state machine is described by a regular grammar and isn't Turing complete.