Author: Terence Parr
Publisher: Pragmatic Bookshelf
Aimed at: Language implementers
Pros: Explains syntax and semantics in practice
Cons: Emphasis on DSLs
Reviewed by: Mike James
This book takes a very traditional syntax and parsing approach to language creation. Is this a workable approach?
The subtitle of this book is Create Your Own Domain-Specific and General Programming Languages, and the mention of Domain-Specific Languages (DSLs) is probably a bit of a red herring. DSLs were, and maybe still are, one of the over-hyped ideas introduced because the developer world seemed to have run out of new and trendy acronyms that promised to solve all current and future problems.
DSLs are occasionally useful but mostly overkill. Even if you think that DSLs are a great idea and you want to get involved in creating one, this book probably isn't the way to go. DSLs need extensive new tools that make their construction easy - such as Microsoft's Visualization and Modeling SDK. This book takes a very traditional syntax and parsing approach to language creation - and if this is what you want, it happens to be very good indeed.
If you have ever done a course on compiler theory or syntax you might well have failed to see the relevance of much of the deep and complex theory of formal languages to the actual problem of implementing a real computer language. (See Grammar and Torture for an overview.)
What this book does is cover more or less the same ground as a traditional course, or as the best-known book on the topic - Compilers: Principles, Techniques, and Tools (see sidebar), the so-called "Dragon" book because of its cover illustration - but it does so in a readable and practically oriented way.
The only minor problems with the book are that it keeps referring to DSLs as motivation, which I found irrelevant, and that it uses the ANTLR parser generator, which is fine as long as you want to use ANTLR.
The book is divided into four sections. The first, "Getting started with parsing", is an introduction to the problem that you are going to be solving. It introduces phrase structure grammars very slowly and gently - all the while making the connection with why you should be interested in this theory.
Chapter 2 gets as far as LL(k) recursive descent parsing and succeeds in making it seem like the obvious way of tackling the problem. The really good thing about this part of the book is that, even though it's on syntax, it manages to make and keep the connection between syntax and semantics, i.e. code or behaviour generation, clear.
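To give a flavour of why recursive descent can feel like the obvious approach, here is a minimal sketch of my own, not code from the book. It assumes a toy arithmetic grammar and needs only one token of lookahead, so it is LL(1) rather than the general LL(k) case. Each grammar rule becomes one method, and each method returns a value, which is one simple way of keeping syntax and semantics side by side. The names Parser, tokenize and evaluate are purely illustrative.

```python
import re

# Hypothetical toy grammar (an assumption for illustration, not from the book):
#   expr   : term (('+' | '-') term)*
#   term   : factor (('*' | '/') factor)*
#   factor : NUMBER | '(' expr ')'

TOKEN = re.compile(r"\s*(?:(\d+)|(.))")


def tokenize(text):
    """Split the input into ('NUM', int) and ('OP', char) tokens."""
    tokens = []
    for number, op in TOKEN.findall(text):
        if number:
            tokens.append(("NUM", int(number)))
        elif op.strip():  # ignore stray whitespace matches
            tokens.append(("OP", op))
    tokens.append(("EOF", None))
    return tokens


class Parser:
    def __init__(self, text):
        self.tokens = tokenize(text)
        self.pos = 0

    def peek(self):
        """Return the single lookahead token without consuming it."""
        return self.tokens[self.pos]

    def match(self, kind, value=None):
        """Consume the lookahead token if it is the expected one."""
        tok_kind, tok_value = self.peek()
        if tok_kind != kind or (value is not None and tok_value != value):
            raise SyntaxError(f"expected {value or kind}, found {tok_value!r}")
        self.pos += 1
        return tok_value

    def expr(self):
        value = self.term()
        while self.peek() in (("OP", "+"), ("OP", "-")):
            op = self.match("OP")
            right = self.term()
            value = value + right if op == "+" else value - right
        return value

    def term(self):
        value = self.factor()
        while self.peek() in (("OP", "*"), ("OP", "/")):
            op = self.match("OP")
            right = self.factor()
            value = value * right if op == "*" else value / right
        return value

    def factor(self):
        kind, _ = self.peek()
        if kind == "NUM":
            return self.match("NUM")
        self.match("OP", "(")
        value = self.expr()
        self.match("OP", ")")
        return value


def evaluate(text):
    """Parse and evaluate a whole expression, rejecting trailing input."""
    parser = Parser(text)
    result = parser.expr()
    parser.match("EOF")
    return result
```

Calling evaluate("2 + 3 * (4 - 1)") walks expr, term and factor in exactly the order the grammar suggests. A real implementation would more likely build a tree than compute a value directly.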
Part Two is on Analysing Languages and with this title it might be expected to get somewhat abstract. The good news is that it remains practical and code-based. In this section we learn about constructing syntax trees using ANTLR, walking and rewriting trees, managing symbols and symbol tables and managing types. If you have encountered any of this material in courses or books that left you baffled as to why you were bothering, then try again with this book and understand what it's all about.
Part Three is on building interpreters and it continues to be practical with a discussion of both high-level and bytecode interpreters. You don't get details of any particular bytecode interpreter or how to use, say, the Java Virtual Machine. This is a section that deals with the different ways that interpreters can be implemented - stack-based versus register-based, for example.
The final part is about translating and generating languages and if the book is about DSLs then this is the section that delivers on that promise. What it is really about is finding some small languages to give examples of the previous sections' ideas in action. We see a language for building 3D scenes, XML, configuration files, adding a new type to Java and so on. This section makes a nice rounding off to the book.
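The stack-based versus register-based distinction is easier to see in code than in words. What follows is a rough sketch under my own assumptions, not the book's bytecode: the instruction names (push, load, add, mul, ret) and the tuple encoding are invented for illustration. Both machines compute (2 + 3) * 4.

```python
def run_stack(code):
    """Stack-based: operands are implicit, popped from and pushed to a stack."""
    stack = []
    for instr in code:
        op = instr[0]
        if op == "push":
            stack.append(instr[1])
        elif op == "add":
            b, a = stack.pop(), stack.pop()
            stack.append(a + b)
        elif op == "mul":
            b, a = stack.pop(), stack.pop()
            stack.append(a * b)
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return stack.pop()


def run_register(code, num_registers=4):
    """Register-based: every instruction names its operand registers."""
    regs = [0] * num_registers
    for instr in code:
        op = instr[0]
        if op == "load":
            _, dst, const = instr
            regs[dst] = const
        elif op == "add":
            _, dst, a, b = instr
            regs[dst] = regs[a] + regs[b]
        elif op == "mul":
            _, dst, a, b = instr
            regs[dst] = regs[a] * regs[b]
        elif op == "ret":
            return regs[instr[1]]
        else:
            raise ValueError(f"unknown instruction {op!r}")
    raise ValueError("program ended without 'ret'")


# (2 + 3) * 4 for each hypothetical machine
STACK_PROGRAM = [("push", 2), ("push", 3), ("add",), ("push", 4), ("mul",)]
REGISTER_PROGRAM = [
    ("load", 0, 2),
    ("load", 1, 3),
    ("add", 0, 0, 1),
    ("load", 1, 4),
    ("mul", 0, 0, 1),
    ("ret", 0),
]
```

The stack instructions are shorter because their operands are implied, while the register instructions spell out where every value lives. This sketch makes no claim about which design performs better.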
The need to introduce languages into your applications isn't something that is common but it is a lot of fun. Even a little knowledge of syntax and grammar can allow you to think about some tasks - like processing your applications' configuration files - in a different way. You often don't need a full parser. Just a regular expression facility and a few recursive functions will often allow you to create something more advanced than a simple name-value pair system. Of course, if you are interested in implementing your own language or getting involved in an existing language project then you simply have to read this book.
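As a small illustration of the configuration file point, here is a sketch of my own that uses one regular expression as a tokenizer and one recursive function to handle nested sections. The file format (name = value pairs plus name { ... } blocks) and the function names parse_block and parse_config are assumptions made up for this example, and all values are kept as strings.

```python
import re

# Tokens are '{', '}', '=' or any run of other non-space characters.
TOKEN = re.compile(r"[{}=]|[^\s{}=]+")
PUNCTUATION = ("{", "}", "=")


def parse_block(tokens, pos=0, nested=False):
    """Parse entries until the block ends; return (dict, next position)."""
    result = {}
    while pos < len(tokens):
        key = tokens[pos]
        if key == "}":
            if not nested:
                raise ValueError("unexpected '}'")
            return result, pos + 1
        if key in PUNCTUATION or pos + 1 >= len(tokens):
            raise ValueError(f"bad entry near {key!r}")
        following = tokens[pos + 1]
        if following == "=":
            # name = value
            if pos + 2 >= len(tokens) or tokens[pos + 2] in PUNCTUATION:
                raise ValueError(f"missing value for {key!r}")
            result[key] = tokens[pos + 2]
            pos += 3
        elif following == "{":
            # name { ... } : recurse to parse the nested section
            result[key], pos = parse_block(tokens, pos + 2, nested=True)
        else:
            raise ValueError(f"expected '=' or '{{' after {key!r}")
    if nested:
        raise ValueError("missing '}'")
    return result, pos


def parse_config(text):
    """Turn configuration text into nested dictionaries of strings."""
    config, _ = parse_block(TOKEN.findall(text))
    return config


SAMPLE = """
name = demo
server {
    host = localhost
    port = 8080
    tls { enabled = true }
}
"""
```

Here parse_config(SAMPLE) yields a dictionary whose "server" entry is itself a dictionary containing a nested "tls" dictionary, which is already more than a flat name-value list can express.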
If you want to know about the implementation of computer languages then buy and read a copy of this book - it's great fun!