Learning Regular Expressions
Author: Ben Forta
In principle, regular expressions are not connected with any particular language or application, but in practice they are. There are lots of small variations in how things are implemented in different languages, especially when you consider extensions to the basic syntax of the regular expression. This book attempts to teach you the grammar of the regular expression without committing to a particular implementation. This is a reasonable thing to attempt, but only if you keep to the core of the grammar. The real question is, can this be enough to let you actually use regular expressions in the real world?
Chapter 1 starts off with a general look at the idea of regular expressions and what they are used for. It couldn't be any more simple or straightforward. So, on to Chapter 2 which is where we first meet the grammar of the regular expression. It too is simple, and couldn't be expressed any more simply, and here we might hit a potential snag. Is it possible to be too simple? One of the first examples is how to search for "Ben" in a longer text. The regular expression for "Ben" is "Ben". This is displayed in a table layout and it is only as clear as intended if you don't go looking for more complexity.
A little later we have a slightly different problem when the metacharacter "." is introduced. In this case, the examples are all file names and they all include a dot between the name and the extension. Is this dot the same as that dot? Does the first dot match a dot, and so on? Interestingly, the chapter goes on to explain all of this in very simple terms and you are left with the impression that the use of text with dots and regular expressions with dots was intentional and a teaching tool. Personally, I think if it was an intentional way of making a point (pun intended) then it is a dangerous one. I would have preferred an example without a dot in the text to be searched and then an example that had one. Clarity first, complexity second.
Don't be too put off by this, however, because these problems become fewer as the book tackles more complex material.
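To make the dot ambiguity concrete, here is a minimal sketch using Python's `re` module. Python is only a choice made for illustration, since the book itself is language-neutral, and the file names are hypothetical rather than taken from the book. It shows that an unescaped dot matches any single character, while an escaped dot matches only a literal full stop.

```python
import re

# Hypothetical file names, in the spirit of the kind of list discussed above.
filenames = ["sales1.xls", "sales1_xls", "sales2.xls"]

# Unescaped dot: "." stands for any single character (except a newline by default).
unescaped = re.compile(r"sales1.xls")

# Escaped dot: "\." stands only for a literal full stop.
escaped = re.compile(r"sales1\.xls")

matched_by_unescaped = [name for name in filenames if unescaped.search(name)]
matched_by_escaped = [name for name in filenames if escaped.search(name)]

print(matched_by_unescaped)
print(matched_by_escaped)
```

The unescaped pattern also accepts the name with an underscore, which is exactly the kind of surprise that test text full of dots can hide.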
Chapter 3 is about matching sets of characters and Chapter 4 extends this to metacharacters for groups of characters that are otherwise difficult to define - e.g. whitespace.
Chapter 5 is where things get slightly trickier, with repeats: one or more, zero or more, zero or one, and intervals. Chapter 6 gets a bit easier with position matching, but things get really "interesting" at Chapter 7 with the introduction of subexpressions. This is where regular expressions start to take on a life of their own. Chapters 8, 9 and 10 deal with the most sophisticated parts of the regular expression: back references, look ahead and look behind, and conditions. Here it is necessary to take note of how different languages implement the ideas, but it isn't as disruptive to the flow as I feared.
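For readers who want to see these features side by side, the following is a small sketch in Python's `re` module. The patterns and sample strings are illustrative assumptions, not examples from the book, and other languages may spell or support some of these features differently (Python's `re`, for instance, requires look-behind patterns to have a fixed width).

```python
import re

# Repetition: "+" is one or more, "?" is zero or one, "{2,3}" is an interval.
one_or_more = re.findall(r"\d+", "a1 b22 c333")
zero_or_one = re.findall(r"colou?r", "color colour")
interval = re.findall(r"\b\d{2,3}\b", "1 22 333 4444")

# Subexpressions: parentheses make the quantifier apply to the whole group.
group_repeats = re.fullmatch(r"(ab)+", "ababab") is not None
no_group = re.fullmatch(r"ab+", "ababab") is not None

# Back reference: \1 must repeat whatever the first group matched.
repeated_word = re.search(r"\b(\w+) \1\b", "this is is a test").group(1)

# Look ahead and look behind: assert context without consuming it.
before_px = re.findall(r"\d+(?=px)", "10px 20em 30px")
after_dollar = re.findall(r"(?<=\$)\d+", "cost $40 or 50")

print(one_or_more, zero_or_one, interval)
print(group_repeats, no_group, repeated_word)
print(before_px, after_dollar)
```

Conditions are left out of the sketch because their syntax and availability vary more between implementations than the other features do.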
Chapter 11 presents a collection of common problems and their solution using regular expressions - ZIP codes, URLs, email addresses and so on.
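As an illustration of the kind of problem covered, here is a hedged sketch of a US ZIP code check in Python. The pattern and the function name `is_us_zip` are assumptions made for this example, not the book's solution, and the check covers only the format (five digits with an optional four-digit extension), not whether the code actually exists.

```python
import re

# Five digits, optionally followed by a hyphen and four more digits (ZIP+4).
ZIP_PATTERN = re.compile(r"\d{5}(-\d{4})?")


def is_us_zip(text: str) -> bool:
    """Return True if the whole string has the shape of a US ZIP code."""
    # fullmatch anchors the pattern at both ends of the string.
    return ZIP_PATTERN.fullmatch(text) is not None


for candidate in ["12345", "12345-6789", "1234", "12345-67"]:
    print(candidate, is_us_zip(candidate))
```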
So what is the overall verdict?
The book presents regular expressions in a pure form and it does it as simply as possible in a formulaic way that more-or-less works. This is not a book for the over-achieving computer science expert as it is simply too slow and doesn't appeal to higher-level simplifications. If you are a beginner and have struggled with regular expressions this might be the book that does it for you. However, this said, there is no avoiding the fact that regular expressions look increasingly complicated as you progress. I'm not sure that you can make something as densely symbolic as a regular expression simple in the way that this book attempts to. There is also the small problem, for some, that there are no wider examples - no string handling functions or any programs showing regular expressions fitting in with other language features. There can't be, as the book is as language-neutral as it is possible to be.
If you are a beginner and want to take the challenge of understanding regular expressions, then this might well be the best place to start.
Last Updated (Saturday, 02 February 2019)