LINQ may look like SQL, but this is just a convenience so that LINQ to SQL can be used by people too lazy to learn new syntax.
You need to think of LINQ as an attempt to bring “querying” type operations into the mainstream of the .NET language of your choice. The point is that LINQ isn’t really about databases; that just happens to be a really good use of it.
It’s a general purpose and extensible approach to working with structured data. In a more general setting you can think of it as yet another aspect of functional programming that is making its way into .NET.
In this article I’m going to demonstrate how LINQ works at the deepest level. When you understand how LINQ is implemented you will be better able to use it and to extend it, and you can’t help but admire its simplicity.
A query is basically an operation where you specify a subset of the data that you actually want to work with. In many ways you can think of “querying” as the point where software creation becomes complicated and the real world enters the design problem. Put simply, data is messy and usually not organised in a way that suits the job, and the task is always more difficult and fragile than you could possibly imagine.
LINQ can’t do anything about the inherent complexity of querying data but it does deliver it in a uniform and integrated format.
The key idea is that LINQ is a set of classes and methods that will work with any class as a data source as long as it implements the IEnumerable<T> interface.
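To make this concrete, here is a minimal sketch, not taken from the original article, that runs the same query over an array and a List<int>. The class and method names (LinqSketch, EvenSquares, EvenSquaresWithMethods) are made up for illustration. It also shows that the SQL-like query syntax is just another way of writing ordinary method calls:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class LinqSketch
{
    // Hypothetical helper: accepts any IEnumerable<int>, whatever the concrete type.
    public static List<int> EvenSquares(IEnumerable<int> source)
    {
        // Query syntax, the SQL-like form.
        IEnumerable<int> query = from n in source
                                 where n % 2 == 0
                                 select n * n;
        return query.ToList();
    }

    // The same query written as plain method calls.
    public static List<int> EvenSquaresWithMethods(IEnumerable<int> source)
    {
        return source.Where(n => n % 2 == 0)
                     .Select(n => n * n)
                     .ToList();
    }

    static void Main()
    {
        int[] array = { 1, 2, 3, 4 };
        var list = new List<int> { 1, 2, 3, 4 };

        Console.WriteLine(string.Join(",", EvenSquares(array)));           // 4,16
        Console.WriteLine(string.Join(",", EvenSquaresWithMethods(list))); // 4,16
    }
}
```

Neither helper knows or cares what kind of collection it is given; all that matters is that the source implements IEnumerable<T>.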
As, in a sense, this is the foundation on which the rest of LINQ is built, this is a good place to start. In fact it is probably a good idea to take one step further back and look at the whole idea of enumerators.
Implementing an enumerator
The basic idea of an enumerator is to supply each item in a collection of data items one-by-one – and usually in no specified order.
In .NET an enumerator is a class that provides a number of methods in addition to the basic one-by-one enumeration of the items. To be specific an enumerator has to supply:
- Reset – initialises the enumeration so that it starts over again
- Current – returns the current item
- MoveNext – updates the index to the next item
Of course this all implies that there is a numeric index to the current item, which starts off set to -1 to indicate that it “points” before the start of the collection.
Calling Current repeatedly is allowed but if the index is invalid then you are supposed to throw an exception. MoveNext returns true if the result is a valid index and false if the resulting index isn’t pointing to a valid item.
Put together these three methods make up the IEnumerator interface and any class that supports enumeration does so by implementing this interface.
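As a rough illustration of how these three members are used together, the sketch below drives an enumerator by hand. It borrows the enumerator that a plain array already provides, purely as a ready-made source. The class and method names (EnumeratorProtocol, SumAll) are invented for the example:

```csharp
using System;
using System.Collections;

static class EnumeratorProtocol
{
    // Hypothetical helper: sums the ints an enumerator supplies,
    // using nothing but the IEnumerator members.
    public static int SumAll(IEnumerator e)
    {
        int total = 0;
        while (e.MoveNext())          // advance; false once there is no next item
        {
            total += (int)e.Current;  // Current is typed as object, so a cast is needed
        }
        return total;
    }

    static void Main()
    {
        int[] data = { 10, 20, 30 };
        IEnumerator e = data.GetEnumerator();

        Console.WriteLine(SumAll(e)); // 60
        e.Reset();                    // back to "before the first item"
        Console.WriteLine(SumAll(e)); // 60 again
    }
}
```

Notice that MoveNext has to be called once before the first item can be read, which is exactly what an initial index of -1 implies.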
You don’t have to use a separate class to implement IEnumerator; you can do the job in the same class that implements the inner workings of the data collection if that’s convenient. It is more usual to create the enumerator as a separate class and write a constructor that creates an instance of the enumerator ready to be used.
A non-generic simple enumerator
To see the simplest possible example of an enumerator, let’s create everything in a single class.
Our example data collection is also going to be a little strange in that no collection of data ever exists. Instead, when the collection is instantiated the constructor is supplied with the size of the collection, and from then on a random number generator is used each time a data item is required. This is clearly not very useful for anything other than testing, hence the name of the class:
class TestCollection : IEnumerator
Generating random data also has the advantage that the example doesn’t use any of the existing collection data types which all supply enumerators and hence tend to confuse the issue. TestCollection doesn’t make use of anything prebuilt in the .NET framework to implement its enumerator.
To make it work you need to add:
using System.Collections;
To get us started we need some private variables: something to hold the instance of the random number generator, something to store the size of the collection and, of course, an index to the current position in the collection:
private Random rnd;
private int Size;
private int loc = -1;
We need the constructor to set everything up ready for the collection to be enumerated:
public TestCollection(int s)
{
    rnd = new Random();
    Size = s;
}
Now we have to implement the methods of the IEnumerator interface. The Reset method is simply:
public void Reset() { this.loc = -1; }
You don’t really need the “this” but it helps to emphasise the fact that the enumerator works with the instance.
Current takes the form of a read-only property:
if (this.loc > -1)
Notice that Current returns an object rather than an int. This is how it has to be, as the interface defines Current in this way. Ideally we would like to return a result of a specified type, but without generics this is difficult. More about this problem later because, to use LINQ, we have no choice but to use a generic enumerator.
In principle we should test to make sure loc is sensible, i.e. that it actually indexes an element of the collection, and throw an exception if it doesn’t. In this case we simply return -1. Whatever is using the enumerator usually stops enumerating when the MoveNext method returns false to indicate that there is no next item:
if (loc < this.Size)
Now we have the complete enumerator and it’s very easy to see how it works but we can’t as yet make use of it.
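Pulling the fragments above together, the whole class might look something like the sketch below. Several details are assumptions rather than the article's own code: the upper bound of 500 on the random values, the loc++ step at the start of MoveNext, the use of public members rather than explicit interface implementation, and the TestCollectionDemo helper. It can't yet be used in a foreach loop, but it can be driven by hand, which is a handy way to check that it behaves:

```csharp
using System;
using System.Collections;

class TestCollection : IEnumerator
{
    private Random rnd;
    private int Size;
    private int loc = -1;   // -1 means "before the first item"

    public TestCollection(int s)
    {
        rnd = new Random();
        Size = s;
    }

    public void Reset()
    {
        this.loc = -1;
    }

    public object Current
    {
        get
        {
            if (this.loc > -1)
                return this.rnd.Next(500); // assumption: range picked for illustration
            return -1;                     // not yet positioned on an item
        }
    }

    public bool MoveNext()
    {
        loc++;                             // assumption: advance, then test the new index
        return loc < this.Size;
    }
}

static class TestCollectionDemo
{
    // Hypothetical helper: walks the enumerator by hand and counts the items.
    public static int CountItems(TestCollection items)
    {
        int count = 0;
        items.Reset();
        while (items.MoveNext())
        {
            int value = (int)items.Current; // cast needed because Current is an object
            if (value < 0 || value >= 500)
                throw new InvalidOperationException("unexpected value");
            count++;
        }
        return count;
    }

    static void Main()
    {
        Console.WriteLine(CountItems(new TestCollection(5))); // 5
    }
}
```

A stricter version of Current would also check that loc is less than Size and throw an exception, commonly InvalidOperationException, rather than quietly returning a value.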