Page 1 of 3
What makes the difference between a truly awful and a truly great programmer is the way that they use data. The code matters but it is almost an after thought once you have decided on the data structure. We may all want to learn to code but perhaps learning to "data" would be a better goal.
Also see: Data Structures Part II - Introduction Stacks And Trees
You could say that first you learn programming and then you graduate to “data-ing”.
Personally I think that Niklaus Wirth said it better in the title of his classic text on computer science:
Algorithms + Data structures = Programs
Here I am going to deal with two fundamental data structures- the array and the record.
The array leads on to other more interesting data structures - mainly based on the associative array or dictionary.
The record on the other hand leads on to one of the nicest ideas in programming - the object.
If you already know about objects then you might find it surprising that they spring out of not some deep theory about code but from a generalization of a simple data structure.
The most basic of data abstractions is the variable. It's an area of memory that you can store one item of data in.
Of course exactly what you mean by "one item" of data varies according to what you are doing. An integer is clearly one item of data but a string of characters?
The key idea is that for the current purpose a variable stores a chunk of data that has no structure of its own i.e. it is what is stored in the variable and what is retrieved when you use the variable.
Variables are all very well but you quickly discover that they aren’t enough.
For example if you want to store a table of numbers then you really do need something more than a single variable. The most common way of storing tables of data is to use an “array”.
This is an “indexed” collection of variables and the meaning of “indexed” will become apparent in a moment.
For example, in Basic you might write:
This creates an array with ten “elements” - A(1), A(2), A(3) and so on to A(10).
Each element is a variable in its own right and you can store a single item of data in it. For example:
stores 15 in A(3).
An array of variables
Why bother with an array?
Why not just use 10 distinct variables?
The answer is this index business.
The power of an array comes from being able to pick out which element of the array you are interested in using by way of a variable. For example:
again stores 15 in element 3 of the array.
In this case I is called the “index” and using this sort of idea you can write programs that lookup information in arrays such as “give me the 56th element” or process the entire array by changing the index as part of the working of the program. Notice that the key idea is the ability to select which variable you are referring to at run time.
That is you can write A1 or even A(1) and this will always refer to the same variable and which one id defined at compile time. In more general terms when you write A(1) in a program you can deduce which variable is being referred to at once. However A(I) is anything you like depending on the value that I has at the time.
Consider the pseudo code:
You cannot know which element of A is printed and what is more it is different each time you run the code.
Arrays bring dynamic to data.
Index and Enumerator
Today there is a concept that is very similar to the indexed array but not quite the same - the enumeration. Often indexed arrays are used in a very simply and restrictive way. They are in principle fully "random" access. That is you can get any element ot the array any time you want it and you can get the entire set of elements in any order. However if you look at the way arrays are commonly used you find that they are accessed sequentially a lot of the time:
For I from 1 to 100
This is so common that it is useful to generalize it to other sequential access data types called enumerations. All you need for an enumeration is a function that retrieves the next item, usually called the enumerator and you can write things like:
For item In A
Some times enumerations are just indexed arrays that give you an enumerator as well as an index. Sometimes they are pure enumerations that don't allow indexed access.
Enumerations make writing loops that scan though the data very easy because you can forget about the value of the index. However they do have some limitations. For example it is, depending on the language, difficult to arrange to scan though two enumerations at the same time. That is the enumeration equivalent of
For I from 1 to 100
Is often difficult to write.
Enumerations are sequential arrays and indexed arrays are random access.