Arrays are a fundamental tool but there is another… the record or “structure”.
A record can be thought of as a mixed collection of data whereas an array composed of data all of the same type – e.g. a table of ages.
You can see the problem with the record - its Storage Mapping Function is going to be complicated because each element isn't going to be the same size. However the record isn't generally accessed by index and we need a more sophisticated SMF anyway.
The archetypal record is the name and address card.
Here you have a data structure which is composed of a name, an address and a telephone number, say, and these are three different types of data, not the same thing repeated three times as in an array.
Usually a record is defined using qualified names or fields rather than an index. For example, the record JOHN might consist of three “fields”:
You can think of this as a sort of array made up of three variables. The entire record is just called “JOHN” and you can refer to a specific field by adding the field name to the record name:
You can see that the record is just the computer equivalent of the old-fashioned card record that you would find, in fact do still find, in almost any office.
You may notice that you can’t simply run through all of the fields of a record like you can an array, but this doesn’t usually matter in practice because as each field is different you generally don’t want to do the same thing to each one. That is you don't often want to iterate though a record and you don't want to enumerate a record either.
Record fields are generally processed within a program one at a time by name. You can think of this as "random access only" if it helps. It is also worth saying that often the ability to process a record in sequential fashion is highly desirable and its a big pain when the language in use doesn't support it.
Records started off life as part of business-oriented languages such as Cobol but slowly they moved into mainstream general programming as ways of storing complicated items of data.
For example, instead of using two variables, x and y to store the co-ordinates of a point on the screen, it is common practice to define a record to do the same job:
This allows you to write statements such as:
There is also another important innovation in the way that the records were used. Before you could use a record you had to provide a definition. That is if you wanted to work with a point record you first had to declare that it had two fields - one called x and one called y both integers. Once you had the definition you could use this to create as many instances of the record as you wanted.
record pointType int x int y end
might be used to define the new record type. Then when you needed an example or an instance of the type you would write something like
and after this you can write:
and so on. You can use pointType to create as many instances of the record as you like.
This is directly analogous to having a float or int variable type and then stamping out as many float or int variables as you needed. In other words you had to declare a record type before you could create instances of the type - this is the first place that the idea of extending the data types a language supported was introduces and it is a very important idea. It is also important that it split the use fo a custom record into two steps - first define the new record type and then create instances of it.
As long as the language that you are using makes full use of records, or structures as they tend to be called, when used in this way, this can be a very useful way of working but what you might not realise is that this simple idea leads on to probably the most important concept in 21st century computing – even if it was thought up in the 20th century!
The idea is based upon the desire to integrate structures into the language and programs as if they really were part of the original language.
For example, if you define a point structure then you might well want to define an operation of showing a point on the screen. Something like:
In traditional programming terms this corresponds to having a command “show” which can work with the new type of data you have just introduced, i.e. the point structure. The problem is how do you add a “show” command that knows what to do with “points”.
There are a number of possible answers but the best one that we have thought up to date is to let the “point” structure know how to show itself. To do this we have to extend the idea of what a record is.
We are going to allow a record field to be a procedure and not just a chunk of data. In simple terms a procedure is a list of instructions or a small chunk of program.
runs the small chunk of program defined as part of “point” that changes the colour of the pixel at point.x, point.y and hence “shows” it.
This idea can be elaborated so that there is no need for any code that doesn't live outside of a struct - all of the code in a program can be made part of a set of data structures.
This is a very clever idea and it was first thought of in the early 1960s by Ole-Johan Dahl and Krysten Nygaard as part of a new computer language called Simula. The language may not have caught on but the idea most certainly did.
If you haven’t recognised it then I’d better tell you that a record with procedural fields or "method" is called “an object” and the whole idea is called “object-oriented programming”.
You might also recognize the way that the definition of the record and the instance of the record are generalized. We tend to call a record definition that has methods, i.e. fields that are code, a class and an instance of the type is and object or instance fo the class.
So the record, today more commonly called a struct, is the start of not only of object oriented programming but the particular approach to it based on classes and on extending data types. This is not the only possible approach but it is the dominant one we encounter today.
So the truth is that objects are just records that allow code and data to be stored on an equal footing - I told you data structures were important.
The merge sort is an under-appreciated algorithm - yet it is neat, clever and it still has its uses. With the rise of big data, parallel methods and online processing, you can even argue that it is gr [ ... ]