From Data To Objects

Written by Alex Armstrong

Thursday, 03 January 2019


For the record

Arrays are a fundamental tool, but there is another: the record or “structure”, which is in many senses another generalization of the array.

A record can be thought of as a mixed collection of data, whereas an array is composed of data all of the same type, e.g. a table of ages.

You can see the problem with the record: its Storage Mapping Function (SMF) is going to be complicated because the elements aren't all going to be the same size. However, the record isn't generally accessed by index, so we need a more sophisticated SMF anyway.
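As a rough sketch of what such an SMF might look like, assume every field has a fixed size in bytes. Instead of multiplying an index by an element size, the mapping looks up a precomputed offset for each field name. The field names, sizes and the base address used below are all hypothetical, chosen only for illustration.

```python
# Hypothetical fixed-size fields for a name-and-address record: (name, size in bytes).
FIELDS = [("name", 20), ("address", 30), ("telephone", 10)]


def build_offsets(fields):
    """Return ({field name: (offset, size)}, total record size)."""
    offsets = {}
    position = 0
    for name, size in fields:
        # Each field starts where the previous one ended.
        offsets[name] = (position, size)
        position += size
    return offsets, position


OFFSETS, RECORD_SIZE = build_offsets(FIELDS)


def field_address(base, field):
    """Record SMF: base address of the record plus the named field's offset."""
    offset, _size = OFFSETS[field]
    return base + offset


print(OFFSETS)
print(field_address(1000, "telephone"))
```

The lookup by name replaces the arithmetic on an index that an array uses, which is why unequal field sizes stop being a problem.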

The archetypal record is the name and address card.

Here you have a data structure which is composed of a name, an address and a telephone number, say, and these are three different types of data, not the same thing repeated three times as in an array.

Usually a record is defined using qualified names or fields rather than an index. For example, the record JOHN might consist of three “fields”:

JOHN.NAME="John Doe"
JOHN.ADDRESS="1 Fortran Drive"
JOHN.TELEPHONE="12345"

You can think of this as a sort of array made up of three variables. The entire record is just called “JOHN” and you can refer to a specific field by adding the field name to the record name:

JOHN.TELEPHONE=IPROG.TELEPHONE

You can see that the record is just the computer equivalent of the old-fashioned card record that you would find, in fact do still find, in almost any office.

You may notice that you can’t simply run through all of the fields of a record like you can an array, but this doesn’t usually matter in practice because, as each field is different, you generally don’t want to do the same thing to each one. That is, you don't often want to iterate through a record and you don't want to enumerate a record either.

Record fields are generally processed within a program one at a time by name. You can think of this as "random access only" if it helps. It is also worth saying that there are times when the ability to process a record in sequential fashion is highly desirable, and it's a big pain when the language in use doesn't support it.
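To make both styles of access concrete, here is a minimal sketch in Python, assuming a dataclass is an acceptable stand-in for a record. The class name Contact is hypothetical; the field values are the ones from the JOHN example.

```python
from dataclasses import dataclass, fields


@dataclass
class Contact:
    # The three fields of the name-and-address card, all stored as text here.
    name: str
    address: str
    telephone: str


john = Contact("John Doe", "1 Fortran Drive", "12345")

# The usual case: access one field at a time by name.
print(john.telephone)

# The less common case: run through every field in declaration order.
for f in fields(john):
    print(f.name, "=", getattr(john, f.name))
```

Python happens to be one of the languages that supports the sequential case, here through dataclasses.fields.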

Records started off life as part of business-oriented languages such as Cobol but slowly they moved into mainstream general programming as ways of storing complicated items of data.

For example, instead of using two variables, x and y, to store the co-ordinates of a point on the screen, it is common practice to define a record to do the same job:

point.x=10
point.y=20

This allows you to write statements such as:

point1=point2
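In the record-based languages being described, a statement like this copies every field in one go. A sketch of the same effect in Python follows, with the caveat that plain assignment in Python only gives a second name to the same record, so a copy has to be requested explicitly. The Point dataclass is an assumption made for the example.

```python
from dataclasses import dataclass, replace


@dataclass
class Point:
    x: int
    y: int


point2 = Point(10, 20)

# replace() with no changes builds a new record with the same field values,
# which is the closest match to record assignment by copy.
point1 = replace(point2)

# Changing the copy leaves the original untouched.
point1.x = 99
print(point1, point2)
```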

There was also another important innovation in the way that records were used. Before you could use a record you had to provide a definition. That is, if you wanted to work with a point record you first had to declare that it had two fields, one called x and one called y, both integers. Once you had the definition you could use it to create as many instances of the record as you wanted.

For example:

record pointType
 int x
 int y
end

might be used to define the new record type. Then when you needed an example or an instance of the type you would write something like:

pointType myPoint

and after this you can write:

myPoint.x=10

and so on. You can use pointType to create as many instances of the record as you like.
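The two steps, define the type and then create instances, can be sketched in Python as follows. This assumes a dataclass plays the role of the record definition; the default values of 0 are an addition so that an instance can be created before its fields are set, as in the pseudocode.

```python
from dataclasses import dataclass


@dataclass
class PointType:
    # Step 1: define the record type, with two integer fields called x and y.
    # The defaults of 0 are an assumption, not part of the original definition.
    x: int = 0
    y: int = 0


# Step 2: create instances of the type, as many as needed.
my_point = PointType()
another_point = PointType()

# Each instance has its own copy of the fields.
my_point.x = 10

print(my_point)
print(another_point)
```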

This is directly analogous to having a float or int variable type and then stamping out as many float or int variables as you needed. In other words, you had to declare a record type before you could create instances of the type. This is the first place that the idea of extending the data types a language supported was introduced, and it is a very important idea. It is also important that it split the use of a custom record into two steps: first define the new record type and then create instances of it.

As long as the language that you are using makes full use of records, or structures as they tend to be called when used in this way, this can be a very useful way of working. What you might not realise is that this simple idea leads on to probably the most important concept in 21st century computing, even if it was thought up in the 20th century!

Objects

The idea is based upon the desire to integrate structures into the language and programs as if they really were part of the original language.

For example, if you define a point structure then you might well want to define an operation of showing a point on the screen. Something like:

show point

or as a function

show(point)

In traditional programming terms this corresponds to having a command “show” which can work with the new type of data you have just introduced, i.e. the point structure. The problem is how do you add a “show” command that knows what to do with “points”.

There are a number of possible answers, but the best one that has been thought up so far is to let the “point” structure know how to show itself. To do this we have to extend the idea of what a record is.

We are going to allow a record field to be a procedure or function and not just a chunk of data. In simple terms a procedure/function is a list of instructions or a small chunk of program.

For example:

point.show()

runs the small chunk of program defined as part of “point” that changes the colour of the pixel at point.x, point.y and hence “shows” it.
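Taken literally, "a record field that is a function" can be sketched in Python like this. The names PointRecord, show_field and show_as_text are hypothetical, and because there is no screen to draw on, the "show" code returns a description of what it would do rather than changing a pixel.

```python
from dataclasses import dataclass
from typing import Callable


def show_as_text(p: "PointRecord") -> str:
    # Stand-in for changing a pixel's colour: describe the action as text.
    return f"pixel at ({p.x}, {p.y}) set"


@dataclass
class PointRecord:
    x: int
    y: int
    # A field that holds code rather than data.
    show_field: Callable[["PointRecord"], str]


point = PointRecord(10, 20, show_as_text)

# In this bare-bones form the record has to be handed to its own field explicitly.
print(point.show_field(point))
```

Languages with built-in objects remove that last bit of clumsiness by passing the record to the function automatically.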

This idea can be elaborated so that there is no need for any code that lives outside of a struct - all of the code in a program can be made part of a set of data structures.

Notice that we have changed

show(point)

into

point.show()

In the first case we have a general show function and we have to tell it which point to show. In the second we simply call the show method that belongs to the point. The show method works on the point that it belongs to. You can see that this is a very simple change in the way something is written, that is, a change of syntax. However, thinking about things in this way is very useful.
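The two forms can be put side by side in a short Python sketch. As before, "showing" a point is assumed to mean returning a piece of text, since there is no screen to draw on.

```python
class Point:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def show(self) -> str:
        # Method form: works on the point it belongs to.
        return f"showing ({self.x}, {self.y})"


def show(point: Point) -> str:
    # Function form: has to be told which point to show.
    return f"showing ({point.x}, {point.y})"


point = Point(10, 20)

print(show(point))        # general function, point passed in
print(point.show())       # method belonging to the point
print(Point.show(point))  # equivalent spelling of the method call
```

The last line makes the point about syntax explicit: in Python, point.show() is another way of writing a call that passes the point as the first argument.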

This is a very clever idea and it was first thought of in the 1960s by Ole-Johan Dahl and Kristen Nygaard as part of a new computer language called Simula. The language may not have caught on but the idea most certainly did.

If you haven’t recognized it then I’d better tell you that a record with procedural fields or "methods" is called “an object” and the whole idea is called “object-oriented programming”.

You might also recognize the way that the definition of the record and the instance of the record are generalized. We tend to call a record definition that has methods, i.e. fields that are code, a class, and an instance of the type is an object or instance of the class.

So the record, today more commonly called a struct, is the start not only of object-oriented programming but of the particular approach to it based on classes and on extending data types. This is not the only possible approach but it is the dominant one we encounter today. Objects are data that knows how to do things to itself. You may know, or learn later, more sophisticated justifications for object-oriented programming in terms of modeling the real world as hierarchies of objects, but this is a much more basic reason for using objects. If you want a point to show itself - just ask it. If you want a circle to show itself - just ask it and so on.
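A small Python sketch of "just ask it" follows. The Circle class and its radius field are assumptions added for the example, and show again returns text in place of drawing.

```python
class Point:
    def __init__(self, x, y):
        self.x, self.y = x, y

    def show(self):
        return f"point at ({self.x}, {self.y})"


class Circle:
    def __init__(self, x, y, radius):
        self.x, self.y, self.radius = x, y, radius

    def show(self):
        return f"circle at ({self.x}, {self.y}) with radius {self.radius}"


shapes = [Point(10, 20), Circle(0, 0, 5)]

# The loop does not need to know what each object is; it just asks it to show itself.
for shape in shapes:
    print(shape.show())
```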

So the truth is that objects are just records that allow code and data to be stored on an equal footing - I told you data structures were important.
