Data Typing Is A Relic
Written by Ian Elliot
Friday, 08 February 2013
Most modern languages that are thought to be "respectable" are examples of the same approach - strongly typed class-based languages. This could be the single biggest mistake in the history of programming.
Strong data typing is generally thought to be not just a good thing but probably the best way to program. Data typing and object oriented programming seem to go together naturally and they reinforce each other in both practical and theoretical ways.
These ideas have been at the core of modern programming since objects were introduced to C to create C++ and on to languages like Java and C#.
However, we may just have grown too accustomed to a single way of doing things.
The class-based strong typing mono-culture of Java, C++, C# and so on might just be an aberration.
The whole issue of data typing started very early on in the history of computing. Primitive data types were forced on us by the hardware. Most languages implemented a range of primitive data types and we extended this idea to class-based hierarchies without really evaluating the alternatives. As a result something primitive has been elevated to the status of high theory and this makes it difficult to challenge.
Even if in the end you don't agree with my argument, you should at least think about it carefully.
First we look at the reason for primitive data typing.
Let's consider for a moment the historical roots of data typing and see how it evolved into the more sophisticated idea of the strongly typed class hierarchy.
Back in the dark ages of assembler and Fortran, programmers lived nearer to the bits. If you wanted to store something in memory you needed to care how it was stored. You needed to know about fixed point, floating point, signed integers, characters and perhaps even strings - although strings were a little too sophisticated for the time. This naturally resulted in programmers needing to know about data types and using operators that were specific to particular types.
A little later programming languages such as Basic appeared that attempted to make programming seem easy and natural. Have you ever been confronted by a beginner with the task of explaining what is wrong with something like:

total = Text1.Text + 10

where the Text property is a string, say. The beginner has a lot of trouble trying to see the difference between "123" and 123.
They both "look" like numbers but one is a string and one is an integer. The difference between 123 and 123.0 is another one that causes problems in the same way but it doesn't raise its head in practical situations quite as often.
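The distinction is easy to demonstrate in a dynamically typed language - here a short Python sketch, used purely for illustration:

```python
# The three values "look" the same to a beginner but have different types.
print(type("123").__name__)   # str
print(type(123).__name__)     # int
print(type(123.0).__name__)   # float

# The string and the integer are not equal...
print("123" == 123)           # False
# ...but the integer and the float compare equal, which is why
# that difference causes trouble less often in practice.
print(123 == 123.0)           # True
```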
If you try to learn a language like C, which is still close to the bits, then the situation is a lot worse, with char, short, int and so on - and even high-level languages like Java, which are supposed to have grown up and moved away from the hardware, still have similar data types.
We need to think about this from an "ideal world" point of view for a moment.
The reason we have primitive data types is that they make life simpler for the language implementer. As programmers we don't really want to get involved in the detailed representation of data. Any such considerations are about efficiency and not about design - when efficiency is a consideration use a lower level language like C.
For general programming what we want is the highest level possible language that abstracts away from the primitive hardware. Such a language should just work with the data as appropriate. A truly advanced language would just let the programmer write something like

total = 123

and it would automatically perform whatever tricks were needed to store the data. Variables would not be typed and type-specific operators would simply force conversions as required - and all under the covers.
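Python's integers are an existing example of a language doing the storage tricks for you - the programmer just writes the assignment and the representation is chosen behind the scenes (a sketch of the idea, not a claim that Python goes all the way):

```python
# No type declarations - just write the assignment.
a = 123                                # fits easily in a machine word
a = 123456789012345678901234567890    # silently switches to arbitrary precision
print(a + 1)                           # 123456789012345678901234567891

# Rebinding to a different kind of value is equally invisible.
a = "now a string"
print(a.upper())                       # NOW A STRING
```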
In a modern language the way that data is handled should depend on the operations applied to it and not the assumption of a particular representation.
Class Type Hierarchies
This is where things get complicated, and it is where it is most difficult to evaluate what we do with an unbiased view.
If you start out with the idea that variables are typed, then when you move to object oriented programs it becomes clear that each different sort of object is a data type. This is made even clearer by the way classes are used to create objects in the same way that primitive type names are used to create primitive types. Yes, we say that everything is an object, but we know it isn't so.
When we write

myClass myObject = new myClass();

it is made clear that myClass is a type just like int.
Class is the raw material of the type system of most modern languages.
The next logical step is to look at the way classes relate to one another.
If we use inheritance to create new derived classes then clearly there is a simple relationship between the base and derived class. The derived class has, by inheritance, everything that the base class has, so in terms of the type system the derived class is a sub-type. The rule, formalized as the Liskov substitution principle, is that any derived class can be used in place of a base class - because a derived class is also an example of the base class, possibly plus some new properties.
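The substitution rule can be sketched with a pair of hypothetical classes (Animal and Dog are my invention, and Python is used purely for illustration):

```python
class Animal:                      # base class
    def describe(self):
        return "an animal"

class Dog(Animal):                 # derived class: everything Animal has, plus more
    def fetch(self):
        return "fetching"

def show(creature: Animal):        # written against the base type...
    return creature.describe()

# ...but, by the substitution principle, a Dog works
# anywhere an Animal is expected.
print(show(Animal()))   # an animal
print(show(Dog()))      # an animal
```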
What all this leads to is the class-based type hierarchy that we have all grown to accept as the right and perhaps only way to do things.
The class hierarchy allows us to use strong typing to make sure that we aren't trying to store a string in an int, or a base class instance in a derived class variable.
Great! We can catch type errors at compile time.
Of course without strong typing there would be no pure type errors.
The issue isn't type but applying the appropriate operations to the data. When you write "123" * "2" you are clearly asking for a multiplication and the data should be treated as numeric. If the data can be converted to numeric then this should be done automatically. However, a strongly typed language will throw a type error at this point even though all that is missing is a conversion operator.
If the data doesn't have the form of a number, e.g. "ABC" * "D", then you have a real "type" error that can't be solved by a conversion or a cast. Strong typing catches this sort of error as well as the previous one, but only this one is a real error.
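The distinction can be sketched in Python terms (my example, not the article's own code): the first expression fails only for want of a conversion, while the second fails because no conversion exists:

```python
# "123" * "2" is rejected by a strongly typed language, yet an
# explicit conversion rescues it - nothing was really wrong.
print(int("123") * int("2"))    # 246

# "ABC" * "D" cannot be rescued by any conversion - a genuine error.
try:
    int("ABC") * int("D")
except ValueError as err:
    print("real type error:", err)
```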
Last Updated ( Saturday, 09 February 2013 )