The Working Programmer's Guide To Variables - Scope, Lifetime And More
Written by Harry Fairhead
Tuesday, 24 June 2014
Many programmers are confused by the range of variations that there are on the humble variable - mainly because the idea is so basic that we just "pick it up" as we go along. This explanation doesn't cover all of the possibilities but enough of them for you to understand the rest.
When you first start programming a variable is introduced as something that you store data in until you need it again.
Later on a variable becomes something much more complicated with static, dynamic, local and global modifiers that can be used as part of the declaration - not to mention the complications of type.
Thinking about how variables can be implemented leads on to a more general topic of names and how they are bound to the things that they represent. This is an important topic that is central to object oriented programming.
In this introduction to slightly more advanced variables, the topic is restricted to the idea of when a name is valid and what exactly it means. This topic is often referred to as "scope", but the terminology used to describe variables and how they behave isn't 100% standard, so it's all the more important to understand the ideas.
The variable makes us different
The basic idea of a variable is one of the great ideas of computing, and it is what distinguishes computing from mathematics.
In computing a variable is a name bound to an area of storage.
The principal means of manipulating a variable are the assignment statement and the expression.
For example, in many languages you would write an assignment as
X=X+Y
and it immediately becomes clear why variables and assignment statements are not like mathematical variables and equations.
If you write X=X+Y in math then the only conclusion you can draw is that Y must be zero!
In math there is no assignment (unless you study one of those areas where it is introduced specifically to explore the sorts of things that go on in programming).
The difference is sufficient to encourage languages such as Pascal, Algol and Ada to introduce a special assignment operator that doesn't look like the mathematical equals sign. For example,
Total := Total + Sum
is an attempt to make it clear that this is an instruction to store the result of adding the "contents" of Total and Sum back into Total.
Notice that the real difference between math and computing is that the assignment statement has time built into its interpretation: "the new value of Total is the old value plus Sum".
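As a minimal sketch of this "before and after" reading, the following Python fragment keeps a copy of the old value so that both moments in time can be seen side by side. The names total, sum_ and old_total are hypothetical stand-ins for the Total and Sum used above.

```python
# Hypothetical names standing in for the article's Total and Sum.
total = 10
sum_ = 5

old_total = total      # the value before the assignment runs
total = total + sum_   # the new value is the old value plus sum_

print(old_total, total)  # prints: 10 15
```

Read as a mathematical equation the second assignment would be nonsense; read as an instruction that happens at a point in time it is perfectly ordinary.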
If you find this obvious, recall that functional programming is an attempt to remove, or at least lessen, the split between math and programming. In functional programming, assignment is a one-time operation - you can't reassign to the same variable, and instructions like a=a+1 are illegal. Yes, it can be done - but many think it's artificial and simply hides the fact that programming is different.
This description of a variable and assignment is more or less what every programmer ends up understanding in a practical sense.
It is important to think of a variable as a name that is bound to an entity. A variable is not the same thing as its value as its value can change. A variable is a more abstract idea that is best understood as a name that labels or is bound to some storage.
If you work with variables you can't help but think about them as named units of storage even if you don't know how the underlying hardware works with them. However, as you carry on learning and programming you slowly start to realise that variables are slightly more complex.
For example, if you use a name in more than one unit of code, be it called a module, function, subroutine or whatever, is it the same variable? And if you leave a module, function or subroutine, do all of its variables carry on existing, and will they have the same values when you return?
These may sound like questions on an exam paper in existential philosophy but an understanding of how variables behave is vital to designing programs that work.
The most important characteristic of a variable is the scope of its name.
Scope refers to the section of the program in which a variable name can be relied upon to label the same unit of storage.
In more technical language, the scope is the part of the program where the name is bound to the same entity.
There is also the subtle point of what we mean by "part of the program". The simplest definition is some unit of code that is marked out in the program's text - i.e. a block of text, a function or a module, say. This corresponds to the most common definition of scope - lexical scope.
For example, if you use a variable called A in one function and a variable called A in another function, it is clear that they share the same name, but do they share the same value?
If the scope of the name A includes both functions then they do share the same values. If the scope of each name is restricted to the individual functions then they are different variables that just happen to have the same name.
Again more technically you could say that the names are the same but the bindings are different.
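Both cases can be sketched in Python, assuming its usual rule that a name assigned inside a function is local to that function unless declared otherwise. The function names here are hypothetical and chosen only for illustration.

```python
A = 10  # module-level name: its scope includes every function below

def read_shared():
    return A        # refers to the module-level A

def also_read_shared():
    return A + 1    # the same binding, so the same value is seen

def own_binding():
    A = "local"     # assigning creates a new A bound only inside this function
    return A

print(read_shared(), also_read_shared(), own_binding(), A)  # prints: 10 11 local 10
```

The first two functions share one binding of A; the third has a binding of its own, which is why the module-level A is still 10 at the end.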
One of the problems at this point in the discussion is that some programmers will never have encountered the idea of scope because many languages try and make things as simple as possible - so simple that the idea never occurs beyond the basics.
For example, in most scripting languages the natural way to work is to have every variable that you define available for use throughout the program. Such variables are said to be "global", and the scope of a global variable is the entire program. In this case scope is an almost trivial concept.
Globals and locals
Programming using nothing but global variables is fine until you start writing large programs and using functions or other types of module.
Once you break a large program down into functions or methods, the payoff is that you can write each as if it were a completely separate program.
Well, you can as long as the variables that you use aren't global.
If you have to worry about what names you can use for a variable while writing a function, because the name might have already been used in another function, then it is as bad as writing the program in one chunk.
For example, most programmers tend to use i and j for loop indices - to understand why, you need to go back to the days of Fortran. If every variable is global then it's only a matter of time before an i or a j used in one function collides with its use in another function. This sort of name collision is the reason we don't like global variables.
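A minimal sketch of such a collision, written in Python with a deliberately global loop index. The functions count_items and process_rows are hypothetical; the global statements are there only to force both functions to share one i.

```python
i = 0  # a single global loop index shared by every function (the problem case)

def count_items(items):
    global i
    i = 0
    while i < len(items):
        i += 1
    return i

def process_rows(rows):
    global i
    i = 0
    processed = 0
    while i < len(rows):
        count_items(rows[i])  # silently overwrites the global i
        processed += 1
        i += 1
    return processed

rows = [[1, 2, 3], [4], [5, 6]]
print(process_rows(rows))  # prints: 1 (only one of the three rows was processed)
```

Each function looks correct when read on its own; the bug only appears when they are used together, which is exactly what makes this kind of collision hard to find.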
The solution is to use local variables.
Local variables are confined in scope to the function or method that declares them. For example, if you define Local A in one function and Local A in another function then the A in the first function has nothing to do with the A in the second function - they are different variables that just happen to share the same name.
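In Python a name assigned inside a function is local by default, so the idea can be sketched without any special keyword. The functions count_items and process_rows are hypothetical; the point is that each has an i of its own.

```python
def count_items(items):
    i = 0                  # local to count_items
    while i < len(items):
        i += 1
    return i

def process_rows(rows):
    counts = []
    i = 0                  # a different i, local to process_rows
    while i < len(rows):
        counts.append(count_items(rows[i]))  # cannot disturb this function's i
        i += 1
    return counts

print(process_rows([[1, 2, 3], [4], [5, 6]]))  # prints: [3, 1, 2]
```

Calling count_items from inside the loop is safe because the two uses of i are separate bindings that merely share a name.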
If you are building a modular program then all of the variables within each module should be Local. How then do the modules communicate?
The answer is only by passing data using parameters.
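As a rough illustration, here are two hypothetical Python functions that share no variables at all. Data goes in through a parameter and comes back out as a return value.

```python
def average(values):
    total = 0              # local to average
    for v in values:
        total += v
    return total / len(values)

def report(readings):
    mean = average(readings)   # data passed in as a parameter, result returned
    return f"mean={mean:.1f}"

print(report([2, 4, 9]))  # prints: mean=5.0
```

Neither function needs to know which names the other uses internally, so each can be written and tested on its own.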
Parameters are another story and to avoid getting side tracked I will assume that everyone knows what a parameter is and how they work. Parameters have a lot of devious traps waiting to get the careless programmer, but for the purpose of talking about variables these will also be ignored.
So what are global variables for if every variable in a module should be local to that module and if data is to be passed using parameters?
One answer is that global variables aren't for anything - they are just a leftover from the days when we didn't know better.
An alternative answer is that passing every item of data that a module needs is boring, error prone and results in procedure calls the length of War and Peace.
Rather than pass every variable to a module, it makes good sense to define the major items of data as global and allow every procedure to access them using the same name. After all, if you have an array called TOTALS it is usually called TOTALS in all of the procedures that use it, even if it is passed in as a parameter! In this sense the array is being treated as if it were a global variable even though it is local to each procedure.
The difficulty with this approach is how to draw the dividing line between what should be global and what should be local. At its extreme it results in everything that might be passed as a parameter being declared as global, and every procedure call consisting of just the name of the procedure!
In other words, using local variables results in long parameter lists and using global variables results in no parameter lists.
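The two styles can be put side by side in a small Python sketch. TOTALS is the array name used above; the functions add_sale_global and add_sale_param, and the idea of indexing by region, are assumptions made purely for illustration.

```python
TOTALS = [0, 0, 0]  # global version: every function can reach it by name

def add_sale_global(region, amount):
    TOTALS[region] += amount      # depends on a name defined outside the function

def add_sale_param(totals, region, amount):
    totals[region] += amount      # works on whatever list the caller passes in
    return totals

add_sale_global(1, 50)
local_totals = add_sale_param([0, 0, 0], 1, 50)

print(TOTALS, local_totals)  # prints: [0, 50, 0] [0, 50, 0]
```

The global version has the shorter call, but it only works in a program that happens to define TOTALS. The parameter version has one more argument and can be reused with any list.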
Both approaches have their problems but today the emphasis falls on the need to decouple the modules that make up a program and hence local is always better than global.
In short - avoid global variables.
Last Updated (Wednesday, 25 June 2014)