Deep C# - Value And Reference
Written by Mike James   
Thursday, 27 August 2015

Thinking About References

The picture to keep in mind is that a value type stores its value, whereas a reference type stores a “pointer” to its value.

For example,

int a;

This declares and creates an integer variable which, in common with all value types, isn't initialized to a sensible value, but for the sake of a simple explanation let's assume it is set to zero. C# enforces the rule that you can’t make use of an uninitialized value type, but nevertheless the integer variable exists and is ready to store something.

fig1

However if you declare a reference type, e.g. a class:

class Point
{
 public int x,y;
}

you can then create a reference variable of the same type:

Point b;

This declares a reference variable b which can reference an object of type Point, but at the moment no such object exists and the reference is set to its default value, null.

fig2

 

This way of thinking has a nice tidy symmetry, even if it is spoiled by C#'s insistence on not letting you access an unassigned variable, which is very reasonable.

To create a Point object we need the additional step:

b=new Point();

Now we have a Point object created on the heap and b is set to reference or point at it. 

fig3

Notice that the reference variable b is just like the value variable a in that they are both stored on the stack and both store immediate values - the difference is that a’s value is the data and b’s value is a reference to the data.

Of course we often combine these two steps together to create the familiar idiom:

Point b=new Point();

To the beginner this often seems redundant because of the way it uses “Point” twice.

The first use of Point declares a reference to a point object, i.e. b, and the “new Point” part actually creates the point object. It doesn’t take long for this to seem so familiar that you don’t give it a second thought.

Another important difference is that an object can correspond to multiple reference variables. For example:

Point b=new Point();
Point c = b;

This creates a single Point object but two reference variables both of which “point” at the same object.

fig4
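The effect of two variables sharing one object can be sketched as follows. This is a minimal illustration rather than anything from the original listing: the class AliasDemo and its method name are made up for the example, and Point is the class defined earlier.

```csharp
using System;

class Point
{
    public int x, y;
}

// Hypothetical demo class, not part of the article's own code.
static class AliasDemo
{
    // Writes through one reference and reads through the other.
    public static int WriteThroughSecondReference()
    {
        Point b = new Point();   // one object created on the heap
        Point c = b;             // copies the reference, not the object
        c.x = 42;                // write through c...
        return b.x;              // ...and read the same field through b
    }

    static void Main()
    {
        Console.WriteLine(WriteThroughSecondReference());  // prints 42

        Point b = new Point();
        Point c = b;
        Console.WriteLine(object.ReferenceEquals(b, c));   // prints True
    }
}
```

Because c holds a copy of the reference and not a copy of the object, there is only ever one x to change.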

Lifetimes

It is often said that an important difference between value and reference types is their life cycle.

In fact both types of variable have exactly the same behaviour with respect to when they are created and destroyed.

That is, a value or a reference variable is destroyed as soon as it is clearly no longer accessible, i.e. when it goes out of scope.

This means, for example, that a variable defined in a method is destroyed as soon as the method terminates. It is this behaviour that makes local variables truly local to the method or block that they were declared in. Notice that there can be exceptions to this rule, such as static variables, which aren’t destroyed until the application terminates. However, it is true to say that the vast majority of variables do behave in this way.

What is different between value and reference types is what happens to their data when the variable is destroyed.

In the case of a value type variable the variable and its data are one and the same and so when a value type variable is destroyed so is its data.

However, a reference type variable only contains a reference to its data, and while the variable and the reference it contains are destroyed, the object that it references isn’t.

This is the source of the statement that value and reference variables have different lifetimes - they don’t, but the data associated with them can.
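A small sketch may make the distinction concrete. It assumes the Point class from earlier; the class LifetimeDemo and the helper MakePoint are hypothetical names introduced only for this illustration.

```csharp
using System;

class Point
{
    public int x, y;
}

// Hypothetical demo class, not part of the article's own code.
static class LifetimeDemo
{
    // The variable 'local' exists only for the duration of this call.
    public static Point MakePoint()
    {
        Point local = new Point();
        local.x = 7;
        return local;   // the reference is copied out to the caller
    }

    static void Main()
    {
        Point kept = MakePoint();
        // 'local' no longer exists, but the object it referred to
        // is still alive because 'kept' now references it.
        Console.WriteLine(kept.x);   // prints 7
    }
}
```

The variable local is gone as soon as MakePoint returns, yet the Point object it referred to carries on for as long as something still references it.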

Obviously we can’t leave unwanted objects on the heap forever and this is where the system garbage collector comes in. This is a service that periodically scans the heap looking for objects that no longer have any references to them.

An object with no references to it is clearly no longer required and using this fact the garbage collector eventually gets round to clearing up the heap.

Notice that this difference in lifetime is entirely to do with the way that things are stored. Local value and reference variables are stored on the stack, which is naturally self-managing in the sense that when a method returns all of its local variables are destroyed by the adjustment of the stack pointer. Anything stored on the heap has no such natural cleaning process, so a garbage collection system is needed to determine when objects are no longer required and when they should be removed.

How and when to tidy the heap is entirely a matter of efficiency - garbage collect too often and you use up processor power when there is plenty of heap waiting to be used. Garbage collect too little and you risk bringing the application to a halt while the garbage collector has to work overtime freeing up memory by deleting objects and consolidating free space.

Structs and classes

Although value types are often introduced as “simple” types such as int or float, all value types are really just examples of the struct.

The simple value types are structs, but to make sure that your program runs efficiently they are also treated specially so as to avoid the overheads that a genuine struct brings with it.

The fact that an int is a struct really only has an impact on your programs because this means that int inherits from object a set of simple standard methods.

For example, it is perfectly ok to write:

int a=0;
string b=a.ToString();

In fact int is just an alias for the System.Int32 struct. You could write

System.Int32 a;

in place of int a but it is usual not to. We will return to the issue of simple data types as objects later in this chapter because there is a little more to it.
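If you want to check the alias for yourself, something like the following should do it. This is only a sketch: the class AliasForInt32 and its method names are invented for the example, while typeof, Type.IsValueType and ToString are standard.

```csharp
using System;

// Hypothetical demo class, not part of the article's own code.
static class AliasForInt32
{
    // int and System.Int32 are the same type, so typeof agrees.
    public static bool IntIsInt32() => typeof(int) == typeof(System.Int32);

    // The type behind int is a value type, i.e. a struct.
    public static bool IntIsValueType() => typeof(int).IsValueType;

    static void Main()
    {
        int a = 42;
        System.Int32 b = a;                    // no conversion needed
        Console.WriteLine(IntIsInt32());       // prints True
        Console.WriteLine(IntIsValueType());   // prints True
        Console.WriteLine(b.ToString());       // prints 42
    }
}
```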

So it is reasonable to say that the most important division in the C# type system is the split into classes and structs (both descended from object). And the really big difference between the two is that a class is a reference type whereas a struct is a value type.

fig5

 

Structs Are From Value And Classes Are From Reference

In many cases you have the choice of implementing something as either a class or a struct. For example consider a simple type designed to store the x,y coordinates of a point. You can do this as a class:

class PointR
{
 public int x,y;
}

or as a struct:

struct PointV
{
 public int x,y;
}

Notice that the class is named with a trailing R for Reference and the struct with a trailing V for value.

As already stated the most important difference is due to the fact that a struct is a value type and a class is a reference type.

That is, the class behaves as described earlier for general reference types and the struct behaves like a general value type. It is worth looking at this more closely, however, because a struct and a class look much more alike than, say, an int and a class, and so mistakes and misunderstandings are easier to make.

The most immediate impact of this difference is that you don’t have to use “new” when creating an instance of a struct. That is, you can create an instance of PointV using:

PointV a;

and this immediately creates a PointV object which you can use:

a.x=10;

The similar class however needs “new” to create an instance and:

PointR b;

only creates a reference variable. To make use of a PointR object you also have to use:

b=new PointR();
b.x=20;

To make the difference even clearer you can create other references to the same PointR object as in:

PointR c;
c=b;

Now c and b refer to the same PointR object and the same x value is changed by c.x=30 or b.x=30. In the case of a struct, and a PointV in particular, you cannot create multiple references to it and assignment creates a copy of the struct.

That is,

PointV d;
d=a;

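makes an independent copy of the struct a. Now d.x and a.x are different variables, and assigning to one has no effect on the other. Note that the compiler only allows this copy once every field of a, y as well as x, has been assigned.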
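Putting the two cases side by side gives a sketch along the following lines. PointR and PointV are the types defined above; the class CopyDemo and its method names are hypothetical and exist only for this illustration.

```csharp
using System;

class PointR { public int x, y; }
struct PointV { public int x, y; }

// Hypothetical demo class, not part of the article's own code.
static class CopyDemo
{
    // Class: assignment copies the reference, so both names share one x.
    public static int ClassAssignmentShares()
    {
        PointR b = new PointR();
        b.x = 20;
        PointR c = b;
        c.x = 30;
        return b.x;
    }

    // Struct: assignment copies the data, so each variable has its own x.
    public static int StructAssignmentCopies()
    {
        PointV a;
        a.x = 10;
        a.y = 0;        // all fields must be assigned before a is used as a whole
        PointV d = a;   // independent copy
        d.x = 30;
        return a.x;
    }

    static void Main()
    {
        Console.WriteLine(ClassAssignmentShares());    // prints 30
        Console.WriteLine(StructAssignmentCopies());   // prints 10
    }
}
```

The only difference between the two methods is the kind of type involved, but the class version ends up with a single shared x and the struct version with two separate ones.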

 




Last Updated ( Wednesday, 18 November 2015 )