Now let's look at these ideas when applied to functions.

If you define a function which accepts an object of type A as its input parameter and returns a derived object of type B as its output

i.e. f(A)->B and A>B - where A>B means that B is a class derived from A.

Then, by the substitution principle, or simply because a derived class B has all of the properties of an A, you can use a B as the input, and the B that is returned can be treated as an A. In other words, f(B) is legal.

That is if you have a function:

B MyFunction(A param){ return new B(); }

you can write

A MyA=MyFunction(new B);

That is, if A>B the input parameter can be a B because a B has everything an A does, and by the usual rules the B that the function returns can be treated as an A, again because a B has everything an A has. So on input an A can be replaced by a subclass, and on output a B can be replaced by a superclass.
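As a minimal sketch, here is the same example in runnable Java (Java is used for the runnable examples throughout; the classes A and B are hypothetical stand-ins for any base/derived pair):

```java
// B is derived from A, i.e. A>B in the article's notation.
class A {}
class B extends A {}

class Substitution {
    // f(A) -> B : accepts an A, returns a B.
    static B myFunction(A param) {
        return new B();
    }

    public static void main(String[] args) {
        // A B is-an A, so it is a legal argument...
        // ...and the returned B can be held in an A reference.
        A myA = myFunction(new B());
        System.out.println(myA instanceof B); // prints true
    }
}
```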

So it seems that input parameters are contravariant and output parameters i.e. results are covariant.

To see that this is so we have to shift our viewpoint a little.

Let's think about this from the point of view of the function for a moment. Consider two functions that just differ in the type of their input parameters with A>B:

void MyFunction1(A param){ } void MyFunction2(B param){ }

By the substitution principle MyFunction1 can accept a B as its input and so can be used anywhere you use a MyFunction2.

Hence by the reverse substitution principle you have to regard MyFunction1 as derived from MyFunction2 and

MyFunction1<MyFunction2

That is, changing the input parameter type from A to B, where A>B, results in MyFunction1<MyFunction2.

You can now see clearly why this is contravariance - making the parameter more derived makes the function less derived, so the resulting type ordering goes the other way.
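A sketch of the same argument in Java, using `Consumer` to stand in for a function of one parameter (the classes A and B are hypothetical):

```java
import java.util.function.Consumer;

class A {}
class B extends A {}

class Contravariance {
    // Hands a B to whatever consumer it is given. The parameter is declared
    // contravariantly: any consumer of B, or of a base type of B, will do,
    // because a consumer of the base type can certainly handle a B.
    static boolean feedB(Consumer<? super B> c) {
        c.accept(new B());
        return true; // returns true once the consumer has been called
    }

    public static void main(String[] args) {
        // MyFunction1 takes an A; it can be used anywhere a
        // function taking a B is expected.
        Consumer<A> acceptsA = a -> System.out.println("consumed as an A");
        feedB(acceptsA);
    }
}
```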

Now let's repeat the argument but with two functions that only differ in their output type, where A>B:

A MyFunction1(){ return new A(); }
B MyFunction2(){ return new B(); }

In this case, by the substitution principle, it is clear that MyFunction2 can be used anywhere MyFunction1 can, because it returns a B which can always be treated as an A, and hence MyFunction2 has to be considered as derived from MyFunction1 i.e.

MyFunction1>MyFunction2.

Thus A>B results in MyFunction1>MyFunction2.

You can see that changing the output type from A to B, i.e. going towards more derived, makes the new function more derived, so the change is covariant.

So changing the input parameter type to be more derived makes the function less derived, i.e. a contravariant change, and changing the output type to be more derived makes the function more derived, i.e. a covariant change.
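The covariant half of the argument can be sketched with `Supplier`, the Java stand-in for a function that only returns a value (again, A and B are hypothetical):

```java
import java.util.function.Supplier;

class A {}
class B extends A {}

class Covariance {
    // Asks its supplier for an A. The parameter is declared covariantly:
    // a supplier of B is acceptable, because whatever it produces is an A.
    static A getAnA(Supplier<? extends A> s) {
        return s.get();
    }

    public static void main(String[] args) {
        // MyFunction2 returns a B; it can be used anywhere a
        // function returning an A is expected.
        Supplier<B> makesB = B::new;
        A result = getAnA(makesB);
        System.out.println(result instanceof B); // prints true
    }
}
```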

The general principle

Now that you have looked at the way that a change to a function affects its type, we can generalise the idea of covariance and contravariance to any situation - not just where functions are involved.

Suppose we have two types A and B and we have a modification, or transformation T, that we can make to both of them to give new types T(A) and T(B).

If T is a covariant transformation we have A>B implies T(A)>T(B).

If T is a contravariant transformation then we have A>B implies T(A)<T(B).

It is also possible that neither relationship applies, that is, A>B doesn't imply anything about the relationship between T(A) and T(B). In this case T is referred to as invariant - which isn't really a good name.
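Java's generics give a concrete example of an invariant transformation: T(A)=List&lt;A&gt; tells you nothing about the relationship between List&lt;A&gt; and List&lt;B&gt;, and variance has to be requested explicitly at the use site. A sketch (A and B hypothetical):

```java
import java.util.List;

class A {}
class B extends A {}

class Invariance {
    // Reading through an explicitly covariant view is safe.
    static A firstAsA(List<? extends A> items) {
        return items.get(0);
    }

    public static void main(String[] args) {
        List<B> bs = List.of(new B());
        // List<A> as = bs;      // does not compile: generics are invariant
        A first = firstAsA(bs);  // fine: List<? extends A> accepts a List<B>
        System.out.println(first instanceof B); // prints true
    }
}
```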

Our earlier function example can be recast into this form by inventing the transformation T that converts a type into a function with a parameter of that type. As already worked out T is clearly contravariant.

That is, T(A)=function(A) and T(B) = function(B), where function(A) is any function returning any type and accepting a parameter of type A and, as we have demonstrated in the previous section, A>B implies:

T(A)=function(A)<function(B)=T(B)

In the same way a transformation that converts a type into a function that returns the type is covariant.

That is, T(A)=A function() and T(B)=B function(), where A function() is any function, with any parameter types, that returns a type A, and B function() is any function, with any parameter types, that returns a type B. Then A>B implies:

T(A)=A function()>B function()=T(B)

This is a completely general idea.

For example, consider the transformation that converts a type into an array of that type, i.e. T(A) is an array of type A, or A[]. If A>B an array of B can be substituted for an array of A and so A[]>B[]. Hence forming an array is covariant.

What's the point?

Now that you understand the idea of covariance and contravariance you might be wondering what the point is.

The answer is that it is all a matter of when a language, or a language implementation, should, or could, allow automatic type conversion.

As array construction is covariant, if T(B) constructs an array of type B, it is fairly safe to treat an array of B as being an array of A without needing an explicit type conversion.

In many languages you can indeed write:

string[] a = new string[10];
object[] y = a;

without the need for a cast and this is because object>string implies object[]>string[].

For a more involved example consider the C# delegate.

A delegate is a type that can wrap a function with a specific signature. You construct a delegate instance from a function with that signature, and the result is an object that wraps the function and can be called in its place.

Consider a delegate

delegate void MyDelegate(B param);

and two functions which only differ in the type of their input parameter. Then A>B implies, by contravariance:

functionA(A param)<functionB(B param);

Hence functionA can be used anywhere functionB can, and so it should be safe to use MyDelegate to wrap a function whose parameter is less derived than the delegate's signature specifies.

By a similar argument a delegate:

delegate A MyDelegate();

which wraps a function which returns an A should be quite safe wrapping a function that returns a B, as A>B implies by covariance that A function()>B function(), and so you can use a B function anywhere you can use an A function. Thus delegates are covariant in the return type of the function they wrap and so can safely wrap a function that returns a more derived result.
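Java has no delegates, but its functional interfaces behave the same way when a method reference is wrapped, so both directions of variance can be sketched at once (A, B and functionA are hypothetical):

```java
import java.util.function.Function;

class A {}
class B extends A {}

class DelegateVariance {
    // Plays the role of functionA: takes the less derived A,
    // returns the more derived B.
    static B functionA(A param) {
        return new B();
    }

    public static void main(String[] args) {
        // The wrapper expects a function from B to A. functionA qualifies:
        // its parameter is contravariant (A is less derived than B) and its
        // result is covariant (B is more derived than A), so the method
        // reference is accepted without any cast.
        Function<B, A> wrapped = DelegateVariance::functionA;
        A result = wrapped.apply(new B());
        System.out.println(result instanceof B); // prints true
    }
}
```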

Is it worth it?

Is it worth introducing what appear to be complex ideas like covariance and contravariance?

In practical work, probably not, as the whole thing is usually simple enough to work out from first principles.

For example, can an array of strings be used in place of an array of objects?

Well, yes, obviously, as a string is just an object with additional properties and methods, so it can always be treated as an object.

So if you want to just think about type conversion and substitution rules from first principles that's fine. If you want to dress the idea up in the terms covariance and contravariance that's fine too.

However, the academic terms do have the disadvantage that it's easy to miss practical concerns.

Languages have to be pragmatic and this means they don't always obey the substitution principle.

For example, a double can generally be used where an integer can be used - e.g. you can write 1.0 in place of 1 - but in practice it is rare for a language to define a double as a derived type of int.

For another example, consider the array of string types. If this is cast to an array of object types then this works for all read access, but write access often fails. For example, try:

object[] y = a;
y[0] = new object();

in most languages this will fail because the underlying type of the object element is string and not object but by the substitution principle it should work. In this case it is a result of inheritance not being implemented fully, i.e. an array of strings isn't really an array of objects with some additional properties.
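Java behaves exactly this way: the assignment of a String[] to an Object[] compiles, but a write of anything other than a String is rejected at run time with an ArrayStoreException. A sketch:

```java
class ArrayWriteHazard {
    // Returns true if writing a non-String through the covariant
    // view is rejected at run time.
    static boolean writeFails() {
        String[] strings = new String[1];
        Object[] objects = strings;          // legal: arrays are covariant
        try {
            objects[0] = Integer.valueOf(1); // but the array is really a String[]
            return false;
        } catch (ArrayStoreException e) {
            return true;                     // the write is rejected at run time
        }
    }

    public static void main(String[] args) {
        System.out.println(writeFails()); // prints true
    }
}
```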

In practice programming is more complicated and messy than pure theory allows.

Look out for a follow-on article that explains covariance and contravariance in C# in more detail.
