Demystifying Pivot Tables
Written by Janet Swift   
Friday, 20 August 2010
Article Index
Demystifying Pivot Tables
Pivot tables
Using the PivotGrid control
Running the program

Pivot tables are often considered to be complicated and difficult to use - but they correspond to a simple idea and a simple way of working with data. Find out what a pivot table is and how to implement one in WPF.



Pivot tables are really useful tools for presenting and exploring data but they have an unfair reputation for being complicated. Indeed many users hear the term "pivot table" but never quite manage to find out what it is all about.

The idea, however, is very simple and you probably already have lots of data that are ideal for a pivot table. Let's start with a brief explanation of exactly what the key ideas are.

Almost any data that you collect in a spreadsheet or a database is categorised, usually on multiple categories. For example, you might start to keep a spreadsheet of profits made by three stores A, B and C. Then you might well record the data using something like:


Nothing special so far but now suppose that the Profit figures correspond to different types of sales - mobile phones, mp3 players and videos. How would you now record the data? One way is to simply add another column and record the type of each profit:



If you think about it for a moment it should be easy to see that an alternative way of entering the same data is to use a column for Store and a Row for Type:



This form of recording the data is often referred to as a cross tabulation or a cross tab for short. The first form is so standard that it hardly has a name but when it is necessary to make a distinction between it and a crosstab it is usually called a one-dimensional form or a flat form. You can see that the cross tab has two dimensions - Store down the column and Type across the rows.

So far so good. Now what happens if we add another category?

Suppose the stores are in either the north or the south of the sales territory. The flat version of the data is easy enough to generalise - simply add another column:


What about the cross tab or multi-dimensional form? The key is in the "multi-dimensional". In this case we have three categories - Store, Type of profit and Region. You can use these to create a cube with each cell labeled by one category from each of the "dimensions". Of course you can't easily construct a spreadsheet cube and even if you could adding another set of categories would force you to need a four dimensional cube and so on. Even though you can't imagine a 4-D cube, a hypercube, the principles are the same.

If you have n categorical measurements, i.e n categorical dimensions, each data value can be imagined to be in a cell labeled by one category value from each of the dimensions. For example, the top row of the spreadsheet:



can be interpreted as saying that the data value or measure Profit=500 should be in a cell in a 3D cube labeled by Store=A, Type=Mobile and Region=N.

This sort of interpretation works for any number of dimensions without having to imagine n-dimensional hypercubes.






Last Updated ( Friday, 20 August 2010 )