Kinect provides two levels of processed data - a skeleton map which gives you the position of the player's limbs and a user index which can be used to discover the player's overall position. In this chapter of our ebook on using the Kinect SDK for Windows we take a close look at the player index data.
Practical Windows Kinect in C#
- Introduction to Kinect
- Getting started with Microsoft Kinect SDK 1
- Using the Depth Sensor
- The Player Index
- Depth and Video Space
- The Full Skeleton
- A 3D Point Cloud
In Chapter 3 we looked at how to work with the raw depth data. This is often useful when you are trying to do something new, but when you just need to know the position of a player in the scene it is easier to let the Kinect take the strain and use its skeleton tracking.
Kinect has two types of raw output - standard color video and the depth field which tells you how far away each pixel is.
For the most part it is the depth field that is the exciting part of the Kinect. As well as the raw data, the Kinect makes available two processed data streams derived from the depth data. The first is the user index and the second is the skeletonization data.
Although the skeleton data is perhaps the most impressive, the user index is often more useful. While the skeleton data will give you the position of various parts of the body, the user index gives you, after a little processing, the area of the view that each user is currently occupying. This can be used as a general location finder or it can be used as a mask to process the video image.
The Kinect is very clever and processes the depth data to locate objects that look like human bodies. It does this not so much by recognizing a whole human body but by recognizing limbs and placing the joints to build up a skeleton.
As already explained this skeleton provides a lot of information but the Kinect also makes use of it to detect the whole user and it labels each pixel with a number from 1 to 7 to indicate that it is part of user 1, 2, and so on up to 7. If a pixel isn't part of the image of a user then it is assigned index zero.
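To make the idea of a per-pixel label concrete, here is a minimal sketch of the "little processing" needed to turn the labels into the area a user occupies. It does not use the Kinect SDK at all: it assumes the player indexes have already been extracted into a byte array, one entry per pixel, stored row by row. The class name PlayerBounds, the helper Find and the tiny sample frame are all made up for illustration.

```csharp
using System;

static class PlayerBounds
{
    // Hypothetical helper: returns { minX, minY, maxX, maxY } for the pixels
    // labelled with the given player index, or null if that player is absent.
    public static int[] Find(byte[] playerIndex, int width, int player)
    {
        int minX = int.MaxValue, minY = int.MaxValue, maxX = -1, maxY = -1;
        for (int i = 0; i < playerIndex.Length; i++)
        {
            if (playerIndex[i] != player) continue;   // background or another user
            int x = i % width;                        // column of this pixel
            int y = i / width;                        // row of this pixel
            minX = Math.Min(minX, x); maxX = Math.Max(maxX, x);
            minY = Math.Min(minY, y); maxY = Math.Max(maxY, y);
        }
        return maxX < 0 ? null : new[] { minX, minY, maxX, maxY };
    }

    static void Main()
    {
        // Made-up 4x3 frame: 0 = no user, 1 and 2 = two different users.
        byte[] frame =
        {
            0, 1, 1, 0,
            0, 1, 0, 2,
            0, 0, 0, 2
        };
        int[] box = Find(frame, 4, 1);
        Console.WriteLine(string.Join(",", box));     // prints 1,0,2,1
    }
}
```

A bounding box is the simplest possible "location finder"; the same loop could instead copy or blank out pixels to use the index as a mask.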
So far the theory is interesting and you should be able to see what you can use the user index data for so let's get started. As with most Kinect tasks most of the effort goes into moving the data about and getting it into the correct format to be usable.
Make sure to read at least chapter two before you continue to find out how to get started. This chapter also builds on the ideas in chapter three.
To work with the user index data you need to set the Kinect up to perform the skeletonization procedure as well as returning raw depth information. In this first example we will use a Windows Forms application, but the start of the program is the same for a WPF application.
As always you need to add a reference to
to the start of the code.
The user index is derived from the skeleton data so you can't have it without turning on the skeleton data. The program starts off in the usual way with the creation of a KinectSensor object:
sensor = KinectSensor.KinectSensors
Next you have to enable the depth stream in the usual way and also the Skeleton stream, as we want the player index data - even though you don't actually want to use the Skeleton stream data.
As we only want the depth data stream, we only attach an event handler to the DepthFrameReady event:
sensor.DepthFrameReady += DepthFrameReady;
and start the data streams running:
You can also turn on the video camera but it isn't going to be used in this simple example.
So far everything has been the same as working with the basic depth data as described in the previous chapter apart from enabling the Skeleton stream.
We now need to define an event handler that will be called when a frame of depth data is ready:
void DepthFrameReady(object sender,
From this point we can access the depth data as a short array as in the previous chapter:
if (imageFrame != null)
short[] pixelData =
Now we have the data in pixelData and are ready to process it. The only difference is that now the low-order three bits of each 16-bit value are no longer set to zero but are set to the player index.
So now each element of the array carries two pieces of data - the depth stored in the topmost 13 bits and the player index stored in the bottom three bits.
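The bit layout just described can be unpacked with a mask and a shift. The following is a minimal sketch that runs without the Kinect SDK; the class name DepthPixel, its two helper methods and the sample value are assumptions made for illustration, not part of the SDK.

```csharp
using System;

static class DepthPixel
{
    const int PlayerIndexBits = 3;                          // bottom three bits
    const int PlayerIndexMask = (1 << PlayerIndexBits) - 1; // binary 111 = 7

    // Keep only the bottom three bits: 0 means no user.
    public static int PlayerIndex(short raw)
    {
        return raw & PlayerIndexMask;
    }

    // Discard the player index to leave the 13-bit depth value.
    // The cast to ushort stops the shift from sign-extending
    // when the top bit of the 16-bit value is set.
    public static int Depth(short raw)
    {
        return (ushort)raw >> PlayerIndexBits;
    }

    static void Main()
    {
        // Made-up sample: a depth reading of 1200 labelled as player 2.
        short raw = (short)((1200 << 3) | 2);
        Console.WriteLine(Depth(raw) + " " + PlayerIndex(raw)); // prints 1200 2
    }
}
```

Applied to every element of pixelData, the first helper gives the user index map and the second gives the same depth values you would get with the player index turned off.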