Getting started with Microsoft Kinect SDK
Friday, 17 June 2011
At this point we could start with something complicated, but as with all things, simple is better - at first at least. So rather than digging into the details of the depth field and body skeletonization methods, let's just make use of the video camera. It may not be as exciting, but it does show the general idea of using the SDK.
Every Kinect program starts in the same general way. First you need to retrieve a Kinect from the collection of Kinects connected to the machine - since beta 2 there can be more than one. To do this we use the Runtime class and its static Kinects collection, here taking the first device:
Runtime nui = Runtime.Kinects[0];
The Runtime object that is returned is where everything is initialized and generally set up. The first thing that you have to do is specify which sensors you are going to use, and in what resolutions and modes. For example, to set up the video camera you would use:
nui.Initialize(RuntimeOptions.UseColor);
In this case UseColor means use the video camera rather than the depth or skeletonization outputs.
Next you need to open the sensor so that it starts to generate data:
nui.VideoStream.Open(ImageStreamType.Video, 2, ImageResolution.Resolution640x480, ImageType.Color);
Most of the parameters are obvious - the 2 specifies that the Kinect will buffer two frames before it starts dropping frames because the PC isn't accepting them fast enough.
At this point video data is flowing, but to make use of it we need to define an event handler for the VideoFrameReady event. As its name suggests, this is called each time a video frame is ready to be processed. Assuming that the event handling method is called FrameReady, the event can be hooked up using:
nui.VideoFrameReady += new EventHandler<ImageFrameReadyEventArgs>(FrameReady);
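The hook-up follows the standard .NET event pattern. As a minimal sketch that compiles without the Kinect SDK, the following uses two hypothetical stand-in types, FakeCamera and FakeFrameEventArgs, which are not part of the SDK and exist only to show the shape of the subscription and the handler:

```csharp
using System;

// Hypothetical stand-in for the SDK's event argument type.
class FakeFrameEventArgs : EventArgs
{
    public int FrameNumber;
}

// Hypothetical stand-in for a sensor that raises a "frame ready" event.
class FakeCamera
{
    public event EventHandler<FakeFrameEventArgs> FrameReady;

    // Simulates the sensor delivering one frame to any subscribers.
    public void DeliverFrame(int frameNumber)
    {
        EventHandler<FakeFrameEventArgs> handler = FrameReady;
        if (handler != null)
        {
            handler(this, new FakeFrameEventArgs { FrameNumber = frameNumber });
        }
    }
}

static class EventDemo
{
    public static int LastFrameSeen = -1;

    // The handler has the usual (sender, args) signature.
    static void OnFrameReady(object sender, FakeFrameEventArgs e)
    {
        LastFrameSeen = e.FrameNumber;
    }

    public static void Run()
    {
        FakeCamera camera = new FakeCamera();
        // Explicit delegate creation; "camera.FrameReady += OnFrameReady;"
        // is the equivalent method-group shorthand.
        camera.FrameReady += new EventHandler<FakeFrameEventArgs>(OnFrameReady);
        camera.DeliverFrame(7);
    }
}
```

The explicit `new EventHandler<...>(...)` form and the shorter method-group form subscribe the same handler; which one you use is a matter of style.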
Now we come to the most complicated part of the program, as the event handler has to do something useful with the raw bitmap data that the video camera sends to it. The raw data is packaged in the event argument's ImageFrame.Image property as a byte array.
The ImageFrame object also includes some useful data such as the frame number and a time stamp, but most of the time it is its Image property that we make use of.
The raw data is in the form of a byte array with the image data packed in according to the format it uses. For the video sensor this takes the form of four bytes per pixel in RGBA format but with the Alpha (transparency) channel ignored.
In general working with bitmap data is made difficult by two considerations. The first is the format used for the data corresponding to each pixel and the second is the order that the data is stored in the array.
The pixel format has already been described as 32 bit RGBA. The pixel data is stored in the array in row order, and as each pixel is four bytes there is no need for any padding at the end of a row. What this means is that the "stride", i.e. the number of bytes you need to read before you get to the start of the next row, is simply
stride = 4 * width
and the total number of bytes in the array is
4 * width * height
and you can work out the start of the four bytes that hold the pixel at x,y in the image as:
4 * x + stride * y
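The stride and offset arithmetic for a four-bytes-per-pixel, row-ordered, unpadded buffer can be sketched as a few helper functions. The class and method names here (PixelMath, Stride, TotalBytes, PixelOffset, ReadPixel) are made up for illustration and are not part of the Kinect SDK. The sketch deliberately returns the four raw bytes of a pixel without assuming which byte holds which color channel:

```csharp
using System;

static class PixelMath
{
    // Four bytes per pixel, as described for the video sensor.
    const int BytesPerPixel = 4;

    // Number of bytes from the start of one row to the start of the next.
    public static int Stride(int width)
    {
        return BytesPerPixel * width;
    }

    // Total size of the pixel array in bytes.
    public static int TotalBytes(int width, int height)
    {
        return Stride(width) * height;
    }

    // Index of the first of the four bytes that hold pixel (x, y).
    public static int PixelOffset(int x, int y, int width)
    {
        return BytesPerPixel * x + Stride(width) * y;
    }

    // Copies the four raw bytes of pixel (x, y) out of the buffer.
    public static byte[] ReadPixel(byte[] bits, int x, int y, int width)
    {
        byte[] pixel = new byte[BytesPerPixel];
        Array.Copy(bits, PixelOffset(x, y, width), pixel, 0, BytesPerPixel);
        return pixel;
    }
}
```

For a 640 by 480 frame this gives a stride of 2560 bytes and a buffer of 1,228,800 bytes.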
The raw pixel data is returned from the Image property in yet another new bitmap wrapper - the PlanarImage struct. The good news is that this isn't a complex object but a simple structure with a few useful fields such as Height, Width and BytesPerPixel.
Retrieving the raw bits is generally the first task for the event handler:
void FrameReady(object sender, ImageFrameReadyEventArgs e)
{
    PlanarImage Image = e.ImageFrame.Image;
}
Our next job is to do something with the bit data stored in Image.Bits. What exactly you have to do depends on the task your application is carrying out and the framework you have selected.
Let's look at how to do it using Windows Forms first - as it is slightly more tricky than WPF and there aren't many examples to look at. WPF is relatively easy using the BitmapSource class and is covered later.
In the case of Windows Forms you have the Bitmap class, which is generally used to manipulate and display images in, say, PictureBox controls or for saving to disk.
Getting the bits from the PlanarImage to a Bitmap without making any changes is a bit involved, but it's fairly standard "boilerplate" code that you can reuse - so let's write a function to do the job:
Bitmap PImageToBitmap(PlanarImage PImage)
You also need to add:
using System.Drawing.Imaging;
at the start of the program.
This function takes a PlanarImage and returns a Bitmap with the same pixel data. Notice that this only works for a PlanarImage that uses 32 bit RGBA format data - which is the case for the video sensor. You can modify the function as required as you use other sensors.
First we need to create a Bitmap object capable of holding the pixel data:
Bitmap bmap = new Bitmap(PImage.Width, PImage.Height, PixelFormat.Format32bppRgb);
This creates a Bitmap object the same size as the PlanarImage and sets it to 32 bit RGB i.e. RGBA with the Alpha channel ignored.
The Bitmap object uses another class, the BitmapData class, to create a memory buffer to store the bit data and this makes it slightly more difficult to use than the equivalent WPF classes.
You first need to use the LockBits method to return a BitmapData object, which has a locked-in-place memory buffer containing the pixel data. In general this operation copies the bitmap data from the Bitmap to the buffer so you can work on it. However, if you set the WriteOnly option then a memory buffer of the correct size is allocated and nothing is transferred from the Bitmap object - this is more efficient.
So to create the buffer and lock it:
BitmapData bmapdata = bmap.LockBits(new Rectangle(0, 0, PImage.Width, PImage.Height),
                                    ImageLockMode.WriteOnly, bmap.PixelFormat);
Now we have a memory buffer waiting for the bits in the PlanarImage to be transferred to it. To do this we need to use a memory-to-memory transfer provided by the InterOp services, so add:
using System.Runtime.InteropServices;
to the start of the file. First we need a pointer to the start of the memory buffer and then we use the Copy method to move the data:
IntPtr ptr = bmapdata.Scan0;
Marshal.Copy(PImage.Bits, 0, ptr, PImage.Width * PImage.BytesPerPixel * PImage.Height);
The first instruction stores the address of the start of the image buffer in ptr. The Copy method then proceeds to copy the data in PImage.Bits to the buffer that ptr points at. The second parameter is an offset into the source array, usually zero, and the final parameter gives the total number of bytes to copy.
Now all we have to do is unlock the buffer, which also transfers the data to the Bitmap object, and return it:
bmap.UnlockBits(bmapdata);
return bmap;
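Putting the steps together, here is a minimal sketch of the whole conversion. So that it compiles without the Kinect SDK, it takes the raw byte array plus a width and height instead of a PlanarImage. The names FrameConverter and BytesToBitmap are made up for this example, and it assumes four bytes per pixel in row order with no padding, as described above:

```csharp
using System;
using System.Drawing;
using System.Drawing.Imaging;
using System.Runtime.InteropServices;

static class FrameConverter
{
    // bits is assumed to hold width * height pixels at 4 bytes each,
    // stored row by row with no padding.
    public static Bitmap BytesToBitmap(byte[] bits, int width, int height)
    {
        if (bits.Length != width * height * 4)
        {
            throw new ArgumentException("Expected 4 bytes per pixel.", "bits");
        }

        // 32-bit RGB: four bytes per pixel with the fourth byte ignored.
        Bitmap bmap = new Bitmap(width, height, PixelFormat.Format32bppRgb);

        // WriteOnly: allocate the buffer without copying the Bitmap's pixels into it.
        BitmapData bmapdata = bmap.LockBits(
            new Rectangle(0, 0, width, height),
            ImageLockMode.WriteOnly,
            bmap.PixelFormat);
        try
        {
            // With 4-byte pixels each row is already 4-byte aligned, so the
            // Bitmap's stride is width * 4 and one copy moves the whole image.
            Marshal.Copy(bits, 0, bmapdata.Scan0, bits.Length);
        }
        finally
        {
            // Unlocking transfers the buffer contents to the Bitmap.
            bmap.UnlockBits(bmapdata);
        }
        return bmap;
    }
}
```

Inside the FrameReady handler this could be called with Image.Bits, Image.Width and Image.Height, and the result assigned to the Image property of a PictureBox. The try/finally is an addition for safety, so that the buffer is unlocked even if the copy throws.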
Last Updated ( Monday, 06 February 2012 )