On day two of MIX, Microsoft gave some information on the promised official Kinect SDK. As anticipated, it does seem to include the new body tracking software, which can track up to two users at the same time. It also promises a new feature: the ability to listen.
The Kinect for Windows Beta SDK should be available in the spring and it seems it will include a version of the body tracking program developed by Microsoft Research. This is the main feature missing from alternative SDKs made available by the open source community and by PrimeSense, the company behind the Kinect hardware.
The PrimeSense-sponsored SDK does include a body tracking application, but it is based on the older technique of starting from an initial configuration and tracking individual movements. This works well, but if the body location is lost it can take a long time to find it again and resume tracking. It also struggles to track multiple people or to adjust when another person enters the field of view.
The Microsoft body tracker, as used in the Kinect software, is based on an AI learning approach and doesn't need the user to adopt a known starting position. It also finds the body again quickly if its location is lost for any reason. The SDK is claimed to be able to track one or two people at the same time.
In addition, the SDK supports the usual depth map data and also the audio sensors in the hardware. Audio is a feature of the Kinect that alternative SDKs more or less ignore. The Microsoft SDK promises to use the four-element microphone array to perform noise and echo cancellation and to integrate with the existing Windows speech recognition API. Speech recognition might push Kinect applications into new areas, but it is worth noting that much the same thing could be achieved simply by plugging a microphone into a PC running the Kinect with a non-official SDK. However, using the microphone array together with position sensing could significantly improve recognition accuracy. It is the machine version of the cocktail party effect, where you can tune into a particular speaker because you know where to listen rather than what to listen to. The idea is to create speech recognition that works without any special preparation and at a distance of 4 m, with the user free to move around the room.
It might just be that this improved performance is enough to make speech recognition a practical success*.
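To make the "know where to listen" idea concrete, here is a minimal sketch of delay-and-sum beamforming, a standard textbook way of steering a microphone array towards a known direction. It is not taken from the Microsoft SDK, whose actual algorithm has not been described. The microphone spacing, the sample rate, the far-field assumption and all of the function names are illustrative assumptions.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, approximate value for air at room temperature
SAMPLE_RATE = 16000     # Hz, assumed for illustration

# Hypothetical microphone x-positions in metres (NOT the real Kinect layout).
MIC_POSITIONS = [-0.12, -0.04, 0.04, 0.12]


def steering_delays(angle_deg, positions=MIC_POSITIONS, rate=SAMPLE_RATE):
    """Arrival delay at each microphone, in whole samples, for a distant
    source at angle_deg (0 = straight ahead, positive = towards +x).
    Delays are shifted so that the earliest microphone has delay 0."""
    s = math.sin(math.radians(angle_deg))
    raw = [-x * s / SPEED_OF_SOUND * rate for x in positions]
    base = min(raw)
    return [round(d - base) for d in raw]


def simulate(source, delays):
    """Fake array recording: each channel is the source delayed by d samples."""
    return [[0.0] * d + list(source) for d in delays]


def delay_and_sum(channels, delays):
    """Advance each channel by its steering delay and average the result.
    Sound from the steered direction adds in phase; other directions do not."""
    n = min(len(ch) - d for ch, d in zip(channels, delays))
    count = len(channels)
    return [sum(ch[i + d] for ch, d in zip(channels, delays)) / count
            for i in range(n)]


def power(signal):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in signal) / len(signal)


if __name__ == "__main__":
    # A 1 kHz tone from a speaker standing 30 degrees off centre.
    tone = [math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE) for i in range(1600)]
    recording = simulate(tone, steering_delays(30))

    # Listen towards the speaker, then towards the opposite side of the room.
    on_target = delay_and_sum(recording, steering_delays(30))
    off_target = delay_and_sum(recording, steering_delays(-30))
    print("power when steered at the speaker:", power(on_target))
    print("power when steered away:          ", power(off_target))
```

In a real system the steering angle would come from the tracked position of the user rather than being hard-coded, which is where the position sensing comes in. A practical implementation would also need fractional-sample delays and would have to cope with broadband speech rather than a single tone.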
Given the range of things that have been achieved using the unofficial API, the prospect of a robust multi-body tracking system can only be good news. However, it all depends on what the licensing arrangements are. As this is aimed at the experimenter, it is likely that there will be restrictions on use and particularly on redistribution.
Developers can sign up to be notified of the release at http://research.microsoft.com/kinectsdk.
The SDK beta for hobbyists will be out on May 16.
For more information on the speech recognition project:
Getting Started with Kinect
All About Kinect
*Thanks to Jim McCraken for pointing out the improved efficiency of the array microphone with location sensing.