Moving on with the Microsoft Kinect SDK


July 29, 2011

This summer we are hosting 23 interns at the Don’t Panic Labs office. These interns are placed into four separate teams, with each team tasked to develop a product based around a specific need. Andrew Gaspar, a member of the Moriarty team, wrote this internal blog post based on his experiences in developing a product that provides an interactive entryway for businesses.

On June 17, Microsoft released the official non-commercial Software Development Kit (SDK) for its powerful Natural User Interface (NUI) tool, Kinect. The Kinect uses an array of depth sensors, microphones, and cameras to allow the user to interact with an interface using gestures and voice commands, making for a more natural approach to interaction with content. Previously, the only official support for Microsoft Kinect development was on the Xbox 360, and required certification from Microsoft. However, the hacker community pulled through and produced unofficial drivers, as well as a number of development tools, such as OpenNI and NITE.

The Moriarty team has been researching the use of the Kinect to allow visitors of a business to easily and comfortably view information about the company. When we began work, we were using OpenNI and NITE; the official Microsoft SDK existed only as some sort of fabled savior hiding just beyond the horizon. This is not to say that OpenNI and NITE were inadequate, but they did present a few usability issues.

One of the most obvious issues was NITE’s requirement for a calibration pose to create a skeleton of the user. Although the calibration pose was not difficult to perform, it didn’t create the kind of experience that we wanted our project to have. A user should easily be able to walk up to the kiosk and instantly start interacting with the content on screen. A calibration pose ruins the fluidity we are hoping to achieve. Luckily, the Microsoft SDK has more accurate and detailed skeletonization that is triggered as soon as a user walks on screen, completely cutting out the need for a calibration pose.

Another issue was OpenNI's lack of microphone support. Using the official SDK, the microphone array can be used in conjunction with Microsoft's Speech APIs to allow users to interact with content using their voice. Additionally, the SDK exposes data that allows for beamforming, which can be used to find the origin of a sound.
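The core idea behind finding a sound's origin with a microphone array can be illustrated with a simple time-difference-of-arrival calculation between two microphones. This is a minimal sketch of the principle only; the function name and parameters below are our own illustration, not part of the Kinect SDK:

```python
import math

def source_angle(delay_s, mic_spacing_m, speed_of_sound=343.0):
    """Estimate the bearing of a sound source, in degrees, from the
    time difference of arrival (TDOA) between two microphones.

    A sound arriving from angle theta (measured from the line
    perpendicular to the mic pair) reaches one mic later than the
    other by spacing * sin(theta) / c, so we invert that relation.
    """
    ratio = speed_of_sound * delay_s / mic_spacing_m
    # Clamp to [-1, 1] so noisy delay estimates can't crash asin().
    ratio = max(-1.0, min(1.0, ratio))
    return math.degrees(math.asin(ratio))
```

A zero delay means the source is straight ahead (0 degrees), while a delay equal to the mic spacing divided by the speed of sound puts the source fully off to one side (90 degrees). A real beamformer combines more than two microphones and works on correlated audio streams, but the geometry is the same.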

However, the Microsoft SDK does not yet offer every feature we had come to expect. NITE shipped with a few built-in gestures, namely swipes and pushes, which made it easy to build simple controls from those events. To use the same gestures with the Kinect SDK, we'll need to define them ourselves based on the skeletal data.
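As a sketch of what defining a gesture from skeletal data might look like, here is a minimal swipe detector that watches a hand joint's horizontal position over a short time window. The class, thresholds, and method names are our own illustration, not Kinect SDK API:

```python
from collections import deque

class SwipeDetector:
    """Detects horizontal swipes from a stream of (timestamp, x)
    hand-joint positions, e.g. sampled each skeleton frame."""

    def __init__(self, min_distance=0.4, max_duration=0.5):
        self.min_distance = min_distance  # meters of horizontal travel
        self.max_duration = max_duration  # seconds allowed for the motion
        self.history = deque()            # recent (timestamp, x) samples

    def update(self, timestamp, x):
        """Feed one sample; returns 'swipe_left', 'swipe_right', or None."""
        self.history.append((timestamp, x))
        # Discard samples older than the allowed gesture duration.
        while self.history and timestamp - self.history[0][0] > self.max_duration:
            self.history.popleft()
        dx = x - self.history[0][1]
        if abs(dx) >= self.min_distance:
            self.history.clear()  # reset so one motion fires only once
            return "swipe_right" if dx > 0 else "swipe_left"
        return None
```

Feeding it per-frame hand positions (e.g. at 30 fps) would fire an event once the hand travels far enough fast enough; pushes could be detected the same way using the z coordinate. In practice you would also want to smooth the joint positions, since raw skeletal data is noisy.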

With other benefits that include detailed documentation and more accurate depth readings, the switch to the official SDK was a no-brainer. Although we cannot use the development kit for anything beyond research, it allows us to prepare for the release of an inevitable commercial SDK, which will hopefully, and likely, have feature parity with the non-commercial release.

Compare these two videos. One uses OpenNI and NITE, while the one of Matt and me uses the Kinect SDK. The depth tracking offered by the official Kinect SDK is much more accurate, and it keeps track of players better. Additionally, there is higher fidelity in the number of joints and how well it tracks those joints.

Kinect SDK