[Spoken Breakout PIC] Kinect Chapter 14. Speech Recognition


This chapter is not included in the book.


When OpenNI is compared to Microsoft's SDK for the Kinect, two 'drawbacks' are often mentioned: a lack of control over the Kinect's tilt motor, and no access to the four-channel microphone array. I tackled the first issue in chapter 6, where I implemented a simple Java class and driver for the motor, accelerometer, and LED. This chapter and the next describe two different approaches to adding speech recognition and microphone capture to OpenNI.

This chapter focuses on speech recognition using CloudGarden's TalkingJava SDK, with the speech recorded by the PC's microphone rather than the Kinect's. The picture on the right shows the user holding a microphone in their left hand and using it to control the Breakout game by voice. This game is a speech-enabled version of the application first described in chapter 9.

TalkingJava offers a full implementation of Sun's Java Speech API (JSAPI) for recognition and synthesis, and relies on Microsoft's Speech API (SAPI) speech engines underneath the Java layer. SAPI is a standard feature of Windows, and comes with a range of simple configuration and testing tools. In particular, I can improve speech recognition accuracy by training the engine to deal specifically with my voice and microphone setup. SAPI is also at the heart of Microsoft's SDK for the Kinect.

Chapter 15 shows how to directly record from the Kinect's microphone array with Java's Sound API. That chapter's 'trick' is to install audio support from Microsoft's SDK, which can co-exist with OpenNI. The SDK's audio driver lets Windows 7 treat the array as a multichannel recording device, making it accessible to Java.
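As a rough illustration of what "accessible to Java" means here, the following sketch uses the standard Java Sound API to list the mixers that could supply a four-channel capture line. It is not the code from chapter 15. The class name MicArrayProbe and the format values (16 kHz, 16-bit signed PCM, little-endian) are assumptions chosen for the example; the format the Kinect's driver actually exposes may differ.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.DataLine;
import javax.sound.sampled.Mixer;
import javax.sound.sampled.TargetDataLine;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: probes Java Sound for multichannel recording devices.
class MicArrayProbe {

    // Assumed capture format: 16 kHz, 16-bit, 4 channels, signed, little-endian.
    static AudioFormat fourChannelFormat() {
        return new AudioFormat(16000f, 16, 4, true, false);
    }

    // Returns the names of all mixers that report a capture (target) line
    // supporting the given format. The list is empty if none do.
    static List<String> mixersSupporting(AudioFormat format) {
        DataLine.Info lineInfo = new DataLine.Info(TargetDataLine.class, format);
        List<String> names = new ArrayList<>();
        for (Mixer.Info mixerInfo : AudioSystem.getMixerInfo()) {
            Mixer mixer = AudioSystem.getMixer(mixerInfo);
            if (mixer.isLineSupported(lineInfo)) {
                names.add(mixerInfo.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        AudioFormat format = fourChannelFormat();
        System.out.println("Looking for capture devices supporting: " + format);
        for (String name : mixersSupporting(format)) {
            System.out.println("  " + name);
        }
    }
}
```

If the microphone array is installed as a multichannel recording device, it should show up in this list under whatever name Windows gives it; a TargetDataLine obtained from that mixer could then be opened and read in the usual Java Sound way.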




Dr. Andrew Davison
E-mail: ad@fivedots.coe.psu.ac.th