The W3C Speech API - A Google Initiative

Written by Alex Armstrong

Monday, 05 November 2012

A Web Speech API Specification has recently been published together with a call for Final Specification Commitments from members of the W3C Speech API Community Group.

The specification is for a JavaScript API that will enable web developers to incorporate scripts into their web pages that can generate text-to-speech output and can use speech recognition as an input for forms, continuous dictation and control.
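As a rough illustration of what such a script might look like, here is a minimal sketch, not code taken from the specification. It assumes the interface names `SpeechRecognition`, `speechSynthesis` and `SpeechSynthesisUtterance` from the draft, plus Chrome's prefixed `webkitSpeechRecognition`. The helper names (`SpeechScope`, `findRecognitionCtor`, `startDictation`, `speak`) are hypothetical and exist only for this example.

```typescript
// Minimal local shapes for the parts of the API used here, declared by hand
// so the sketch does not depend on any particular DOM typings (assumption).
interface RecognitionLike {
  continuous: boolean;
  lang: string;
  onresult:
    | ((event: { results: ArrayLike<ArrayLike<{ transcript: string }>> }) => void)
    | null;
  start(): void;
}

type RecognitionCtor = new () => RecognitionLike;

// Hypothetical view of the global object (normally `window` in a browser).
interface SpeechScope {
  SpeechRecognition?: RecognitionCtor;
  webkitSpeechRecognition?: RecognitionCtor; // Chrome's prefixed name
  speechSynthesis?: { speak(utterance: unknown): void };
  SpeechSynthesisUtterance?: new (text: string) => unknown;
}

// Feature detection: prefer the unprefixed constructor, fall back to the prefix.
function findRecognitionCtor(scope: SpeechScope): RecognitionCtor | null {
  return scope.SpeechRecognition ?? scope.webkitSpeechRecognition ?? null;
}

// Speech input: start continuous recognition and pass the top alternative of
// the most recent result to the callback. Returns false when unsupported.
function startDictation(scope: SpeechScope, onText: (text: string) => void): boolean {
  const Ctor = findRecognitionCtor(scope);
  if (Ctor === null) return false;
  const recognition = new Ctor();
  recognition.continuous = true;
  recognition.lang = "en-US";
  recognition.onresult = (event) => {
    const latest = event.results[event.results.length - 1];
    onText(latest[0].transcript);
  };
  recognition.start();
  return true;
}

// Speech output: queue one utterance. Returns false when unsupported.
function speak(scope: SpeechScope, text: string): boolean {
  if (!scope.speechSynthesis || !scope.SpeechSynthesisUtterance) return false;
  scope.speechSynthesis.speak(new scope.SpeechSynthesisUtterance(text));
  return true;
}
```

In a page, the global object would be passed in as the scope, for example `startDictation(window as unknown as SpeechScope, text => console.log(text))`. Taking the scope as a parameter keeps the feature detection explicit, which matters while only one browser ships the API.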

The HTML Speech Incubator Group was originally formed in August 2010 with initiating members from Microsoft, Google, Voxeo, AT&T, Mozilla and OpenStream. Proposals for API specifications were made by Google and by Microsoft.

The group produced an HTML Speech Incubator Group Final Report in December 2011 outlining the use cases developed by the group and requirements ordered by priority of interest to the group members. It also contained a preliminary proposal for a JavaScript API and associated HTML bindings.

This diagram from the report outlines what items would be in and out of scope for the final solution to the task the group had begun:


Within two weeks of this report, Google came up with a proposal for a Speech JavaScript API Specification that supported 15 of the 17 use cases defined in the HTML Speech Incubator Group Final Report:

Voice Web Search
Speech Command Interface
Domain Specific Grammars Contingent on Earlier Inputs
Continuous Recognition of Open Dialog
Domain Specific Grammars Filling Multiple Input Fields
Speech UI present when no visible UI need be present
Voice Activity Detection
Hello World
Speech Translation
Speech Enabled Email Client
Dialog Systems
Multimodal Interaction
Speech Driving Directions
Multimodal Video Game
Multimodal Search

The remaining two were omitted to keep the API to a minimum:

Re-recognition
Temporal Structure of Synthesis to Provide Visual Feedback

A Speech API Community Group was formed in April 2012 to continue work on this specification. It is chaired by Glen Shires from Google, who is one of the editors of the draft Speech API, and has five other Google members plus representatives of W3C, the World Wide Web Foundation, Mozilla, OpenReach and some others. Its Web Speech API Specification has been edited by Glen Shires and Hans Wennborg, also from Google.

At the moment the API specification doesn't have the status of a W3C Standard, nor is it on the W3C Standards Track. So far only the Google members of the Speech API Community Group have committed to the Web Speech Specification. Chrome is the only browser to have the speech API - let's hope the others follow and we have a standard rather than a mess.

More Information

Web Speech API Specification

Speech API Community Group

Speech JavaScript API Specification

HTML Speech Incubator Group Final Report


Last Updated ( Monday, 05 November 2012 )