Microsoft Translator API
Written by Sue Gee   
Saturday, 02 April 2016

Microsoft has released a new version of its Translator API. This provides developers with the same speech-to-speech facilities as those used in the Skype Translator and in the iOS and Android Microsoft Translator apps.


In the blog post announcing the availability of the new Microsoft Translator API, Microsoft describes it as:

the first end-to-end speech translation solution optimized for real-life conversations (vs. simple human to machine commands) available on the market. 

It also explains how the service works, using AI technologies such as deep neural networks for speech recognition and text translation, and outlines the following four stages of speech translation (a sketch of the pipeline follows the list):

  1. Automatic Speech Recognition (ASR) — A deep neural network trained on thousands of hours of audio analyzes incoming speech. This model is trained on human-to-human interactions rather than human-to-machine commands, producing speech recognition that is optimized for normal conversations.

  2. TrueText — A Microsoft Research innovation, TrueText takes the literal transcript and transforms it to more closely reflect user intent. It does this by removing speech disfluencies, such as “um”s and “ah”s, as well as stutters and repetitions. The text is also made more readable and translatable by adding sentence breaks, proper punctuation and capitalization (see the picture below).

  3. Translation — The text is translated into any of the 50+ languages supported by Microsoft Translator. The eight speech languages have been further optimized for conversation by training language models powered by deep neural networks on millions of words of conversational data.

  4. Text to Speech — If the target language is one of the eighteen languages supported for text-to-speech, the text is converted into speech output using speech synthesis. This stage is omitted in speech-to-text translation scenarios such as video subtitling.
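To make the data flow concrete, here is a minimal, self-contained sketch of the four stages as a local pipeline. It is purely illustrative rather than Microsoft's implementation: the function names are invented, the ASR, translation and synthesis stages are stubs, and only a crude imitation of TrueText-style clean-up actually runs.

```python
# Toy sketch of the four-stage speech translation pipeline described above.
# Each stage stands in for a Microsoft cloud service; nothing here calls the real API.

def automatic_speech_recognition(audio_chunk: bytes) -> str:
    # Stage 1: in reality a deep neural network produces a literal transcript.
    # Here we pretend the audio decoded to this disfluent sentence.
    return "um so i i think we should uh ship the new translator api"

def true_text(literal: str) -> str:
    # Stage 2: strip disfluencies and stutters, then restore capitalization
    # and a closing full stop (a crude imitation of TrueText).
    words = [w for w in literal.split() if w not in {"um", "uh", "ah"}]
    deduped = [w for i, w in enumerate(words) if i == 0 or w != words[i - 1]]
    sentence = " ".join(deduped)
    return sentence[:1].upper() + sentence[1:] + "."

def translate(text: str, target_language: str) -> str:
    # Stage 3: placeholder for the real translation service.
    return f"[{target_language}] {text}"

def text_to_speech(text: str, target_language: str) -> bytes:
    # Stage 4: placeholder for speech synthesis; skipped for subtitling scenarios.
    return text.encode("utf-8")

if __name__ == "__main__":
    audio = b"\x00\x01"                       # pretend audio input
    transcript = automatic_speech_recognition(audio)
    clean = true_text(transcript)             # "So i think we should ship the new translator api."
    print(translate(clean, "fr"))
```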

[Diagram: how Microsoft Translator speech translation works (click to enlarge)]

Microsoft Translator covers two types of API use and integration:

1) Speech-to-speech translation is available for English, French, German, Italian, Portuguese, Spanish, Chinese Mandarin and Arabic.

2) Speech-to-text translation, for scenarios such as webcasts or BI analysis, allows developers to translate any of these eight supported conversation translation languages into any of the supported 50+ text languages.
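For the speech-to-text case, a client typically streams audio to the service and reads back translated results as they arrive. The sketch below, using the websocket-client package, is an assumption-laden outline rather than an official sample: the endpoint URL, query parameters, header name and response fields are placeholders standing in for the real Microsoft Translator Speech API contract, and a real call needs a valid subscription key.

```python
# Hedged sketch of a speech-to-text translation client. The endpoint, parameters
# and response fields are assumptions; consult the official Microsoft Translator
# documentation before relying on any of them.
import json
import websocket  # pip install websocket-client

SUBSCRIPTION_KEY = "YOUR-SUBSCRIPTION-KEY"   # assumption: key-based auth header
ENDPOINT = ("wss://example.translator.invalid/speech/translate"
            "?from=en-US&to=fr-FR")          # placeholder URL and parameters

def translate_speech_to_text(wav_path: str) -> None:
    ws = websocket.create_connection(
        ENDPOINT,
        header=[f"Ocp-Apim-Subscription-Key: {SUBSCRIPTION_KEY}"],
    )
    try:
        # Stream the audio file to the service in small binary chunks.
        with open(wav_path, "rb") as audio:
            while chunk := audio.read(32 * 1024):
                ws.send(chunk, opcode=websocket.ABNF.OPCODE_BINARY)
        # Read JSON result messages until the connection closes.
        while True:
            message = ws.recv()
            if not message:
                break
            result = json.loads(message)
            print(result.get("translation", result))
    finally:
        ws.close()

# translate_speech_to_text("meeting.wav")
```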

A two-hour free trial is available. This provides 7,200 transactions, where a transaction is equivalent to one second of audio input, and matches the free monthly tier. Beyond this, paid subscriptions are available:

 

[Table: Microsoft Translator subscription pricing]
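The free allowance is easy to convert into audio time: at one transaction per second of input, 7,200 transactions comes to exactly two hours. A trivial helper for estimating how far a monthly transaction quota stretches (the figures are the published ones above, the function is just arithmetic):

```python
# Back-of-the-envelope check: 1 transaction = 1 second of audio input.
def transactions_to_hours(transactions: int) -> float:
    return transactions / 3600

print(transactions_to_hours(7_200))   # 2.0 hours -- the free trial / free monthly tier
```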

The prospect of being able to communicate without language barriers is becoming ever more of a reality, and the more we use it the better the facility will become. Ironically, there's an error in the sample Microsoft uses in its artwork above: Gurdeep is the object of the final sentence in the English but becomes the subject in the French. This sort of error will quickly be corrected by machine learning as more data becomes available.


