Mozilla Updates Voice Recognition Project
Written by Kay Ewbank   
Tuesday, 14 July 2020

Mozilla has released an updated dataset for its Common Voice project, along with a major update to its DeepSpeech speech-to-text and text-to-speech engines.

Mozilla's Common Voice project aims to provide a free database of recordings of people speaking sample sentences. The database is open source and free for anyone to use. DeepSpeech is a deep learning-based automatic speech recognition (ASR) engine that aims to make speech recognition technology and trained models openly available to developers. It has a simple API and comes with a collection of pre-trained English models.


The Common Voice project now has an updated dataset with 7,226 total hours of contributed voice data, of which 5,591 hours have been confirmed as valid. The release comprises over 5.5 million clips, with over 5,000 unique speakers. It includes voice recordings in 54 languages, of which 14 are new to the platform and dataset. The update also adds voice data for single words, including the digits zero through nine and the words yes, no, hey and Firefox. The team says this single-word data will help Mozilla benchmark the accuracy of its open source voice recognition engine, DeepSpeech, in multiple languages for a similar task.

DeepSpeech has also been upgraded, and the developers say it now offers faster speech recognition and support for Google’s TensorFlow Lite framework. DeepSpeech consists of two main subsystems: an acoustic model and a decoder. The acoustic model is a deep neural network that receives audio features as inputs, and outputs character probabilities. The decoder uses a beam search algorithm to transform the character probabilities into textual transcripts that are then returned by the system.
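To make the division of labor between the two subsystems concrete, here is a minimal sketch of a beam search decoder working on character probabilities. It is not DeepSpeech's own decoder: it assumes a CTC-style output with a "blank" symbol, which the description above does not spell out, it ignores any language model, and the names ctc_greedy, ctc_beam_search, the toy symbol table and the probability values are all made up for illustration.

```python
from collections import defaultdict

# Hypothetical toy symbol table: index 0 is the assumed CTC "blank" symbol.
SYMBOLS = ["_", "a", "b"]
BLANK = 0


def ctc_greedy(probs, symbols=SYMBOLS, blank=BLANK):
    """Pick the most likely symbol per time step, merge repeats, drop blanks."""
    best = [max(range(len(step)), key=step.__getitem__) for step in probs]
    out = []
    prev = None
    for idx in best:
        if idx != prev and idx != blank:
            out.append(symbols[idx])
        prev = idx
    return "".join(out)


def ctc_beam_search(probs, symbols=SYMBOLS, blank=BLANK, beam_width=3):
    """Simplified CTC prefix beam search without a language model.

    probs is a list of time steps, each a list of per-symbol probabilities
    (the kind of output an acoustic model would produce).
    Returns the best transcript and its total probability.
    """
    # Each prefix maps to (prob. of ending in blank, prob. of ending in non-blank).
    beams = {(): (1.0, 0.0)}
    for step in probs:
        candidates = defaultdict(lambda: [0.0, 0.0])
        for prefix, (p_blank, p_non_blank) in beams.items():
            total = p_blank + p_non_blank
            for idx, p in enumerate(step):
                if idx == blank:
                    # A blank leaves the prefix unchanged.
                    candidates[prefix][0] += total * p
                    continue
                char = symbols[idx]
                extended = prefix + (char,)
                if prefix and prefix[-1] == char:
                    # A repeated character only extends the prefix if a blank
                    # separated the two; otherwise it merges into the prefix.
                    candidates[extended][1] += p_blank * p
                    candidates[prefix][1] += p_non_blank * p
                else:
                    candidates[extended][1] += total * p
        # Keep only the beam_width most probable prefixes.
        ranked = sorted(candidates.items(), key=lambda kv: sum(kv[1]), reverse=True)
        beams = {prefix: tuple(scores) for prefix, scores in ranked[:beam_width]}
    best_prefix, best_scores = max(beams.items(), key=lambda kv: sum(kv[1]))
    return "".join(best_prefix), sum(best_scores)


# Toy acoustic model output: two time steps over (blank, "a", "b").
toy_probs = [
    [0.6, 0.4, 0.0],
    [0.6, 0.4, 0.0],
]
greedy_text = ctc_greedy(toy_probs)
beam_text, beam_prob = ctc_beam_search(toy_probs)
print(repr(greedy_text), repr(beam_text), round(beam_prob, 2))
```

In this toy input the per-step best guess is blank both times, so the greedy pass returns an empty transcript, while the beam search adds up every path that collapses to "a" and prefers that instead. That summing over alternative paths is the reason a decoder of this kind keeps several candidate transcripts alive rather than committing at each step.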


More Information

Common Voice Website

DeepSpeech On GitHub

Related Articles

Introducing DeepSpeech

Mozilla Wants Your Voice

Mozilla DeepSpeech Gets Smaller

Mozilla Labs Quietly Relaunched 

Adversarial Attacks On Voice Input

The State Of Voice As UI

 
