Mozilla Wants Your Voice

Written by Lucy Black

Thursday, 20 July 2017

Mozilla has launched Project Common Voice to crowdsource the voice data needed for speech recognition. Once a massive amount of audio data has been captured, it will be made available for others to use in their own applications.

The rationale behind Project Common Voice is that a great deal of data is required for any type of machine learning. Training a speech-to-text system requires around 10,000 hours of audio, and this is the project's target. Crowdsourcing the recordings will enable Mozilla to make voice recognition technology accessible to developers.
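To get a feel for the scale of that target, here is a rough back-of-the-envelope sketch. The 10,000-hour figure comes from Mozilla, and the three recordings per submission reflects how the app collects sentences. The average clip lengths are purely illustrative assumptions, not figures published by the project, and the function names are hypothetical.

```python
import math

TARGET_HOURS = 10_000      # target stated by Mozilla
CLIPS_PER_SESSION = 3      # the app asks for three sentences per submission


def clips_needed(target_hours: float, avg_clip_seconds: float) -> int:
    """Clips required to reach target_hours, given an assumed average clip length."""
    if avg_clip_seconds <= 0:
        raise ValueError("avg_clip_seconds must be positive")
    return math.ceil(target_hours * 3600 / avg_clip_seconds)


def sessions_needed(clips: int, clips_per_session: int = CLIPS_PER_SESSION) -> int:
    """Submissions required if each one contributes clips_per_session clips."""
    return math.ceil(clips / clips_per_session)


if __name__ == "__main__":
    # Assumed average clip lengths in seconds (illustrative only).
    for seconds in (3, 5, 8):
        clips = clips_needed(TARGET_HOURS, seconds)
        sessions = sessions_needed(clips)
        print(f"{seconds}s clips: {clips:,} clips, {sessions:,} submissions")
```

Under the assumption of five-second clips, the target works out to 7.2 million recordings, which gives some idea of why Mozilla is turning to the crowd.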

Explaining why the project is important, Mozilla states:

Voice is natural, voice is human. It’s the easiest and most natural way to communicate. With Common Voice, developers can build amazing things––from real-time translators to voice-enabled administrative assistants. But the data they need to build these apps isn’t publicly available. Common Voice will give them what they need to innovate.

Much of the recent revolution in AI has been due to the Internet providing huge databases of labeled data that allow neural networks to be trained. Without a database of speech snippets, complete with an accurate text transcription, training a neural network to do speech-to-text would be impossible. Constructing such a database has, till now, needed the resources of big companies like Google, Amazon, Microsoft and Apple. Mozilla's approach, by contrast, is to rely on all of us.

The project relies on donations - but this time it's your voice and your listening skills it requires. To take part you'll need a system with a microphone and speakers, and you'll have to allow Mozilla access to them. There's an iOS app. In our desktop tests the web app worked seamlessly with Firefox, refused to work with Edge and sometimes worked with Chrome; on Android it worked frustratingly slowly.

Colour changes in the interface show when Common Voice is responsive.

The app asks you to speak three sentences and then gives you a chance to review them before submitting your recordings.

The other way to contribute to the project is to validate sentences recorded by others, confirming that what you hear corresponds to the text. Don't expect to hear perfect audio. On the contrary:

We want the audio quality to reflect the audio quality a speech-to-text engine will see in the wild. Thus, we want variety. This teaches the speech-to-text engine to handle various situations—background talking, car noise, fan noise—without errors

It's interesting to hear the variety of accents recorded by others, and this acts as a prompt to complete your own profile, which asks for your accent, gender and age range.

As this project is open source, there is also the opportunity to get involved in its future development and support Mozilla's mission, of which the Common Voice project is the latest component:

Mozilla is dedicated to keeping the web open and accessible for everyone. To do it we need to empower web creators through projects like Common Voice. As voice technologies proliferate beyond niche applications, we believe they must serve all users equally well. We see a need to include more languages, accents and demographics when building and testing voice technologies. Mozilla wants to see a healthy, vibrant internet. That means giving new creators access to voice data so they can build new, extraordinary projects. Common Voice will be a public resource that will help Mozilla teams and developers around the world.

The Common Voice app is fun to engage with, and it's nice to know that far from frittering away your time you are contributing to a worthwhile resource.

More Information

Common Voice

Project Common Voice by Mozilla - iOS app

Mozilla Looks Into Health of Internet

Mozilla Open Source Support Program

Open Badges from Mozilla

Last Updated ( Thursday, 20 July 2017 )