|Mozilla Wants Your Voice|
|Written by Lucy Black|
|Thursday, 20 July 2017|
Mozilla has launched Project Common Voice to crowdsource speech recognition. Once a massive amount of audio data has been captured, it will be made available for others to use in their own applications.
The rationale behind Project Common Voice is that any type of machine learning requires a great deal of data. Training a speech-to-text system requires around 10,000 hours of audio, and this is the project's target. Crowdsourcing that data will enable Mozilla to make voice recognition technology accessible to developers.
Explaining why the project is important, Mozilla states:
Voice is natural, voice is human. It’s the easiest and most natural way to communicate. With Common Voice, developers can build amazing things––from real-time translators to voice-enabled administrative assistants. But the data they need to build these apps isn’t publicly available. Common Voice will give them what they need to innovate.
Much of the recent revolution in AI has been due to the Internet providing huge databases of labeled data that allow neural networks to be trained. Without a database of speech snippets, complete with an accurate text transcription, training a neural network to do speech-to-text would be impossible. Constructing such a database has, till now, needed the resources of big companies like Google, Amazon, Microsoft and Apple. Mozilla's approach, by contrast, is to rely on all of us.
The project relies on donations - but this time it's your voice and your listening skills it requires. To take part you'll need a system with a microphone and speakers, and you'll have to allow Mozilla access to them. There's an iOS app, and in our tests the web app worked seamlessly with Firefox on the desktop, refused to work with Edge, sometimes worked with Chrome and worked frustratingly slowly on Android.
You can recognise when Common Voice is responsive by its colour changes.
The app asks you to speak three sentences and then gives you a chance to review them before submitting your recordings.
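To get a feel for the scale of the 10,000-hour target, here is a rough back-of-the-envelope sketch. The average clip length of five seconds is purely an assumption made for illustration, not a figure published by Mozilla, and the function names are hypothetical.

```python
import math

# Rough estimate only: the clip length below is an assumed value,
# not a figure published by Mozilla.
TARGET_HOURS = 10_000          # target quoted in the article
SENTENCES_PER_SESSION = 3      # the app asks for three sentences at a time
ASSUMED_SECONDS_PER_CLIP = 5   # hypothetical average length of one recording


def clips_needed(target_hours: float, seconds_per_clip: float) -> int:
    """Return how many clips of the given length add up to the target."""
    total_seconds = target_hours * 3600
    # Round up so the target is reached rather than narrowly missed.
    return math.ceil(total_seconds / seconds_per_clip)


def sessions_needed(clips: int, sentences_per_session: int) -> int:
    """Return how many recording sessions are needed to collect the clips."""
    return math.ceil(clips / sentences_per_session)


clips = clips_needed(TARGET_HOURS, ASSUMED_SECONDS_PER_CLIP)
sessions = sessions_needed(clips, SENTENCES_PER_SESSION)
print(f"{clips:,} clips, or {sessions:,} three-sentence sessions")
```

Under that assumption the target works out to 7,200,000 clips, or 2,400,000 three-sentence sessions, which gives some idea of why Mozilla is asking for volunteers. Longer or shorter clips would change the numbers proportionally.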
The other way to contribute to the project is to validate sentences recorded by others, confirming that what you hear corresponds to the text. Don't expect to hear perfect audio. On the contrary:
We want the audio quality to reflect the audio quality a speech-to-text engine will see in the wild. Thus, we want variety. This teaches the speech-to-text engine to handle various situations—background talking, car noise, fan noise—without errors
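The pairing of a recording with its text, plus listeners' verdicts on whether the two match, can be pictured as a small data record. The following is only a minimal sketch: the `Clip` class, its field names and the simple majority rule with a `min_votes` threshold are all hypothetical, since the article does not describe how Mozilla actually stores clips or aggregates votes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Clip:
    """One labeled sample: a recording and the sentence it should contain."""
    sentence: str                  # the text the speaker was asked to read
    audio_path: str                # where the recording is stored (hypothetical)
    votes: List[bool] = field(default_factory=list)  # True = audio matches text

    def add_vote(self, matches_text: bool) -> None:
        """Record one listener's verdict on this clip."""
        self.votes.append(matches_text)

    def is_validated(self, min_votes: int = 2) -> bool:
        """Assumed rule: enough votes have been cast and most say 'matches'."""
        yes = sum(self.votes)
        no = len(self.votes) - yes
        return len(self.votes) >= min_votes and yes > no


clip = Clip(sentence="Voice is natural, voice is human.", audio_path="clip_0001.wav")
clip.add_vote(True)
clip.add_vote(True)
print(clip.is_validated())
```

Only clips that pass some check of this kind would be useful for training, because a speech-to-text system learns from the match between audio and transcription.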
It's interesting to hear the variety of accents recorded by others, and this acts as a prompt to complete your own profile, which asks for your accent, gender and age range.
As this project is open source, there is also the opportunity to get involved in its future development and to support Mozilla's mission, of which the Common Voice project is the latest component:
Mozilla is dedicated to keeping the web open and accessible for everyone. To do it we need to empower web creators through projects like Common Voice. As voice technologies proliferate beyond niche applications, we believe they must serve all users equally well. We see a need to include more languages, accents and demographics when building and testing voice technologies. Mozilla wants to see a healthy, vibrant internet. That means giving new creators access to voice data so they can build new, extraordinary projects. Common Voice will be a public resource that will help Mozilla teams and developers around the world.
The Common Voice app is fun to engage with, and it's nice to know that far from frittering away your time you are contributing to a worthwhile resource.
|Last Updated ( Thursday, 20 July 2017 )|