Researchers Use AI To Decode Dog Language
Written by Lucy Black   
Friday, 21 June 2024

Scientists from the University of Michigan have used AI to decode what dogs mean by different types of bark. Wav2Vec2, a model originally trained on human speech, succeeded at four classification tasks - dog recognition, breed identification, gender classification, and context grounding.

A paper presented at the Joint International Conference on Computational Linguistics, Language Resources and Evaluation explained how AI models trained on human speech were used to classify dog barks.

The team, who collaborated with Mexico’s National Institute of Astrophysics, Optics and Electronics (INAOE) in Puebla, said that the first obstacle they faced in developing AI models that can analyze animal vocalizations was the lack of publicly available data.


Artem Abzaliev, lead author and U-M doctoral student in computer science and engineering, said the problem was much harder than collecting human speech:

“Animal vocalizations are logistically much harder to solicit and record. They must be passively recorded in the wild or, in the case of domestic pets, with the permission of owners.”

Using models trained on human speech overcomes part of this problem, according to Rada Mihalcea, the Janice M. Jenkins Collegiate Professor of Computer Science and Engineering, and director of U-M's AI Laboratory:

“By using speech processing models initially trained on human speech, our research opens a new window into how we can leverage what we built so far in speech processing to start understanding the nuances of dog barks.”

Taking this approach meant the researchers could make use of existing robust models developed for voice-enabled technologies including voice-to-text and language translation. These models are trained to distinguish nuances in human speech, like tone, pitch and accent, and convert this information into a format that can be analyzed by a computer.

The researchers used a dataset of dog vocalizations recorded from 74 dogs of varying breed, age and sex, in a variety of contexts. Humberto Pérez-Espinosa, a collaborator at INAOE, led the team who collected the dataset. Abzaliev then used the recordings to modify a machine-learning model. The team chose a speech representation model called Wav2Vec2, which was originally trained on human speech data.

With this model, the researchers were able to generate representations of the acoustic data collected from the dogs and interpret these representations. They found that Wav2Vec2 succeeded at four classification tasks - dog recognition, breed identification, gender classification, and context grounding. The model also outperformed other models trained specifically on dog bark data, with accuracy figures up to 70%.
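To make the idea of "generating representations and interpreting them" concrete, here is a minimal sketch. It is not the researchers' pipeline. As an illustration it assumes a frozen Wav2Vec2 encoder, mean pooling over time, and a simple nearest-centroid classifier. The checkpoint name `facebook/wav2vec2-base`, the helper names and the context labels are all assumptions chosen for this example. The `embed_clip` function needs the `torch` and `transformers` packages, while the other helpers use only the standard library.

```python
from statistics import fmean


def mean_pool(vectors):
    """Average a list of equal-length vectors element by element."""
    return [fmean(column) for column in zip(*vectors)]


def fit_centroids(examples):
    """Build one centroid per label from (label, vector) pairs."""
    grouped = {}
    for label, vector in examples:
        grouped.setdefault(label, []).append(vector)
    return {label: mean_pool(vectors) for label, vectors in grouped.items()}


def predict(centroids, vector):
    """Return the label whose centroid is closest in squared Euclidean distance."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: squared_distance(centroids[label], vector))


def embed_clip(waveform, sampling_rate=16_000, checkpoint="facebook/wav2vec2-base"):
    """Turn a mono waveform (sequence of floats) into one clip-level vector.

    Assumption: a frozen, human-speech-pretrained checkpoint is used as is.
    Imports are local so the helpers above work without these packages.
    """
    import torch
    from transformers import Wav2Vec2FeatureExtractor, Wav2Vec2Model

    extractor = Wav2Vec2FeatureExtractor.from_pretrained(checkpoint)
    model = Wav2Vec2Model.from_pretrained(checkpoint)
    model.eval()
    inputs = extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape: (1, frames, hidden_size)
    return mean_pool(hidden[0].tolist())


# Toy two-dimensional vectors stand in for real embeddings; the labels are hypothetical.
centroids = fit_centroids([
    ("play", [0.0, 0.0]),
    ("play", [2.0, 0.0]),
    ("stranger", [10.0, 10.0]),
])
print(predict(centroids, [9.0, 8.0]))
```

In practice each training vector would come from `embed_clip` applied to a labelled recording, and the same recipe could be repeated with dog identity, breed or sex as the label. A fine-tuned classification head would likely do better than nearest centroids, which are used here only to keep the sketch short.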

The team says they hope their work will encourage others in the NLP community to start addressing the many research opportunities that exist in the area of animal communication. So, cats or hamsters? The race is on. 


More Information

Research Paper On Using AI to Decode Dog Vocalizations

Related Articles

Learn To Chat with Your Data For Free

Whisper Open Source Speech Recognition You Can Use

The Art and Science of Conversational AI

IBM Announces AI Libraries For Natural Language Processing

Mozilla Updates Voice Recognition Project

Microsoft researchers achieve speech recognition milestone


Last Updated ( Friday, 21 June 2024 )