Researchers improve robots’ speech recognition by modeling human auditory processing

Nitin Naresh February 15, 2019


We rarely think too much about noises as we’re listening to them, but there’s an enormous amount of complexity involved in isolating audio from places like crowded city squares and busy department stores. In the lower levels of our auditory pathways, we segregate individual sources from backgrounds, localize them in space, and detect their motion patterns — all before we work out their context.
Inspired by this neurophysiology, a team of researchers described, in a preprint paper on arXiv.org (“Enhanced Robot Speech Recognition Using Biomimetic Binaural Sound Source Localization”), a design devised to test the influence of physiognomy (that is, facial features) on the components of sound recognition, such as sound source localization (SSL) and automatic speech recognition (ASR).
As the researchers note, the torso, head, and pinnae (the external part of the ears) absorb and reflect sound waves as they approach the body, modifying their frequency content depending on the source’s location. The waves travel to the cochlea (the spiral cavity of the inner ear) and the organ of Corti within, which produces nerve impulses in response to sound vibrations. Those impulses are delivered through the auditory nervous system to the cochlear nucleus, a sort of relay station that forwards information to two structures: the medial superior olive (MSO) and the lateral superior olive (LSO). (The MSO is thought to help locate the angle to the left or right of the listener using differences in the sound’s arrival time at the two ears, while the LSO uses differences in intensity to localize the sound source.) Finally, the outputs of both structures are integrated in the brain’s inferior colliculus (IC).
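The two binaural cues attributed to the MSO and LSO can be illustrated with a small numerical sketch. This is not the researchers’ model: it assumes a plain cross-correlation for the arrival-time difference and an energy ratio for the intensity difference, and the function names, the 16 kHz sample rate, and the synthetic test burst are all hypothetical choices made for illustration.

```python
import math


def estimate_itd(left, right, sample_rate, max_lag):
    """Return the interaural time difference in seconds.

    Positive values mean the sound reached the left ear first,
    i.e. the right channel is a delayed copy of the left one.
    """
    best_lag, best_score = 0, float("-inf")
    n = len(left)
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation of the two channels at this lag.
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += left[i] * right[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag / sample_rate


def estimate_ild(left, right):
    """Return the interaural level difference in dB (positive: left is louder).

    Assumes both channels contain some energy.
    """
    energy_left = sum(x * x for x in left)
    energy_right = sum(x * x for x in right)
    return 10.0 * math.log10(energy_left / energy_right)


# Hypothetical test signal: a short 500 Hz burst that reaches the right
# ear 3 samples later and at half the amplitude.
SAMPLE_RATE = 16000
DELAY = 3
left = [
    math.sin(2 * math.pi * 500 * i / SAMPLE_RATE) * math.exp(-((i - 100) / 30.0) ** 2)
    for i in range(256)
]
right = [0.0] * DELAY + [0.5 * x for x in left[:-DELAY]]

itd = estimate_itd(left, right, SAMPLE_RATE, max_lag=16)
ild = estimate_ild(left, right)
```

Here the time cue recovers the 3-sample delay and the level cue reports that the left channel is roughly 6 dB louder, both pointing to a source on the left under the sign conventions assumed above.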
In an effort to replicate the structure algorithmically, the researchers designed a machine learning framework that processed sound recorded by microphones embedded in two humanoid robotic heads, the iCub and Soundman. It comprised four parts: an SSL component that decomposed audio into sets of frequencies and used the frequency waves to generate spikes mimicking the organ of Corti’s neural impulses; an MSO model sensitive to sounds produced at certain angles; an LSO model sensitive to other angles; and an IC-inspired layer that combined signals from the MSO and LSO. An additional neural network minimized reverberation and ego noise (noise generated by the robot’s joints and motors).
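The spike-generation step can be pictured with a minimal sketch. It assumes, purely for illustration, that a spike is emitted at each positive peak of a single frequency band’s waveform; the article does not specify the actual rule, and the function name, tone frequency, and sample rate below are hypothetical. The band signal is taken as already separated by some filterbank, which is not shown.

```python
import math


def phase_locked_spikes(band_signal):
    """Return the sample indices of positive local peaks in one frequency band."""
    spikes = []
    for i in range(1, len(band_signal) - 1):
        prev, cur, nxt = band_signal[i - 1], band_signal[i], band_signal[i + 1]
        # A spike fires where the wave is positive and stops rising.
        if cur > 0.0 and prev < cur >= nxt:
            spikes.append(i)
    return spikes


# Hypothetical band: 10 ms of a 500 Hz tone sampled at 16 kHz.
SAMPLE_RATE = 16000
TONE_HZ = 500
band = [math.sin(2 * math.pi * TONE_HZ * i / SAMPLE_RATE) for i in range(160)]

spike_indices = phase_locked_spikes(band)
intervals = [b - a for a, b in zip(spike_indices, spike_indices[1:])]
```

Because every spike lands at the same phase of the wave, the spacing between spikes equals one period of the tone (32 samples, or 2 ms, in this example), so the timing information needed to compare the two ears is preserved.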
To test the system’s performance, the researchers used Soundman to establish SSL and ASR baselines and the iCub head (outfitted with motors that allowed it to rotate) to determine the effect of resonance from the skull and components within. A group of 13 evenly distributed loudspeakers in a half-cylinder configuration blasted noise toward the heads, which detected and processed it.
The team found that data from SSL could “improve considerably” the accuracy of speech recognition, in some cases by a factor of two at the sentence level, by indicating how to position the robotic heads and by selecting the appropriate channel as input to an ASR system. Performance was even better when the pinnae were removed from the heads.
“This approach is in contrast to related approaches where signals from both channels are averaged before being used for ASR,” the paper’s authors wrote. “The results of the dynamic SSL experiment show that the architecture is capable of handling different kinds of reverberation. These results are an important extension from our previous work in static SSL and support the robustness of the system to the sound dynamics in real-world environments. Furthermore, our system can be easily integrated with recent methods to enhance ASR in reverberant environments [55]–[57] without adding computational cost.”
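The difference between channel selection and channel averaging can be sketched in a few lines. This is an illustration only, not the paper’s implementation: the function names are hypothetical, and the convention that a negative azimuth means a source to the robot’s left is an assumption.

```python
def select_channel(left, right, azimuth_deg):
    """Pick the channel of the ear facing the source.

    Assumed convention: negative azimuth means the source is to the left.
    """
    return left if azimuth_deg < 0 else right


def average_channels(left, right):
    """Baseline: mix both channels equally before speech recognition."""
    return [(l + r) / 2.0 for l, r in zip(left, right)]


# Hypothetical samples: the source is on the left, so the left channel is stronger.
left = [0.75, -0.5, 0.5]
right = [0.25, -0.25, 0.0]

chosen = select_channel(left, right, azimuth_deg=-45.0)
averaged = average_channels(left, right)
```

In this toy case the selected channel keeps the full amplitude of the nearer ear, while the averaged signal is diluted by the weaker far-ear recording. Whichever signal is produced would then be passed to the ASR system.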
Source: VentureBeat
