Researchers improve AI emotion classification by combining speech and facial expression data

Nitin Naresh January 16, 2019


Systems that can classify a person’s emotion from their voice and facial tics alone are a longstanding goal of some AI researchers. Firms like Affectiva, which recently launched a product that scans drivers’ faces and voices to monitor their mood, are moving the needle in the right direction. But considerable challenges remain, owing to nuances in speech and muscle movements.
Researchers at the University of Science and Technology of China in Hefei claim to have made progress, though. In a paper published on the preprint server arXiv.org this week (“Deep Fusion: An Attention Guided Factorized Bilinear Pooling for Audio-video Emotion Recognition”), they describe an AI system that can recognize a person’s emotional state with state-of-the-art accuracy on a popular benchmark.
“Automatic emotion recognition (AER) is a challenging task due to the abstract concept and multiple expressions of emotion,” they wrote. “Inspired by this cognitive process in human beings, it’s natural to simultaneously utilize audio and visual information in AER … The whole pipeline can be completed in a neural network.”
Part of the team’s AI system consists of audio-processing algorithms that, with speech spectrograms (visual representations of the spectrum of frequencies of sound over time) as input, help the overall AI model to home in on regions most relevant to emotion. A second component runs video frames of faces through two computational layers: a basic face detection algorithm and a trio of “state-of-the-art” face recognition networks “fine-tuned” to make them “emotion-relevant.” It’s a trickier undertaking than it sounds: as the paper’s authors note, not all frames contribute equally to an emotional state, so they implemented an attention mechanism that picks out the important frames.
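The idea of weighting frames by importance can be illustrated with a small sketch. This is not the authors' code: the function names, the `query` vector, and the toy numbers are assumptions made for illustration, and a real system would learn the scoring parameters during training. Each frame gets a score, the scores are turned into weights that sum to one, and the frame features are averaged using those weights.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(frame_features, query):
    """Return the attention-weighted average of per-frame feature vectors.

    frame_features: list of equal-length feature vectors, one per frame.
    query: hypothetical learned vector; a frame's score is its dot
           product with this vector (an assumption, not from the paper).
    """
    scores = [sum(f * q for f, q in zip(frame, query))
              for frame in frame_features]
    weights = softmax(scores)
    dim = len(frame_features[0])
    pooled = [sum(w * frame[i] for w, frame in zip(weights, frame_features))
              for i in range(dim)]
    return pooled, weights

# Toy input: three frames with two features each (made-up values).
frames = [[0.1, 0.0], [2.0, 1.0], [0.2, 0.1]]
query = [1.0, 1.0]
pooled, weights = attention_pool(frames, query)
```

Here the second frame scores highest, so it receives the largest weight and dominates the pooled vector, which is the behavior the attention mechanism is meant to provide.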
After features (measurable characteristics) have been extracted by the face recognition networks, they’re fused with the speech features to “deeply capture” associations between them. The fused representation is then used for the final emotion prediction.
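The paper's title names factorized bilinear pooling as the fusion technique. A minimal sketch of that general technique follows; it is not the authors' implementation, it omits the attention guidance, and the matrices `U` and `V`, the group size `k`, and all numbers are made up for illustration. The audio and video vectors are each projected into a shared space, multiplied elementwise so every output depends on both inputs, and then summed in groups of `k` to give a compact fused vector.

```python
def matvec_t(matrix, vec):
    """Compute matrix^T @ vec for a matrix stored as a list of rows."""
    cols = len(matrix[0])
    return [sum(matrix[r][c] * vec[r] for r in range(len(vec)))
            for c in range(cols)]

def factorized_bilinear_pool(audio, video, U, V, k):
    """Fuse two feature vectors into one of length (columns of U) / k.

    U and V project audio and video into a shared space; the projections
    are multiplied elementwise and then summed in groups of k.
    """
    a_proj = matvec_t(U, audio)
    v_proj = matvec_t(V, video)
    joint = [a * v for a, v in zip(a_proj, v_proj)]
    return [sum(joint[i:i + k]) for i in range(0, len(joint), k)]

# Toy inputs; in a trained model U and V would be learned.
audio = [1.0, 2.0]            # 2-dimensional audio feature
video = [0.5, -1.0, 2.0]      # 3-dimensional video feature
U = [[1.0, 0.0, 1.0, 2.0],
     [0.0, 1.0, 1.0, 0.0]]    # 2 x 4 projection for audio
V = [[1.0, 1.0, 0.0, 0.0],
     [0.0, 1.0, 1.0, 0.0],
     [1.0, 0.0, 0.0, 1.0]]    # 3 x 4 projection for video
fused = factorized_bilinear_pool(audio, video, U, V, k=2)
print(fused)  # [1.5, 1.0]
```

Implementations of this technique typically also normalize the fused vector before classification; that step is left out here to keep the sketch short.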
To “teach” the AI model to classify emotions, the team fed it 653 video and corresponding audio clips from AFEW8.0, a database of film and television shows used in the audio-video subchallenge of EmotiW2018, a grand challenge in the ACM International Conference on Multimodal Interaction. In tests, it held its own, managing to categorize emotions from seven choices (“angry,” “disgust,” “fear,” “happy,” “neutral,” “sad,” and “surprise”) correctly 62.48 percent of the time on a validation set of 383 samples. Moreover, the researchers demonstrated that its video frame analyses were influenced by audio signals; in other words, the AI system took the relationship between speech and facial expressions into account in making its predictions.
That said, the model tended to fare better with emotions that had “obvious” characteristics like “angry,” “happy,” and “neutral,” while struggling with “disgust,” “surprise,” and other emotions with “weak” expressions or that could be easily confused with other emotions. Still, it performed nearly as well as a previous approach that employed five visual models and two audio models.
“Compared with the state-of-the-art approach,” the researchers wrote, “[our] proposed approach can achieve a comparable result with a single model, and make a new milestone with multi-models.”
Source: VentureBeat
