

Spotlight in Workshop: The ICML Expressive Vocalizations (ExVo) Workshop and Competition 2022

Comparing supervised and self-supervised embedding for ExVo Multi-Task learning track

Tilak Purohit · Imen Ben Mahmoud · Bogdan Vlasenko · Mathew Magimai.-Doss


Abstract:

The ICML Expressive Vocalizations (ExVo) Multi-Task challenge 2022 focuses on understanding the emotional facets of non-linguistic vocalizations, known as vocal bursts (VBs). The objective of the challenge is to predict emotional intensities for VBs; as a multi-task challenge, it also requires predicting the speaker's age and native country. For this challenge, we study and compare two distinct embedding spaces, namely self-supervised learning (SSL) based embeddings and task-specific supervised learning based embeddings. To that end, we investigate feature representations obtained from several pre-trained SSL neural networks and task-specific supervised classification neural networks. Our studies show that the best performance is obtained with a hybrid approach, in which predictions derived via both SSL and task-specific supervised learning are used. Our best system surpasses the ComParE baseline results on the test set by a relative margin of 13%.
