
Keynote Talk
Workshop: Neural Conversational AI Workshop - What’s left to TEACH (Trustworthy, Enhanced, Adaptable, Capable and Human-centric) chatbots?

Invited Talk: New Frontiers in the Evaluation of Conversational Agents by João Sedoc


The rapid advances in large language models have brought about disruptive innovations in the field of conversational agents. However, these advances also present new challenges in evaluating the quality of such systems, as well as the underlying models and methods. As conversational agents increasingly match or even surpass human performance in dimensions like 'coherence,' we must shift our focus to the qualities that are fundamental to human-like conversation (e.g., empathy and emotion). In this talk, I will focus on how we can integrate psychological metrics for evaluating conversational agents along dimensions such as emotion, empathy, and user traits. I will also introduce our Item Response Theory (IRT) framework, an innovative approach for evaluating the quality of agents across various dimensions. Finally, I will discuss future directions of conversational agent evaluation.
