ICML Expo Talk Panel JokeEval: Are the Jokes Funny? Review of Computational Evaluation Techniques to improve Joke Generation

Sulbha Jain

Abstract:

Humor is a nuanced and essential facet of human communication, often relying on
incongruity, surprise, and cultural context to elicit amusement. This paper presents
JokeEval, a computational framework designed to evaluate the quality of AI-generated
jokes. Through empirical experiments on both synthetic and open-source
datasets, we demonstrate that machine learning techniques, particularly a hybrid
Convolutional Neural Network with recurrent layers, can effectively distinguish
between “Funny” and “Not Funny” jokes, achieving a statistically significant
F1-score of 71.2% on the ColBERT dataset. Our methodology leverages high-dimensional
vector embeddings, crowd-sourced human annotations, and diverse
evaluation pipelines (including supervised classifiers, deep neural networks, and
LLM-as-a-judge protocols) to assess humor at scale. In doing so, we highlight both
the promise and current limitations of AI in understanding and generating humor.
The results pave the way for more engaging, human-aligned content generation
and offer a feedback loop to iteratively improve joke-writing capabilities in virtual
assistants and other AI-driven systems.
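
For readers unfamiliar with the metric reported above, the following is a minimal sketch of how an F1-score for the "Funny" class could be computed from human labels and classifier predictions. The function name `f1_score` and the label lists are hypothetical, made up purely for illustration; they are not the authors' code or data and do not reproduce the reported 71.2%.

```python
def f1_score(y_true, y_pred, positive="Funny"):
    """F1 for one positive class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    if tp == 0:
        return 0.0  # no correct positive predictions, so F1 is zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical annotations and predictions (illustration only).
human = ["Funny", "Funny", "Not Funny", "Funny", "Not Funny", "Not Funny"]
model = ["Funny", "Not Funny", "Not Funny", "Funny", "Funny", "Not Funny"]

score = f1_score(human, model)
print(f"F1 = {score:.3f}")  # F1 = 0.667
```

Here two jokes are correctly flagged as funny, one is a false alarm, and one funny joke is missed, so precision and recall are both 2/3 and so is their harmonic mean.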
