

Amazon

Expo Workshop

Evaluation of GenAI models

Fnu Disha · Jason Whang · Biancen Xie

West Ballroom C
Mon 14 Jul 4:30 p.m. PDT — 6 p.m. PDT

Abstract:

This workshop will explore cutting-edge research in evaluating and ensuring the trustworthiness of Generative AI, including Large Language Models (LLMs) and Diffusion Models. As these models become increasingly integrated into decision-making, robust evaluation is crucial. We'll delve into diverse strategies for building more reliable Generative AI across various applications.

Topics include:

• Holistic Evaluation: Datasets, metrics, and methodologies.
• Trustworthiness:
  o Truthfulness: Addressing misinformation, hallucinations, inconsistencies, and biases.
  o Safety & Security: Preventing harmful and toxic content, and protecting privacy.
  o Ethics: Aligning with social norms, values, regulations, and laws.
• User-Centric Assessment: Evaluating models from a user perspective.
• Multi-Perspective Evaluation: Focusing on reasoning, knowledge, problem-solving, and user alignment.
• Cross-Modal Evaluation: Integrating text, image, audio, and other modalities.

This workshop aims to bring together researchers from machine learning, data mining, and related fields to foster interdisciplinary collaboration. Through invited talks, paper presentations, and panel discussions, it will share insights and spark collaborations between academia and industry. Researchers from fields including Data Mining, Machine Learning, NLP, and Information Retrieval are encouraged to participate.
