

Tutorial

Alignment Methods for Large Language Models

Mingzhi Wang · Chengdong Ma · Yaodong Yang


Abstract:

Large Language Model (LLM) alignment has become an increasingly critical topic in contemporary AI research, especially as LLMs continue to scale and integrate into real-world applications. Ensuring that LLMs generate outputs aligned with human values, preferences, and ethical considerations is essential for their safe and effective deployment. This tutorial aims to provide a comprehensive introduction to LLM alignment methods, offering a structured and accessible entry point for researchers and practitioners interested in the field. It will present key concepts and challenges, introduce fundamental approaches such as Reinforcement Learning from Human Feedback (RLHF) and Direct Preference Optimization (DPO), and, building on these foundations, review a spectrum of refinements and variants. In addition, it will cover recent advances in game-theoretic approaches to alignment and theoretical frameworks that provide a deeper understanding of alignment methodologies. Beyond theoretical insights, the tutorial will emphasize the practical aspects of LLM alignment, illustrating how these techniques are applied in real-world scenarios and guiding participants in building intuition about alignment strategies. By the end of the tutorial, attendees will gain a solid foundation in LLM alignment, equipping them to critically engage with the field, understand current research trends, and explore future directions.
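To give a concrete flavor of the methods the tutorial covers, the sketch below shows one common way the DPO objective is written in code. It is a minimal illustration only, assuming a PyTorch-style setup in which per-sequence log-probabilities under the trained policy and a frozen reference model have already been computed; the function name and its arguments are hypothetical, not part of the tutorial materials.

    import torch
    import torch.nn.functional as F

    def dpo_loss(policy_chosen_logps, policy_rejected_logps,
                 ref_chosen_logps, ref_rejected_logps, beta=0.1):
        """DPO loss for a batch of preference pairs.

        Each argument is a 1-D tensor of per-sequence log-probabilities
        (summed over tokens) for the chosen or rejected response, under
        either the trainable policy or the frozen reference model.
        """
        # Log-ratios of the policy relative to the reference model.
        chosen_logratios = policy_chosen_logps - ref_chosen_logps
        rejected_logratios = policy_rejected_logps - ref_rejected_logps
        # Maximize the beta-scaled margin between chosen and rejected
        # responses under a logistic (Bradley-Terry) preference model.
        logits = beta * (chosen_logratios - rejected_logratios)
        return -F.logsigmoid(logits).mean()

    # Toy usage with random log-probabilities for 4 preference pairs.
    pc, pr = torch.randn(4), torch.randn(4)
    rc, rr = torch.randn(4), torch.randn(4)
    loss = dpo_loss(pc, pr, rc, rr)

Because the preference data enters only through these log-probability ratios, DPO needs neither a separately trained reward model nor an RL loop, which is its main practical difference from RLHF.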
