Workshop
Next Generation of AI Safety
Ian Kivlichan · Shibani Santurkar · Alex Beutel · Aleksander Madry · Preethi Lahoti · Ahmad Beirami · Adina Williams · Beyza Ermis · Tatsunori Hashimoto
Hall A1
Fri 26 Jul, midnight PDT
In recent years, general-purpose AI has experienced a meteoric rise in capabilities and applications. This rise has continued to bring forth new safety challenges, requiring mitigation to ensure AI systems meet trustworthiness standards. In this workshop, we take a proactive approach to safety, focusing on five emerging trends in AI and exploring the challenges associated with deploying these technologies safely:

1. Agentic AI: As AI agents become more autonomous, concerns about unintended consequences, ethical issues, and adversary exploitation emerge. How do we ensure these agents respect privacy and adhere to safety protocols?
2. Multimodal: With the evolution of AI systems to process and generate diverse modalities like audio, video, and images, concerns around content appropriateness, privacy, bias, and misinformation arise. How do we craft robust guidelines and security measures to tackle these challenges?
3. Personalized Interactions: As conversational agents evolve for social and personal interaction, risks like data privacy breaches and echo chambers grow. How do we balance tailored experiences with user safety?
4. Sensitive Applications: With AI’s integration into high-risk domains like legal, medical, and mental health, the stakes rise with risks such as overreliance on automation and potential catastrophic errors. How do we ensure that AI systems in these critical areas enhance decision-making without compromising human expertise and judgment?
5. Dangerous Capabilities: As AI’s knowledge and understanding capabilities improve, these systems could be leveraged to extract or generate information about harmful applications or technologies, including bioweapons or cyber attack methods. How do we ensure that AI systems are designed with safeguards to prevent their misuse in creating or disseminating dangerous knowledge, while still allowing for beneficial research and innovation?

We believe this next frontier of capabilities and applications raises new research questions: What does the next frontier in AI safety look like? How do we evaluate it? And how can we develop strong safeguards for tomorrow’s AI systems?

Combatting the novel challenges of next generation AI systems necessitates new safety techniques, spanning areas such as synthetic data generation and utilization, content moderation, and model training methodologies. The proliferation of open-source and personalized models tailored for various applications widens the scope of deployments and amplifies the already-urgent need for robust safety tools. Moreover, this diverse range of potential deployments entails complex trade-offs between safety objectives and operational efficiency. Taken together, there is a broad set of urgent and unique research challenges and opportunities to ensure the safety of the AI systems of tomorrow.

Goal: In this workshop, we will bring together researchers across academia and industry working on improving safety and alignment of state-of-the-art AI systems as they are deployed. We aim for the event to facilitate sharing of challenges, best practices, new research ideas, data, and evaluations that both practically aid development and spur progress in this area.