Trustworthy Multi-modal Foundation Models and AI Agents (TiFA)

Workshop

Trustworthy Multi-modal Foundation Models and AI Agents (TiFA)

Zhenfei (Jeremy) Yin · Yawen Duan · Lijun Li · Jianfeng Chi · Yichi Zhang · Pavel Izmailov · Bo Li · Andy Zou · Yaodong Yang · Hang Su · Jing Shao · Yu Qiao · Jun Zhu · Xuanjing Huang · Wanli Ouyang · Dacheng Tao · Phil Torr

[ Abstract ] Workshop Website

[ Project Page ]

Advanced Multi-modal Foundation Models (MFMs) and AI Agents, equipped with diverse modalities [1, 2, 3, 4, 15] and an increasing number of available affordances [5, 6] (e.g., tool use, code interpreter, API access, etc.), have the potential to accelerate and amplify their predecessors’ impact on society [7].

MFM includes multi-modal large language models (MLLMs) and multi-modal generative models (MMGMs). MLLMs refer to LLM-based models with the ability to receive, reason, and output with information of multiple modalities, including but not limited to text, images, audio, and video. Examples include Llava [1], Reka [8], QwenVL [9], LAMM [36],and so on. MMGMs refer to a class of MFM models that can generate new content across multiple modalities, such as generating images from text descriptions or creating videos from audio and text inputs. Examples include Stable Diffusion [2], Sora [10], and Latte [11]. AI agents, or systems with higher degree of agenticness, refer to systems that could achieve complex goals in complex environments with limited direct supervision [12]. Understanding and preempting the vulnerabilities of these systems [13, 35] and their induced harms [14] becomes unprecedentedly crucial.

Building trustworthy MFMs and AI Agents transcends adversarial robustness of such models, but also emphasizes the importance of proactive risk assessment, mitigation, safeguards, and the establishment of comprehensive safety mechanisms throughout the lifecycle of the systems’ development and deployment [16, 17]. This approach demands a blend of technical and socio-technical strategies, incorporating AI governance and regulatory insights to build trustworthy MFMs and AI Agents.

Topics include but are not limited to: - Adversarial attack and defense, poisoning, hijacking and security [18, 13, 19, 20, 21] - Robustness to spurious correlations and uncertainty estimation - Technical approaches to privacy, fairness, accountability and regulation [12, 22, 28] - Truthfulness, factuality, honesty and sycophancy [23, 24] - Transparency, interpretability and monitoring [25, 26] - Identifiers of AI-generated material, such as watermarking [27] - Technical alignment / control , such as scalable overslight [29], representation control [26] and machine unlearning [30] - Model auditing, red-teaming and safety evaluation benchmarks [31, 32, 33, 16] - Measures against malicious model fine-tuning [34] - Novel safety challenges with the introduction of new modalities

Chat is not available.

Timezone: America/Los_Angeles

Schedule

Sat 12:30 a.m. - 12:40 a.m.	Opening Remark ( Opening Remark ) > SlidesLive Video	Jing Shao 🔗
Sat 12:40 a.m. - 1:10 a.m.	A Data-Centric View on Reliable Generalization ( Invited Talk ) > SlidesLive Video	Ludwig Schmidt 🔗
Sat 1:10 a.m. - 1:40 a.m.	Robust Alignment and Control with Representation Engineering ( Invited Talk ) > SlidesLive Video	Matt Fredrikson 🔗
Sat 1:40 a.m. - 1:50 a.m.	Coffee Break	🔗
Sat 1:50 a.m. - 2:40 a.m.	Security and Safety of AI Agents ( Panel Discussion ) > SlidesLive Video	Daniel Paleka · Matt Fredrikson · Alan Chan · Ivan Evtimov · Kai Greshake · Tomasz Korbak 🔗
Sat 2:40 a.m. - 3:00 a.m.	The Safety in Large Language Models ( ContributedTalk ) > SlidesLive Video	Yisen Wang 🔗
Sat 3:00 a.m. - 3:10 a.m.	Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive? ( Oral Session ) > SlidesLive Video	🔗
Sat 3:10 a.m. - 3:20 a.m.	Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques ( Oral Session ) > SlidesLive Video	🔗
Sat 3:20 a.m. - 4:30 a.m.	Lunch Break	🔗
Sat 4:30 a.m. - 5:00 a.m.	Agent Governance ( Invited Talk ) > SlidesLive Video	Alan Chan 🔗
Sat 5:00 a.m. - 5:30 a.m.	UK AI Safety Institute: Overview & Agents Evals ( Invited Talk ) > SlidesLive Video	Herbie Bradley 🔗
Sat 5:30 a.m. - 5:50 a.m.	TiFA Challenge Takeaways ( Contributed Talk ) > SlidesLive Video	Lijun Li · Bowen DONG 🔗
Sat 5:50 a.m. - 6:10 a.m.	Break	🔗
Sat 6:10 a.m. - 6:50 a.m.	Paper Lightning Talks ( Paper Lightning Talk ) > SlidesLive Video	Dehan Kong · Zhuo ZHI · Shiyang Lai · John Heibel · Orr Paradise · Marvin Li · Jeffrey Wang 🔗
Sat 6:50 a.m. - 8:00 a.m.	Poster Session ( Poster Session ) >	🔗
-	Games for AI-Control: Models of Safety Evaluations of AI Deployment Protocols ( Poster ) > link Link	Charlie Griffin · Buck Shlegeris · Alessandro Abate 🔗
-	Borrowing Treasures from Neighbors: In-Context Learning for Multimodal Learning with Missing Modalities and Data Scarcity ( Poster ) > link Link	Zhuo ZHI · Ziquan Liu · Moe Elbadawi · Adam Daneshmend · Mine Orlu · Abdul Basit · Andreas Demosthenous · Miguel Rodrigues 🔗
-	Decomposed evaluations of geographic disparities in text-to-image models ( Poster ) > link Link	Abhishek Sureddy · Dishant Padalia · Nandhinee Periyakaruppan · Oindrila Saha · Adina Williams · Adriana Romero Soriano · Megan Richards · Polina Kirichenko · Melissa Hall 🔗
-	Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models ( Poster ) > link Link	Zhenyang Ni · Rui Ye · Yuxi Wei · Zhen Xiang · Yanfeng Wang · Siheng Chen 🔗
-	MaPPing Your Model: Assessing the Impact of Adversarial Attacks on LLM-based Programming Assistants ( Poster ) > link Link	John Heibel · Daniel Lowd 🔗
-	Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques ( Poster ) > link Link	Rishika Bhagwatkar · Shravan Nayak · Reza Bayat · Alexis Roger · Daniel Kaplan · Pouya Bashivan · Irina Rish 🔗
-	TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution ( Poster ) > link Link	Wenyue Hua · Xianjun Yang · Mingyu Jin · Zelong Li · Wei Cheng · Ruixiang Tang · Yongfeng Zhang 🔗
-	Bias Begets Bias: the Impact of Biased Embeddings on Diffusion Models ( Poster ) > link Link	Sahil Kuchlous · Marvin Li · Jeffrey Wang 🔗
-	Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs ( Poster ) > link Link	Jinmin Li · Kuofeng Gao · Yang Bai · Jingyun Zhang · Shutao Xia 🔗
-	Unfamiliar Finetuning Examples Control How Language Models Hallucinate ( Poster ) > link Link	Katie Kang · Eric Wallace · Claire Tomlin · Aviral Kumar · Sergey Levine 🔗
-	Can Editing LLMs Inject Harm? ( Poster ) > link Link	15 presenters Canyu Chen · Baixiang Huang · Zekun Li · Zhaorun Chen · Shiyang Lai · Xiongxiao Xu · Jia-Chen Gu · Jindong Gu · Huaxiu Yao · Chaowei Xiao · Xifeng Yan · William Wang · Phil Torr · Dawn Song · Kai Shu 🔗
-	VACoDe: Visual Augmented Contrastive Decoding ( Poster ) > link Link	Sihyeon Kim · Boryeong Cho · Sangmin Bae · Sumyeong Ahn · Se-Young Yun 🔗
-	Chained Tuning Leads to Biased Forgetting ( Poster ) > link Link	Megan Ung · Alicia Sun · Samuel Bell · Levent Sagun · Adina Williams 🔗
-	Models That Prove Their Own Correctness ( Poster ) > link Link	Noga Amit · Shafi Goldwasser · Orr Paradise · Guy Rothblum 🔗
-	On the Multi-modal Vulnerability of Diffusion Models ( Poster ) > link Link	Dingcheng Yang · Yang Bai · Xiaojun Jia · Yang Liu · Xiaochun Cao · Wenjian Yu 🔗
-	Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive? ( Poster ) > link Link	Rylan Schaeffer · Hailey Schoelkopf · Brando Miranda · Gabriel Mukobi · Varun Madan · Adam Ibrahim · Herbie Bradley · Stella Biderman · Sanmi Koyejo 🔗
-	On the Difficulty of Faithful Chain-of-Thought Reasoning in Large Language Models ( Poster ) > link Link	Sree Harsha Tanneru · Dan Ley · Chirag Agarwal · Himabindu Lakkaraju 🔗
-	Wasserstein Modality Alignment Makes Your Multimodal Transformer More Robust ( Poster ) > link Link	Zhuo ZHI · Ziquan Liu · Qiangqiang Wu · Miguel Rodrigues 🔗
-	WebCanvas: Benchmarking Web Agents in Online Environments ( Poster ) > link Link	11 presenters Yichen Pan · Dehan Kong · Sida Zhou · Cheng Cui · Yifei Leng · Bing Jiang · Hangyu Liu · Yanni Shawn · Shuyan Zhou · Sherry Tongshuang Wu · Zhengyang Wu 🔗