Workshop
Trustworthy Multi-modal Foundation Models and AI Agents (TiFA)
Zhenfei (Jeremy) Yin · Yawen Duan · Lijun Li · Jianfeng Chi · Yichi Zhang · Pavel Izmailov · Bo Li · Andy Zou · Yaodong Yang · Hang Su · Jing Shao · Yu Qiao · Jun Zhu · Xuanjing Huang · Wanli Ouyang · Dacheng Tao · Phil Torr
Straus 1
Sat 27 Jul, 12:30 a.m. PDT
Advanced Multi-modal Foundation Models (MFMs) and AI Agents, equipped with diverse modalities [1, 2, 3, 4, 15] and an increasing number of available affordances [5, 6] (e.g., tool use, code interpreters, API access), have the potential to accelerate and amplify their predecessors' impact on society [7].
MFMs include multi-modal large language models (MLLMs) and multi-modal generative models (MMGMs). MLLMs are LLM-based models that can receive, reason over, and output information in multiple modalities, including but not limited to text, images, audio, and video; examples include LLaVA [1], Reka [8], Qwen-VL [9], and LAMM [36]. MMGMs are a class of MFMs that can generate new content across multiple modalities, such as generating images from text descriptions or creating videos from audio and text inputs; examples include Stable Diffusion [2], Sora [10], and Latte [11]. AI agents, or systems with a higher degree of agenticness, are systems that can achieve complex goals in complex environments with limited direct supervision [12]. Understanding and preempting the vulnerabilities of these systems [13, 35] and the harms they may induce [14] has become unprecedentedly crucial.
Building trustworthy MFMs and AI Agents goes beyond the adversarial robustness of such models; it also requires proactive risk assessment, mitigation, safeguards, and comprehensive safety mechanisms throughout the lifecycle of the systems' development and deployment [16, 17]. This approach demands a blend of technical and socio-technical strategies, incorporating AI governance and regulatory insights.
Topics include but are not limited to:
- Adversarial attack and defense, poisoning, hijacking and security [18, 13, 19, 20, 21]
- Robustness to spurious correlations and uncertainty estimation
- Technical approaches to privacy, fairness, accountability and regulation [12, 22, 28]
- Truthfulness, factuality, honesty and sycophancy [23, 24]
- Transparency, interpretability and monitoring [25, 26]
- Identifiers of AI-generated material, such as watermarking [27]
- Technical alignment / control, such as scalable oversight [29], representation control [26] and machine unlearning [30]
- Model auditing, red-teaming and safety evaluation benchmarks [31, 32, 33, 16]
- Measures against malicious model fine-tuning [34]
- Novel safety challenges with the introduction of new modalities
Schedule
Sat 12:30 a.m. - 12:40 a.m. | Opening Remarks | Jing Shao
Sat 12:40 a.m. - 1:10 a.m. | A Data-Centric View on Reliable Generalization (Invited Talk) | Ludwig Schmidt
Sat 1:10 a.m. - 1:40 a.m. | Robust Alignment and Control with Representation Engineering (Invited Talk) | Matt Fredrikson
Sat 1:40 a.m. - 1:50 a.m. | Coffee Break
Sat 1:50 a.m. - 2:40 a.m. | Security and Safety of AI Agents (Panel Discussion) | Daniel Paleka · Matt Fredrikson · Alan Chan · Ivan Evtimov · Kai Greshake · Tomasz Korbak
Sat 2:40 a.m. - 3:00 a.m. | The Safety in Large Language Models (Contributed Talk) | Yisen Wang
Sat 3:00 a.m. - 3:10 a.m. | Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive? (Oral Session)
Sat 3:10 a.m. - 3:20 a.m. | Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques (Oral Session)
Sat 3:20 a.m. - 4:30 a.m. | Lunch Break
Sat 4:30 a.m. - 5:00 a.m. | Agent Governance (Invited Talk) | Alan Chan
Sat 5:00 a.m. - 5:30 a.m. | UK AI Safety Institute: Overview & Agents Evals (Invited Talk) | Herbie Bradley
Sat 5:30 a.m. - 5:50 a.m. | TiFA Challenge Takeaways (Contributed Talk) | Lijun Li · Bowen Dong
Sat 5:50 a.m. - 6:10 a.m. | Break
Sat 6:10 a.m. - 6:50 a.m. | Paper Lightning Talks | Dehan Kong · Zhuo Zhi · Shiyang Lai · John Heibel · Orr Paradise · Marvin Li · Jeffrey Wang
Sat 6:50 a.m. - 8:00 a.m. | Poster Session
- Games for AI-Control: Models of Safety Evaluations of AI Deployment Protocols (Poster) | Charlie Griffin · Buck Shlegeris · Alessandro Abate
- Borrowing Treasures from Neighbors: In-Context Learning for Multimodal Learning with Missing Modalities and Data Scarcity (Poster) | Zhuo Zhi · Ziquan Liu · Moe Elbadawi · Adam Daneshmend · Mine Orlu · Abdul Basit · Andreas Demosthenous · Miguel Rodrigues
- Decomposed evaluations of geographic disparities in text-to-image models (Poster) | Abhishek Sureddy · Dishant Padalia · Nandhinee Periyakaruppan · Oindrila Saha · Adina Williams · Adriana Romero Soriano · Megan Richards · Polina Kirichenko · Melissa Hall
- Physical Backdoor Attack can Jeopardize Driving with Vision-Large-Language Models (Poster) | Zhenyang Ni · Rui Ye · Yuxi Wei · Zhen Xiang · Yanfeng Wang · Siheng Chen
- MaPPing Your Model: Assessing the Impact of Adversarial Attacks on LLM-based Programming Assistants (Poster) | John Heibel · Daniel Lowd
- Towards Adversarially Robust Vision-Language Models: Insights from Design Choices and Prompt Formatting Techniques (Poster) | Rishika Bhagwatkar · Shravan Nayak · Reza Bayat · Alexis Roger · Daniel Kaplan · Pouya Bashivan · Irina Rish
- TrustAgent: Towards Safe and Trustworthy LLM-based Agents through Agent Constitution (Poster) | Wenyue Hua · Xianjun Yang · Mingyu Jin · Zelong Li · Wei Cheng · Ruixiang Tang · Yongfeng Zhang
- Bias Begets Bias: the Impact of Biased Embeddings on Diffusion Models (Poster) | Sahil Kuchlous · Marvin Li · Jeffrey Wang
- Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs (Poster) | Jinmin Li · Kuofeng Gao · Yang Bai · Jingyun Zhang · Shutao Xia
- Unfamiliar Finetuning Examples Control How Language Models Hallucinate (Poster) | Katie Kang · Eric Wallace · Claire Tomlin · Aviral Kumar · Sergey Levine
- Can Editing LLMs Inject Harm? (Poster) | Canyu Chen · Baixiang Huang · Zekun Li · Zhaorun Chen · Shiyang Lai · Xiongxiao Xu · Jia-Chen Gu · Jindong Gu · Huaxiu Yao · Chaowei Xiao · Xifeng Yan · William Wang · Phil Torr · Dawn Song · Kai Shu
- VACoDe: Visual Augmented Contrastive Decoding (Poster) | Sihyeon Kim · Boryeong Cho · Sangmin Bae · Sumyeong Ahn · Se-Young Yun
- Chained Tuning Leads to Biased Forgetting (Poster) | Megan Ung · Alicia Sun · Samuel Bell · Levent Sagun · Adina Williams
- Models That Prove Their Own Correctness (Poster) | Noga Amit · Shafi Goldwasser · Orr Paradise · Guy Rothblum
- On the Multi-modal Vulnerability of Diffusion Models (Poster) | Dingcheng Yang · Yang Bai · Xiaojun Jia · Yang Liu · Xiaochun Cao · Wenjian Yu
- Why Has Predicting Downstream Capabilities of Frontier AI Models with Scale Remained Elusive? (Poster) | Rylan Schaeffer · Hailey Schoelkopf · Brando Miranda · Gabriel Mukobi · Varun Madan · Adam Ibrahim · Herbie Bradley · Stella Biderman · Sanmi Koyejo
- On the Difficulty of Faithful Chain-of-Thought Reasoning in Large Language Models (Poster) | Sree Harsha Tanneru · Dan Ley · Chirag Agarwal · Himabindu Lakkaraju
- Wasserstein Modality Alignment Makes Your Multimodal Transformer More Robust (Poster) | Zhuo Zhi · Ziquan Liu · Qiangqiang Wu · Miguel Rodrigues
- WebCanvas: Benchmarking Web Agents in Online Environments (Poster) | Yichen Pan · Dehan Kong · Sida Zhou · Cheng Cui · Yifei Leng · Bing Jiang · Hangyu Liu · Yanni Shawn · Shuyan Zhou · Sherry Tongshuang Wu · Zhengyang Wu