Search All 2024 Events
 

54 Results

Poster
Tue 4:30 Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models
Siddharth Karamcheti · Suraj Nair · Ashwin Balakrishna · Percy Liang · Thomas Kollar · Dorsa Sadigh
Oral
Tue 1:45 Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
Yang Jin · Zhicheng Sun · Kun Xu · Kun Xu · Liwei Chen · Hao Jiang · Quzhe Huang · Chengru Song · Yuliang Liu · Di Zhang · Yang Song · Kun Gai · Yadong Mu
Poster
Wed 4:30 FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning
Yuwei Fu · Haichao Zhang · Di Wu · Wei Xu · Benoit Boulet
Poster
Wed 4:30 Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models
Jinhao Li · Haopeng Li · Sarah Erfani · Lei Feng · James Bailey · Feng Liu
Poster
Tue 2:30 Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
Yang Jin · Zhicheng Sun · Kun Xu · Kun Xu · Liwei Chen · Hao Jiang · Quzhe Huang · Chengru Song · Yuliang Liu · Di Zhang · Yang Song · Kun Gai · Yadong Mu
Poster
Thu 2:30 Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning
Shibo Jie · Yehui Tang · Ning Ding · Zhi-Hong Deng · Kai Han · Yunhe Wang
Poster
Tue 2:30 Machine Vision Therapy: Multimodal Large Language Models Can Enhance Visual Robustness via Denoising In-Context Learning
Zhuo Huang · Chang Liu · Yinpeng Dong · Hang Su · Shibao Zheng · Tongliang Liu
Oral
Tue 8:15 Rejuvenating image-GPT as Strong Visual Representation Learners
Sucheng Ren · Zeyu Wang · Hongru Zhu · Junfei Xiao · Alan Yuille · Cihang Xie
Poster
Tue 2:30 IIANet: An Intra- and Inter-Modality Attention Network for Audio-Visual Speech Separation
Kai Li · Runxuan Yang · Fuchun Sun · Xiaolin Hu
Poster
Wed 4:30 VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context
Yunxin Li · Baotian Hu · Haoyuan Shi · Wei Wang · Longyue Wang · Min Zhang
Poster
Thu 2:30 Unifying Image Processing as Visual Prompting Question Answering
Yihao Liu · Xiangyu Chen · Xianzheng Ma · Xintao Wang · Jiantao Zhou · Yu Qiao · Chao Dong
Poster
Thu 2:30 Don't trust your eyes: on the (un)reliability of feature visualizations
Robert Geirhos · Roland S. Zimmermann · Blair Bilodeau · Wieland Brendel · Been Kim