Workshop
Long-Context Foundation Models
Tianyu Gao · Weijia Shi · Amanda Bertsch · Tri Dao · Danqi Chen · Graham Neubig · Christopher Ré
Hall A2
Fri 26 Jul, midnight PDT
Foundation models have become a cornerstone in the advancement of artificial intelligence and are widely used across both academic and practical applications. Across domains, many challenging tasks require synthesizing information over thousands to millions of individual pieces of data, which may take many forms, including images, text, audio, and genomes. As a result, much recent work has focused on developing long-context models capable of processing, understanding, and generating responses based on extensive inputs. Enabling foundation models to process long contexts introduces several key challenges: (1) Computational efficiency: transformers, the predominant architecture for foundation models, incur quadratic computational complexity with respect to input length. (2) Lack of data: developing long-context foundation models requires access to large amounts of long-sequence data, a requirement that is difficult to satisfy given the limited availability of such collections. (3) Evaluation complexity: evaluating the performance of long-context foundation models is inherently difficult, as it is costly for humans to collect, construct, or verify such evaluation data. Our workshop aims to convene researchers to address these challenges, fostering discussion, development, and evaluation of long-context foundation models across various AI disciplines.
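The quadratic cost in challenge (1) comes directly from the attention computation: every query position attends to every key position, so the intermediate score matrix has n × n entries for a context of length n. A minimal NumPy sketch with illustrative shapes (not tied to any particular model's implementation):

```python
import numpy as np

def naive_attention(q, k, v):
    """Single-head scaled dot-product attention.
    The intermediate (n, n) score matrix is the quadratic term:
    its size and cost grow with the square of the context length n."""
    scores = q @ k.T / np.sqrt(q.shape[-1])           # shape (n, n)
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # softmax over keys
    return weights @ v                                # shape (n, d)

n, d = 1024, 64
rng = np.random.default_rng(0)
q, k, v = (rng.standard_normal((n, d)) for _ in range(3))
out = naive_attention(q, k, v)
# Doubling n quadruples the score matrix (2048*2048 vs. 1024*1024 entries),
# which is why long-context work targets sub-quadratic alternatives.
```

This quadratic blow-up in both compute and memory is what motivates the efficient-attention and state-space approaches discussed at the workshop.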
Schedule
Fri 12:00 a.m. - 12:15 a.m. | Intro to Workshop (Intro)
Fri 12:15 a.m. - 12:45 a.m. | Invited Talk 1 - Mohit Iyyer (Invited Talk)
Fri 12:45 a.m. - 1:00 a.m. | Oral 1: Mitigate Position Bias in Large Language Models via Scaling a Single Dimension (Oral)
Fri 1:00 a.m. - 1:30 a.m. | Invited Talk 2 - Beidi Chen (Invited Talk)
Fri 1:30 a.m. - 1:45 a.m. | Break for Networking and Poster Setup
Fri 1:45 a.m. - 2:30 a.m. | Poster Session 1 (Poster Session)
Fri 2:30 a.m. - 3:00 a.m. | Invited Talk 3 - Albert Gu (Invited Talk)
Fri 3:00 a.m. - 3:15 a.m. | Oral 2: ZigMa: A DiT-style Zigzag Mamba Diffusion Model (Oral)
Fri 3:15 a.m. - 3:30 a.m. | Oral 3: Improved Algorithms for Kernel Matrix-Vector Multiplication (Oral)
Fri 3:30 a.m. - 5:00 a.m. | Lunch Break
Fri 5:00 a.m. - 5:30 a.m. | Panel Discussion (Panel Discussion)
Fri 5:30 a.m. - 5:45 a.m. | Oral 4: InfLLM: Training-Free Long-Context Extrapolation for LLMs with an Efficient Context Memory (Oral)
Fri 5:45 a.m. - 6:00 a.m. | Oral 5: Many-Shot In-Context Learning (Oral)
Fri 6:00 a.m. - 7:00 a.m. | Poster Session 2 (Poster Session)
Fri 7:00 a.m. - 7:30 a.m. | Invited Talk 4 - Stella Biderman (Invited Talk)
Fri 7:30 a.m. - 7:45 a.m. | Closing Remarks and Best Paper Award
Beijing-time Virtual Posters (1:30PM-2:30PM in Beijing, 7:30AM-8:30AM in Vienna) (Virtual Posters)
US-time Virtual Posters (9AM-10AM in San Francisco, 6PM-7PM in Vienna) (Virtual Posters)