Oral
Thu Jul 17 03:30 PM -- 03:45 PM (PDT) @ West Ballroom C
On Path to Multimodal Generalist: General-Level and General-Bench
[OpenReview]
Oral
Thu Jul 17 03:45 PM -- 04:00 PM (PDT) @ West Ballroom C
What Limits Virtual Agent Application? OmniBench: A Scalable Multi-Dimensional Benchmark for Essential Virtual Agent Capabilities
[OpenReview]
Oral
Thu Jul 17 04:00 PM -- 04:15 PM (PDT) @ West Ballroom C
How Do Large Language Monkeys Get Their Power (Laws)?
[OpenReview]
Oral
Thu Jul 17 04:15 PM -- 04:30 PM (PDT) @ West Ballroom C
Suitability Filter: A Statistical Framework for Classifier Evaluation in Real-World Deployment Settings
[OpenReview]