Workshop
AI for Math Workshop
Yinya Huang · Xiaodan Liang · Zhengying Liu · Pan Lu · Sean Welleck · Isabelle Guyon · Amaury Hayat · Bin Dong · Mateja Jamnik · Guangrun Wang · Zhenguo Li · Linqi Song · Wei Shi · Xiongwei Han
Lehar 1
Fri 26 Jul, midnight PDT
Mathematical reasoning is one of the most advanced forms of human intelligence. Humans develop formal languages for rigorously describing mathematical problems and deriving mathematical knowledge. The machine learning community has endeavored to develop neural models with mathematical reasoning capabilities as humans. On the other hand, a shared vision in the community is that the models collaborate with humans for mathematical discoveries. The goal of this workshop is to bring together researchers working on various domains to discuss the progress and the future of applying AI technologies to mathematics. As mathematics is fundamental for almost all modern sciences (including computer science), a vast range of related topics are also within our scope. To this end, this workshop focuses on several crucial yet underexplored problems. Specifically, we are expecting attendants from various backgrounds, institutions, and disciplines to discuss areas related to the following: * Autoformalization and the reversed auto-informalization: How can we develop methods that improve the precision of the autoformalization process from natural language proof to formal proof, and as a dual process describing a formal proof in natural language?* Automated theorem proving: How do build consistent theorem proving? How do we relieve or solve the intermediate step errors in proving? * Automated theorem generation: Can neural models generate new and practically valid theorems? How do we take full advantage of such generated new theorems? * Code augmentation and auxiliary for mathematical reasoning: How can the handy and plentiful code data facilitate the models to conduct mathematical reasoning? * Formal verification and code generation: How can progress made in AI for Math help or be directly deployed to the field of formal verification? What are the common technical difficulties? How can AI systems be able to write provably correct code, given any (formal) specifications?In addition to the problem areas above, we also welcome research work related to the following topics:* Measurement: How do we measure autoformalization?* Reasoning in related areas: program synthesis, software verification, neurosymbolic reasoning, logical reasoning.* Applications: Applying mathematical reasoning techniques to sciences, finance, education, etc. Our workshop also includes an innovative challenge with three tracks for related mathematical reasoning problems:* Challenge/Track 1: Autoformalizaiton.* Challenge/Track 2: Automated theorem generation and proving.* Challenge/Track 3: Automated optimization problem-solving with code.
Schedule
Fri 12:00 a.m. - 12:05 a.m.
|
Opening Remarks
(
Opening Remarks
)
>
SlidesLive Video |
🔗 |
Fri 12:05 a.m. - 12:35 a.m.
|
Invited Talk - Mathematical Discoveries from program Search with Large Language Models
(
Invited Talk - M. Pawan Kumar
)
>
SlidesLive Video |
🔗 |
Fri 12:35 a.m. - 1:05 a.m.
|
Invited Talk - Language Model Agents for Formal Theorem-Proving
(
Invited Talk - Swarat Chaudhuri
)
>
SlidesLive Video |
🔗 |
Fri 1:05 a.m. - 1:35 a.m.
|
Invited Talk - From Pattern Matching and Beyond in Autoformalisation
(
Invited Talk - Wenda Li
)
>
SlidesLive Video |
🔗 |
Fri 1:35 a.m. - 2:00 a.m.
|
Coffee Break
|
🔗 |
Fri 2:00 a.m. - 2:30 a.m.
|
Invited Talk - Causal Representation Learning: Uncovering the Hidden World
(
Invited Talk - Kun Zhang
)
>
SlidesLive Video |
🔗 |
Fri 2:30 a.m. - 3:00 a.m.
|
Invited Talk - Hammers and Transformers
(
Invited Talk - Piotr Miłoś
)
>
SlidesLive Video |
🔗 |
Fri 3:00 a.m. - 3:30 a.m.
|
Invited Talk - Synthetic Data in Mathematical Reasoning
(
Invited Talk - Albert Q. Jiang
)
>
SlidesLive Video |
🔗 |
Fri 3:30 a.m. - 5:00 a.m.
|
Lunch Break
|
🔗 |
Fri 5:00 a.m. - 5:07 a.m.
|
Contributed Talk: Challenge Track 1-1
(
Contributed Talk: Challenge Track 1-1
)
>
SlidesLive Video |
Zhiyuan Ma 🔗 |
Fri 5:07 a.m. - 5:14 a.m.
|
Contributed Talk: Challenge Track 1-2
(
Contributed Talk: Challenge Track 1-2
)
>
|
Zhiyuan Ma 🔗 |
Fri 5:14 a.m. - 5:21 a.m.
|
Contributed Talk: Challenge Track 2
(
Contributed Talk: Challenge Track 2
)
>
SlidesLive Video |
🔗 |
Fri 5:21 a.m. - 5:28 a.m.
|
Contributed Talk: Challenge Track 3
(
Contributed Talk: Challenge Track 3
)
>
SlidesLive Video |
🔗 |
Fri 5:30 a.m. - 5:37 a.m.
|
PutnamBench: A Multilingual Competition-Mathematics Benchmark for Formal Theorem-Proving
(
Oral Paper Presentation
)
>
link
SlidesLive Video |
George Tsoukalas · Jasper Lee · John Jennings · Jimmy Xin · Michelle Ding · Michael Jennings · Amitayush Thakur · Swarat Chaudhuri 🔗 |
Fri 5:37 a.m. - 5:44 a.m.
|
Progress or Regress? Self-Improvement Reversal in Post-training
(
Oral Paper Presentation
)
>
link
SlidesLive Video |
Ting Wu · Xuefeng Li · Pengfei Liu 🔗 |
Fri 5:45 a.m. - 6:15 a.m.
|
Invited Talk - How and Where Do LLMs Fail on Math? A Multi-Faceted Evaluation on LLM Math Reasoning Skills
(
Invited Talk - Yilun Zhou
)
>
SlidesLive Video |
🔗 |
Fri 6:15 a.m. - 6:45 a.m.
|
Invited Talk - Neural Theorem Proving in Lean 4
(
Invited Talk - Anima Anandkumar
)
>
SlidesLive Video |
🔗 |
Fri 6:45 a.m. - 7:25 a.m.
|
Panel Discussion
(
Panel Discussion
)
>
SlidesLive Video |
🔗 |
Fri 7:25 a.m. - 7:55 a.m.
|
Poster Session
(
Poster Session
)
>
|
🔗 |
Fri 7:55 a.m. - 8:00 a.m.
|
Closing Remarks
(
Closing Remarks
)
>
SlidesLive Video |
🔗 |
-
|
PutnamBench: A Multilingual Competition-Mathematics Benchmark for Formal Theorem-Proving ( Poster ) > link | George Tsoukalas · Jasper Lee · John Jennings · Jimmy Xin · Michelle Ding · Michael Jennings · Amitayush Thakur · Swarat Chaudhuri 🔗 |
-
|
Learning to Reason by Failing: Offline RL on Sub-optimal Rollouts Scales Synthetic Data by 8x ( Poster ) > link | Amrith Setlur · Saurabh Garg · Xinyang Geng · Naman Garg · Virginia Smith · Aviral Kumar 🔗 |
-
|
Lean4trace: Data augmentation for neural theorem proving in Lean ( Poster ) > link | Vasilii Nesterov · Yermek Kapushev · Mikhail Burtsev 🔗 |
-
|
Efficient Linear System Solver with Transformers ( Poster ) > link | Max Vladymyrov · Johannes Von Oswald · Nolan Miller · Mark Sandler 🔗 |
-
|
Large Language Models Can Self-Correct with Minimal Effort ( Poster ) > link | Zhenyu Wu · Qingkai Zeng · Zhihan Zhang · Zhaoxuan Tan · Chao Shen · Meng Jiang 🔗 |
-
|
Teaching Large Language Models to Reason with Reinforcement Learning ( Poster ) > link | Alexander Havrilla · Yuqing Du · Sharath Chandra Raparthy · Christoforos Nalmpantis · Jane Dwivedi-Yu · Eric Hambro · Sainbayar Sukhbaatar · Roberta Raileanu 🔗 |
-
|
AI-Assisted Generation of Difficult Math Questions ( Poster ) > link | Vedant Shah · Anirudh Goyal · Dingli Yu · Kaifeng Lyu · Simon Park · Rosemary Nan Ke · James McClelland · Yoshua Bengio · Sanjeev Arora · Michael Mozer 🔗 |
-
|
AI for an inverse problem: Physical model solving quantum gravity ( Poster ) > link | Koji Hashimoto · Koshiro Matsuo · Masaki Murata · Gakuto Ogiwara · Daichi Takeda 🔗 |
-
|
Specify What? A Case-Study using GPT-4 and Formal Methods For Specification Synthesis ( Poster ) > link | George Granberry · Wolfgang Ahrendt · Moa Johansson 🔗 |
-
|
DACO: Towards Application-Driven and Comprehensive Data Analysis via Code Generation ( Poster ) > link | Xueqing Wu · Rui Zheng · Jingzhen Sha · Te-Lin Wu · Hanyu Zhou · Mohan Tang · Kai-Wei Chang · Nanyun Peng · Haoran Huang 🔗 |
-
|
Distilling LLMs’ Decomposition Abilities into Compact Language Models ( Poster ) > link | Denis Tarasov · Kumar Shridhar 🔗 |
-
|
Progressive-Hint Prompting Improves Reasoning in Large Language Models ( Poster ) > link | Chuanyang Zheng · Zhengying Liu · Enze Xie · Zhenguo Li · Yu Li 🔗 |
-
|
VerityMath: Advancing Mathematical Reasoning by Self-Verification Through Unit Consistency ( Poster ) > link | Vernon Yan Han Toh · Ratish Puduppully · Nancy Chen 🔗 |
-
|
Smart Vision-Language Reasoners ( Poster ) > link | Denisa Roberts · Lucas Roberts 🔗 |
-
|
Progress or Regress? Self-Improvement Reversal in Post-training ( Poster ) > link | Ting Wu · Xuefeng Li · Pengfei Liu 🔗 |
-
|
Pre-Calc: Learning to Use the Calculator Improves Numeracy in Language Models ( Poster ) > link | Vishruth Veerendranath · Vishwa Shah · Kshitish Ghate 🔗 |
-
|
Learning Efficient Recursive Numeral Systems via Reinforcement Learning ( Poster ) > link | Jonathan Thomas · Andrea Silvi · Devdatt Dubhashi · Emil Carlsson · Moa Johansson 🔗 |
-
|
More Details, Please: Improving Autoformalization with More Detailed Proofs ( Poster ) > link | Guillem Tarrach · Albert Jiang · Daniel Raggi · Wenda Li · Mateja Jamnik 🔗 |
-
|
Technical Report for ICML 2024 Automated Math Reasoning Challenge: Solving Optimization Problems with Open Source Large Language Model ( Poster ) > link | Duc M. Nguyen · Sungahn Ko 🔗 |
-
|
Advancing LLM Reasoning Generalists with Preference Trees ( Poster ) > link |
15 presentersLifan Yuan · Ganqu Cui · Hanbin Wang · Ning Ding · Xingyao Wang · Jia Deng · Boji Shan · Huimin Chen · Ruobing Xie · Yankai Lin · Zhenghao Liu · Bowen Zhou · Hao Peng · Zhiyuan Liu · Maosong Sun |
-
|
GeomVerse: A Systematic Evaluation of Large Models for Geometric Reasoning ( Poster ) > link | Mehran Kazemi · Hamidreza Alvari · Ankit Anand · Jialin Wu · Xi Chen · Radu Soricut 🔗 |
-
|
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving ( Poster ) > link | Aniket Didolkar · Anirudh Goyal · Rosemary Nan Ke · Siyuan Guo · Michal Valko · Timothy Lillicrap · Danilo J. Rezende · Yoshua Bengio · Michael Mozer · Sanjeev Arora 🔗 |