INDUCTION: Finite-Structure Concept Synthesis in First-Order Logic
Serafim Batzoglou
Abstract
Large language and reasoning models can be prompted to generate well-formed first-order formulas, but we still lack evaluations of their ability to produce correct, compact explanations under fully specified, mechanically checkable semantics. We study finite-structure concept synthesis: given several small finite relational worlds that are labeled extensionally with a unary target predicate $T(x)$, the learner must output a single first-order formula $\varphi(x)$ that recovers (explains) $T$ uniformly across worlds. Because the domains are finite, correctness is solver-verifiable via exact model checking and SMT. We introduce INDUCTION, a benchmark suite that (to our knowledge) provides the first challenging, end-to-end evaluation of first-order definition synthesis from extensional relational evidence in a fully abstract interface. INDUCTION includes three tightly related regimes---FullObs (full observation), CI (contrastive Yes/No worlds), and EC (partial observation under existential completion)---and reports gold-relative, budgeted metrics that penalize formula bloat. Across tasks we observe sharp difficulty gradients and persistent hard structural families; moreover, held-out world evaluation shows that among training-correct solutions, low-bloat formulas generalize far better than highly bloated ones, motivating bloat-aware scoring as a first-class metric for symbolic induction.