Learning Formal Mathematics From Intrinsic Motivation
Date and time: 27 August 2024, 10:00-10:50 CEST (UTC +2)
Speaker: Gabriel Poesia, Stanford University
Title: Learning Formal Mathematics From Intrinsic Motivation
Where: Digital Futures hub, Osquars Backe 5, floor 2 at KTH main campus OR Zoom
Directions: https://www.digitalfutures.kth.se/contact/how-to-get-here/
OR
Zoom: https://kth-se.zoom.us/j/61232476718
Host: David Broman
Abstract: If AI is to help with new mathematics, it will need to reason in mathematical domains where no human training data exists yet. In this talk, we describe our recent work exploring the Platonic view that mathematics can be discovered from its axioms – a game of conjecture and proof. We present Minimo (Mathematics from Intrinsic Motivation): an agent that jointly learns to pose challenging problems for itself (conjecturing) and solve them (theorem proving). Given a mathematical domain axiomatized in dependent type theory, we first combine methods for constrained decoding and type-directed synthesis to sample valid conjectures from a language model.
Our method guarantees well-formed conjectures by construction, even as we start with a randomly initialized model. We use the same model to represent a policy and value function for guiding proof searches. Our agent targets generating hard but provable conjectures – a moving target since its own theorem-proving ability also improves as it trains. We propose novel methods for hindsight relabeling on proof search trees to improve the agent’s sample efficiency in both tasks significantly. Experiments on 3 axiomatic domains (propositional logic, arithmetic and group theory) demonstrate that our agent can bootstrap from only the axioms, self-improving in generating true and challenging conjectures and in finding proofs. Preprint: https://arxiv.org/abs/2407.00695
Bio: Gabriel Poesia is a PhD student at Stanford University, working under Noah Goodman. His work focuses on the intersection of Reinforcement Learning and formal mathematical reasoning, including theorem proving, conjecturing and learning abstractions.