Automated Theory Construction
Automated Theory Construction (ATC) is a Lean 4 workflow for building verified theory from a small axiom base. Instead of aiming at one hand-picked theorem, the system generates candidate statements, formalizes them, verifies them in Lean, and accumulates successful results into a growing derived theory.

Core Idea
Do not aim directly at the final theorem. Generate the surrounding structure until the theorem becomes inevitable.
The main loop works like this:
- Start from a base theory in
AutomatedTheoryConstruction/Theory.lean. - Generate local candidate statements from the current theory state.
- Attempt formalization and proof in Lean.
- Append verified results to
AutomatedTheoryConstruction/Derived.lean. - Recycle failed attempts into refined follow-up problems.
This is theory construction rather than ordinary proof search: the system expands the space of statements as it works.
Example Artifact
A concrete generated Lean artifact is available here:
This gist shows the kind of output ATC accumulates: a large Lean file of generated and verified statements over the Lambek-calculus-based theory used in this repository.
Quick Start
The recommended end-to-end path is:
- Put your deep-research report under
materials/. Gemini Deep Research is the recommended default for this step. - Build the materials cache.
- Regenerate
AutomatedTheoryConstruction/research_agenda.mdfrom that report. - Run the main seed -> loop -> refactor pipeline.
Prerequisites:
- Lean toolchain from
lean-toolchain - Lake + Mathlib dependencies
- Python
uv
Run:
make build
make materials-cache
make research-agenda REPORT_FILE=materials/your_report.md
make seed-loop-refactor-derived
This builds the project, refreshes data/materials_cache, writes AutomatedTheoryConstruction/research_agenda.md, and runs the main loop plus whole-file refactor path on Derived.lean.
For subsequent iterations after the first make seed-loop-refactor-derived, prefer make loop-continue-refactor-derived so the loop/refactor cycle continues from the current runtime state instead of resetting it.
If you want to continue only the loop without the refactor stages, use make loop-continue.
If you only want to refresh derived materials/ artifacts without fetch/extract, use make materials-derive.
If you want the fastest smoke path without Codex CLI, you can still run:
uv run python scripts/atc_cli.py loop \
--worker-command "uv run scripts/mock_worker.py" \
--max-iterations 1
Documentation
Start with the doc hub: docs/README.md
| If you want to... | Read |
|---|---|
| Set up the repo and do a first run | docs/GETTING_STARTED.md |
| Run the loop day to day | docs/USER_GUIDE.md |
| Know what files are safe to edit | docs/REPO_MAP.md |
| Swap the Lean verification backend | docs/PROOF_EXECUTOR.md |
| See implementation-oriented runtime notes | design/RUNTIME.md |
Repository Shape
AutomatedTheoryConstruction/Theory.lean: entry point for the active base theoryAutomatedTheoryConstruction/Theory/*.lean: optional local theory modulesAutomatedTheoryConstruction/Derived.lean: accumulated verified theoremsAutomatedTheoryConstruction/Scratch.lean: temporary verification targetAutomatedTheoryConstruction/research_agenda.md: persistent guidance for problem selectionmaterials/: recommended place to keep organized deep-research outputs, literature summaries, source-link lists, and problem-seed notes used as optional external contextprompts/research_agenda/: templates for turning deep-research reports into strictresearch_agenda.mddraftsscripts/atc_cli.py: unified operational CLI
materials/ is the recommended home for deep research that you want the system to reuse later.
Treat it as external research context, not as part of the core runtime state: the loop may consult it for seed generation, prioritization, and expansion, but it should not be folded into theory_state.json.
Also treat summary reports in materials/ as potentially time-sensitive: they are useful for context, but source-link lists or primary papers should win when novelty or closest-known-result judgment matters.
To regenerate AutomatedTheoryConstruction/research_agenda.md from a deep-research report, use:
make research-agenda REPORT_FILE=materials/your_report.md
Refactor Pipeline
The post-loop refactor path is intentionally staged:
- Alpha-equivalent theorem dedupe on the preview copy
- Whole-file rewrite cleanup (
rewrite, pass 1.5) - Whole-file review-focused cleanup (
review, pass 2.0) - Copy the reviewed result back into
Derived.lean - Recheck the main Lean targets against the updated derived theory
This keeps global cleanup separate from proof search while still ending on a verified Derived.lean.
License
This repository is licensed under the MIT License. See LICENSE.
Acknowledgements
The prompting strategy for solving Lean problems was partially inspired by a private repository, kmd710/lean4-codex-skills.
This repository also includes material adapted from: