## Autoformalizing Euclidean Geometry

Code for the paper:

Autoformalizing Euclidean Geometry

International Conference on Machine Learning (ICML), 2024

Logan Murphy*, Kaiyu Yang* (* equal contribution), Jialiang Sun, Zhaoyu Li, Anima Anandkumar, and Xujie Si

```
@inproceedings{murphy2024leaneuclid,
title={Autoformalizing {Euclidean} Geometry},
author={Murphy, Logan and Yang, Kaiyu and Sun, Jialiang and Li, Zhaoyu and Anandkumar, Anima and Si, Xujie},
booktitle={International Conference on Machine Learning (ICML)},
year={2024}
}
```

### Quick Links

- Requirements
- Building
- The Formal System E
- LeanEuclid
- Evaluating Autoformalized Theorem Statements
- Experiments
- Acknowledgements

### Requirements

- A working setup of Lean 4, including elan and Lean's VSCode extension
- Install the latest version of Z3 and CVC5 and make sure they can be accessed from the command line
- Install Python dependencies:
`pip install smt-portfolio openai`

- Find out the location of
`smt-portfolio`

and make sure Lean's VSCode extension can also access it. For example, if`which smt-portfolio`

outputs`/Users/yangky/miniconda3/envs/lean/bin/smt-portfolio`

, you should set`Server Env Paths`

in Lean's VSCode extension as below:

### Building

Take the following steps to build the Lean project:

- Run
`lake script run check`

to check if the requirements are satisfied. - Run
`lake exe cache get`

to download the mathlib cache - Run
`lake build`

to compile the formal system E - Open a file for Euclid's
*Elements*in VS Code, e.g., Book/Prop01.lean. You should expect to see:

Optionally, you can:

- Run
`lake -R -Kenv=dev build SystemE:docs`

to build the documentation - View the auto-generated documentation locally at ./lake/build/doc, e.g., using
`python -mhttp.server`

If you have problems building the project, Dockerfile and scripts/build.sh may serve as useful references.

### The Formal System E

E is a formal system introduced by Avigad et al., 2009 for faithfully formalizing the proofs in Euclid’s *Elements*, including the use of diagrammatic reasoning. It defines a language for expressing basic objects in 2D Euclidean geometry (points, lines, circles, etc.), as well as the form of theorems and proofs. We implement the formal system E in Lean and use SMT solvers to perform diagrammatic reasoning automatically and implicitly. Details of our implementation can be found here and in the auto-generated documentation.

### LeanEuclid

LeanEuclid is a benchmark for testing autoformalization. It consists of 173 Euclidean geometry problems manually formalized into Lean. The theorems and proofs are in the format prescribed by the formal system E. Among the 173 problems, 48 are from Euclid's Elements, and 125 are adapted from the UniGeo dataset (Chen et al., 2022).

#### Euclid's *Elements*

We manually formalized the first book of Euclid's *Elements*, using an open-source LaTex version. The formalized theorems and proofs can be found here. Thanks to E and automatic diagrammatic reasoning, our formalized proofs are "faithful" in the sense that they correspond very closely to Euclid's original proofs. To our knowledge, this is the first time Euclid's *Elements* has been faithfully formalized in a proof assistant such as Lean, and our formalization has uncovered errors in Euclid's proofs (Appendix B of our paper).

For example, below is our formalization of Euclid's first proposition in Book I, which states you can construct an equilateral triangle given two distinct points:

Our formalized theorem and proof in Book/Prop01.lean:

```
theorem proposition_1 :
∀ (a b : Point) (AB : Line),
distinctPointsOnLine a b AB →
∃ c : Point, |(c─a)| = |(a─b)| ∧ |(c─b)| = |(a─b)| := by
euclid_intros
euclid_apply circle_from_points a b as BCD
euclid_apply circle_from_points b a as ACE
euclid_apply intersection_circles BCD ACE as c
euclid_apply point_on_circle_onlyif a b c BCD
euclid_apply point_on_circle_onlyif b a c ACE
use c
euclid_finish
```

#### UniGeo

Problems from UniGeo are generally easier than those from *Elements*. Unlike *Elements*, UniGeo's problem text does not include complete information about the problem, so we manually provide missing diagrammatic details to the text. For example:

Diagram in UniGeo/Congruent/diagrams/1.png:

UniGeo's theorem statement in UniGeo/Congruent/texts/1.txt.

```
Given T U ≅ R S. R S ∥ T U. Complete the proof that △ R T U ≅ △ T R S.
```

UniGeo's proof in UniGeo/Congruent/proofs/1.txt:

```
RS ∥ TU
TU ≅ RS
∠RTU ≅ ∠SRT
RT ≅ RT
△RTU ≅ △TRS
```

Diagrammatic details we provide:

```
There is a quadrilateral RSUT with vertices R, S, U, and T. There is a diagonal line segment RT drawn inside the quadrilateral, connecting vertices R and T.
```

Our formalized theorem and proof in UniGeo/Congruent/Thm01.lean:

```
theorem theorem_1 : ∀ (R S T U : Point) (RS ST RT TU RU : Line),
formTriangle R S T RS ST RT ∧
formTriangle R T U RT TU RU ∧
S.opposingSides U RT ∧
|(T─U)| = |(R─S)| ∧
¬ RS.intersectsLine TU →
(△ R:T:U).congruent (△ T:R:S) :=
by
euclid_intros
euclid_apply Elements.Book1.proposition_29''' S U R T RS TU RT
euclid_finish
```

### Evaluating Autoformalized Theorem Statements

In addition to proof automation, our symbolic reasoning engine can be used to check the equivalence of autoformalized theorem statements against ground-truth theorem statements. We have built a wrapper `E3`

which performs two forms of equivalence checking:

- Standard equivalence checking, i.e., check if the two theorem statements are logically equivalent or not. It may also be that one is strictly stronger than the other.
*Approximate*equivalence checking, where we try to quantify how close the two statements are to one another.

Typically, the latter is only done if the statements are not proven equivalent, but you'd like to see how "close" the prediction was to the ground truth. More details on `E3`

can be found here.

### Experiments

In our paper, we used LeanEuclid to test state-of-the-art LLMs on theorem statements and proof autoformalization. We provide a Python wrapper which can be used to replicate our experiment procedure. More details and usage examples can be found here.

### Acknowledgements

- We use (our own fork of) lean-smt for running SMT solvers from Lean.
- There are concurrent efforts on formalizing Euclidean geometry in Lean (e.g., EG and Hernandez-Espiet's work). To the best of our knowledge, LeanEuclid is unique in its use of external solvers to handle diagrammatic reasoning, which is typically performed explicitly. However, this comes at the expense of breaking soundness.