LeanGeo (paper to be published)

LeanGeo is a manually formalized system of plane geometry theorems and their proofs in the Lean 4 proof assistant. It's built upon the axiomatic framework of LeanEuclid, inheriting its foundational geometric objects, relations, and axioms. LeanGeo comprises 260 theorems organized into five major classes: Basic, Metrics, Triangle, Circle, and Quadrilateral, each further refined into numerous subclasses. Additionally, LeanGeo leverages LeanSMT at its core and can support Lean version 4.15.

LeanGeo-Bench

LeanGeo-Bench is the first benchmark tailored for formalizing and proving contest-level plane geometry theorems. As shown in the following table, the benchmark consists of 122 problems drawn from diverse sources, including existing theorem libraries, textbooks, synthetically generated problems, contest problems and all geometry problems from the official IMO problems since 2000.

PARTNUMBERSOURCEMETHOD
UniGeo10LeanEuclidManually Written
Library10LeanGeo LibraryManually Written
Gemini_synthetic20LeanGeo LibraryGenerated by gemini
High_school_level20NuminaMathAutoformalized + double check
Evans_Chen19Evan's BookAutoformalized + double check
IMO_00+43IMOAutoformalized + double check
  1. Requirements
  2. Building
  3. Evaluating on LeanGeo-Bench
  4. Experiments
  5. Acknowledgements
  6. Contributions

Requirements

It is recommended that you run this repo on linux (if you are on windows you can use wsl).

You can use the following command to deploy LeanGeo.

bash setup.sh

Option 2 (Manually Install)

First, you should make sure you have Lean4 with version 4.15.

You will need to install the following packages:

sudo apt update
sudo apt install -y clang libc++-dev

Additionally, your C and CXX compilers both need to be set to clang. Which can be done using the following command

export CC=clang
export CXX=clang++

Building

Take the following steps to build the Lean project:

lake run cvc5/downloadRelease
lake exe cache get
lake build cvc5 SystemE mathlib LeanGeo

Evaluating on LeanGeo-Bench

Evaluation Framwork

We use the evaluation method provided by CombiBench.

Setup Environment

You need to follow these steps to quickly install the required packages.

First, install the lean server on a device with sufficient CPU and memory (Recommended: 64G).

git clone https://github.com/project-numina/kimina-lean-server.git
cd kimina-lean-server
git checkout leangeo

Second, download Combibench on any device.

git clone https://github.com/MoonshotAI/CombiBench.git

Setup a Lean Server

Follow kimina-lean-server to setup a lean server and get a custom url and API_KEY for it.

You can use the following instructions to build a lean server quickly.

cp .env.template .env
docker compose up -d

Check if the lean server is running correctly.

curl --request POST \
  --url http://localhost/verify \
  --header 'Content-Type: application/json' \
  --data '{
    "codes": [
	  {
		"custom_id": "1234",
		"proof": "#check Nat"
	  }
    ],
    "infotree_type": "original"
}' | jq

To shut down the container / view logs:

docker compose down
docker compose logs -f

The default port number of Lean Server is 80. You can also change the port number of lean server in the compose.yaml file to align with the port number required by CombiBench/evaluation.

Setup a LLM API Key

We support API interfaces such as OpenAI, Antropic, TogetherAI, and Google GenerativeAI.

You need to configure your llm client url and api key in geometry_template.json5 and copy the eval_config directory to here, the default config is the url of Kimi K2, you can access Kimi K2's API on https://platform.moonshot.ai , we provide OpenAI/Anthropic-compatible API for you.

Configuration

Refer to geometry_template.json5 to configure the dataset, lean server, llm server, generation parameters, prompt, and parallel parameters.

You can also add custom prompts under the CombiBench/evaluation/config/extra_prompt and configure the path in the prompt_template argument within the config file.

Run Evaluation

To run evaluation on LeanGeo-Bench:

cd CombiBench
python evaluation/cli.py online-one-stage -c evaluation/config/geometry_template.json5

Acknowledgements

  • Our SystemE implementation is heavily inspired by LeanEuclid.
  • However, unlike LeanEuclid we use LeanSMT's builtin translator, caching the SystemE axioms for speed purpose.

Contributions

We would like this to be a community driven repo. Specifically the following areas require development

  1. Expanding the theorem library.
  2. Getting proof reconstruction to work for the esmt tactic.
  3. Improve axiom caching: currently the @[euclid] attribute requires that axioms imports are chained linearly (eg superposition imports metric which imports diagrammatic). If someone could get the caching to work using something like DiscrTree that would be greatly appreciated.
  4. Mathlib comptability: in the to415_mathlib branch we define all SystemE primitives and axioms in terms of complex numbers. Sorry filling for SystemE axioms would be greatly appreciated. The smt solver interface is also broken on that branch so if any could implement monomorphization that would be great too.