LeanAide
LeanAide, or LeanAIde (an accidental pun), is work in progress to build AI-based tools to help development with the Lean Theorem Prover. The core technique we use is autoformalization: the process of translating informal mathematical statements into formal statements in a proof assistant. We use this both directly to provide tools and indirectly to generate proofs by translating natural-language proofs produced by Large Language Models.
We now (as of early February 2025) have a convenient way to use LeanAide in your own projects, using a server-client setup which we outline below.
Example usage
The most convenient way to use LeanAide is through the syntax we provide, which gives code actions. We have syntax for translating theorems and definitions from natural language to Lean, and for adding documentation strings to theorems and definitions. For example, the following code in a Lean file (with the correct dependencies) will give code actions:
import LeanAide
import Mathlib
#theorem "There are infinitely many odd numbers"
#def "A number is defined to be cube-free if it is not divisible by the cube of a prime number"
#doc
theorem InfiniteOddNumbers : {n | Odd n}.Infinite := by sorry
We also provide syntax for generating/completing proofs. For now, this is slow and the results are not of high quality; experiments and feedback are welcome. The first of the examples below uses a command and the second uses a tactic.
#prove "The product of two successive natural numbers is even"
theorem eg₁ (n: Nat) : 2 ∣ n * (n + 1) := by
#prove
Setting up LeanAide
To either experiment directly with LeanAide, or use it via the server-client setup as a dependency in your own project, you first need to install LeanAide.
First clone the repository. Next, from the root of the repository, run the following commands to build and fetch pre-loaded embeddings:
lake exe cache get # download prebuilt mathlib binaries
lake build mathlib # build anything missing from the cache
lake build # build LeanAide itself
lake exe fetch_embeddings # download pre-loaded embeddings
Our translation is based on GPT-4o from OpenAI, but you can use alternative models (including local ones). Note that we also use embeddings from OpenAI, so you will need an OpenAI API key unless you set up an examples server as described below. Here, we assume you have an OpenAI API key.
To get started, please configure environment variables using the following bash commands (or the equivalent on your system) at the base of the repository (the angle brackets are not to be included in the command), and then launch VS Code.
export OPENAI_API_KEY=<your-open-ai-key>
code .
Alternatively, you can create a file at the path private/OPENAI_API_KEY (relative to the root of LeanAide) containing your key.
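For example, the following shell commands, run from the root of the repository, create such a file (as before, the angle brackets are not to be included):
mkdir -p private
echo "<your-open-ai-key>" > private/OPENAI_API_KEY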
After this, open the folder in VS Code (or equivalent) with Lean 4 and go to the file LeanCodePrompts/TranslateDemo.lean. Any statement written using syntax similar to #theorem "There are infinitely many primes" will be translated into Lean code. You will see examples of this in the demo files. Once the translation is done, a Try this hyperlink and code action will appear. Clicking on this will add the translated code to the file.
Alternatively, you can translate a statement using the following command from a terminal:
lake exe translate "There are infinitely many primes"
Using as a dependency: server-client setup
Adding LeanAide directly as a dependency to your project is brittle and slow. In particular, toolchains need to match exactly, and there may be collisions between transitive dependencies.
We provide a server-client setup that is more robust and faster, based on the zero-dependency project LeanAideTools. This acts as a client to the server in this repository.
The simplest (and we expect the most common) way to set up LeanAide to use in your project is to run a server locally. To do this, first set up LeanAide as described above. Then run the following command from the root of the repository:
python3 leanaide_server.py
We assume you have Python 3 installed; only built-in packages are used. If you wish to use a different model supporting the OpenAI API, you can pass parameters as in the following example, where we use the Mistral API. To use a local model, you can serve the OpenAI Chat API using, for example, vLLM, and give a local URL.
python3 leanaide_server.py --url https://api.mistral.ai/v1/chat/completions --auth_key <Mistral API key> --model "mistral-small-latest"
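For a local model, the command would look something like the following sketch, where we assume a vLLM server is running on its default port 8000 and local-model is a placeholder for the model name:
python3 leanaide_server.py --url http://localhost:8000/v1/chat/completions --model "local-model"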
Next, add LeanAideTools as a dependency to your project. This can be done by adding the following lines to your lakefile.toml file:
[[require]]
name = "LeanAideTools"
git = "https://github.com/siddhartha-gadgil/LeanAideTools.git"
rev = "main"
If you use a lakefile.lean file instead, you can add the following line:
require LeanAideTools from git "https://github.com/siddhartha-gadgil/LeanAideTools"@"main"
Now you can use the LeanAideTools package in your project. The following is an example of how to use it:
import LeanAideTools
#theorem "There are infinitely many primes"
Hosting a server
The server is by default launched at http://localhost:7654. This can be customized by setting the variables HOST and LEANAIDE_PORT. To use a shared server, simply set HOST to the IP address of the hosting machine. In the client, i.e., the project with LeanAideTools as a dependency, add the configuration:
set_option leanaide.url "<host's IP address>:7654"
to use the host. Further, unless you want all clients to use the host's authentication for OpenAI (or whichever model is used), the host should not specify an authentication key (better still, start the server with --auth_key "A key that will not work"). The client should then supply the authentication key with:
set_option leanaide.authkey? "<authentication-key>"
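On the host side, the server might then be launched along the following lines (a sketch: we assume here that HOST and LEANAIDE_PORT are read from the environment, and the dummy key prevents clients from silently using the host's credentials):
HOST=<host-IP-address> LEANAIDE_PORT=7654 python3 leanaide_server.py --auth_key "A key that will not work"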
There are many other configuration options on both the server and the client side.
Avoiding OpenAI embeddings
For those who want a fully open-source solution, you can provide an "examples server" and set the parameter examples_url
to point to it. We will soon provide an implementation of this.
The examples server should behave as follows:
- A server should be running at a port to which we can POST JSON.
- The POST request will have three (relevant) fields:
  - input: the statement of a theorem or definition.
  - prompt_type: one of docString, description and concise-description; this is the field to match against.
  - n: the number of nearest embeddings to return.
- The response gives selected "nearest" examples, based on docStrings from resources/mathlib4-prompts.jsonl, or on descriptions and concise-descriptions from resources/mathlib4-descs.jsonl.
- The response should be a JSON array with the data from the resource files, with one modification: the matched text should be returned in one of the fields "docString", "doc" or "doc_string" (so for the nearest "description", add a "doc" field containing the description).
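To make the protocol concrete, the following is a minimal sketch of such a server in Python, using only built-in packages. It is not the promised implementation: the port (8964) is arbitrary, the word-overlap scoring is only a stand-in for genuine embedding similarity, and for brevity it serves only the docString resource file.
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# For brevity we serve only the docString resource; a full server would also
# load descriptions and concise-descriptions from resources/mathlib4-descs.jsonl.
with open("resources/mathlib4-prompts.jsonl") as f:
    EXAMPLES = [json.loads(line) for line in f if line.strip()]

def score(query: str, text: str) -> int:
    # Placeholder similarity: shared-word count. A real server would rank
    # by similarity of embeddings of `query` and `text`.
    return len(set(query.lower().split()) & set(text.lower().split()))

class Handler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        req = json.loads(self.rfile.read(length))
        query = req["input"]                          # theorem/definition statement
        field = req.get("prompt_type", "docString")   # which field to match against
        n = int(req.get("n", 10))                     # number of nearest examples
        ranked = sorted(EXAMPLES,
                        key=lambda e: score(query, e.get(field, "")),
                        reverse=True)[:n]
        # Return the matched text under a "doc" field, as the protocol requires.
        results = [{**e, "doc": e.get(field, "")} for e in ranked]
        body = json.dumps(results).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    HTTPServer(("", 8964), Handler).serve_forever()   # port 8964 is arbitrary
A client would then set the examples_url parameter to http://localhost:8964 (with the port matching whatever the server actually uses).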
Contributions and details
The principal author of this repository is Siddhartha Gadgil.
The first phase of this work (roughly June 2022-October 2023) was done in collaboration with:
- Anand Rao Tadipatri
- Ayush Agrawal
- Ashvni Narayanan
- Navin Goyal
We had a lot of help from the Lean community and from collaborators at Microsoft Research. Our server is hosted with support from research credits from Google.
More recently (since about October 2024) the work has been done in collaboration with:
- Anirudh Gupta
- Vaishnavi Shirsath
- Ajay Kumar Nair
- Malhar Patel
- Sushrut Jog
This is supported by ARCNet, ART-PARK, IISc.
Our work is described in a note at the 2nd Math AI workshop and in more detail (along with related work) in a preprint.