Lean Copilot: LLMs as Copilots for Theorem Proving in Lean
Lean Copilot allows large language models (LLMs) to be used in Lean for proof automation, e.g., suggesting tactics/premises and searching for proofs. You can use our built-in models from LeanDojo or bring your own models that run either locally (w/ or w/o GPUs) or on the cloud.
https://github.com/lean-dojo/LeanCopilot/assets/114432581/ee0f56f8-849e-4099-9284-d8092cbd22a3
Table of Contents
- Requirements
- Using Lean Copilot in Your Project
- Advanced Usage
- Caveats
- Getting in Touch
- Acknowledgements
- Citation
Requirements
- Supported platforms: Linux, macOS, and Windows WSL; ⚠️ Native Windows currently not supported.
- Git LFS
- Optional (recommended if you have a CUDA-enabled GPU): CUDA and cuDNN
- Required for building Lean Copilot itself (rather than a downstream package): CMake >= 3.7 and a C++17 compatible compiler
Using Lean Copilot in Your Project
⚠️ Your project must use a Lean version of at least lean4:v4.3.0-rc2
.
Adding Lean Copilot as a Dependency
- Add the package configuration option
moreLinkArgs := #["-L./.lake/packages/LeanCopilot/.lake/build/lib", "-lctranslate2"]
to lakefile.lean. For example,
package «my-package» {
moreLinkArgs := #[
"-L./.lake/packages/LeanCopilot/.lake/build/lib",
"-lctranslate2"
]
}
- Add the following line to lakefile.lean, including the quotation marks:
require LeanCopilot from git "https://github.com/lean-dojo/LeanCopilot.git" @ "LEAN_COPILOT_VERSION"
LEAN_COPILOT_VERSION
depends on your lean-toolchain:
lean-toolchain | Recommended Lean Copilot version |
---|---|
v4.11.0 | v1.6.0 |
v4.11.0-rc3 | v1.5.3 |
v4.11.0-rc2 | v1.5.2 |
v4.11.0-rc1 | v1.5.1 |
v4.10.0 | v1.5.0 |
v4.10.0-rc2 | v1.4.2 |
v4.10.0-rc1 | v1.4.1 |
v4.9.0 | v1.4.0 |
v4.9.0-rc3 | v1.3.3 |
v4.9.0-rc2 | v1.3.2 |
v4.9.0-rc1 | v1.3.1 |
v4.8.0 | v1.3.0 |
v4.8.0-rc2 | v1.2.2 |
v4.8.0-rc1 | v1.2.1 |
v4.7.0 | v1.2.0 |
v4.7.0-rc2 | v1.1.2 |
v4.6.0-rc1 | v1.1.1 |
v4.5.0 | v1.1.0 |
v4.5.0-rc1 | v1.1.0 |
-
Run
lake update LeanCopilot
. -
Run
lake exe LeanCopilot/download
to download the built-in models from Hugging Face to~/.cache/lean_copilot/
. Alternatively, you can download the models from Hugging Face manually from
"https://huggingface.co/kaiyuy/ct2-leandojo-lean4-tacgen-byt5-small"
"https://huggingface.co/kaiyuy/ct2-leandojo-lean4-retriever-byt5-small"
"https://huggingface.co/kaiyuy/premise-embeddings-leandojo-lean4-retriever-byt5-small"
"https://huggingface.co/kaiyuy/ct2-byt5-small"
- Run
lake build
.
Here is an example of a Lean package depending on Lean Copilot. If you have problems building the project, our Dockerfile, build.sh or build_example.sh may be helpful.
Getting Started with Lean Copilot
Tactic Suggestion
After import LeanCopilot
, you can use the tactic suggest_tactics
to generate tactic suggestions. You can click on any of the suggested tactics to use it in the proof.
You can provide a prefix (e.g., simp
) to constrain the generated tactics:
Proof Search
The tactic search_proof
combines LLM-generated tactics with aesop to search for multi-tactic proofs. When a proof is found, you can click on it to insert it into the editor.
Premise Selection
The select_premises
tactic retrieves a list of potentially useful premises. Currently, it uses the retriever in LeanDojo to select premises from a fixed snapshot of Lean and mathlib4.
Running LLMs
You can also run the inference of any LLMs in Lean, which can be used to build customized proof automation or other LLM-based applications (not limited to theorem proving). It's possible to run arbitrary models either locally or remotely (see Bring Your Own Model).
Advanced Usage
This section is only for advanced users who would like to change the default behavior of suggest_tactics
, search_proof
, or select_premises
, e.g., to use different models or hyperparameters.
Tactic APIs
- Examples in TacticSuggestion.lean showcase how to configure
suggest_tactics
, e.g., to use different models or generate different numbers of tactics. - Examples in ProofSearch.lean showcase how to configure
search_proof
using options provided by aesop. - Examples in PremiseSelection.lean showcase how to set the number of retrieved premises for
select_premises
.
Model APIs
Examples in ModelAPIs.lean showcase how to run the inference of different models and configure their parameters (temperature, beam size, etc.).
Lean Copilot supports two kinds of models: generators and encoders. Generators must implement the TextToText
interface:
class TextToText (τ : Type) where
generate (model : τ) (input : String) (targetPrefix : String) : IO $ Array (String × Float)
input
is the input stringtargetPrefix
is used to constrain the generator's output.""
means no constraint.generate
should return an array ofString × Float
. EachString
is an output from the model, andFloat
is the corresponding score.
We provide three types of Generators:
NativeGenerator
runs locally powered by CTranslate2 and is linked to Lean using Foreign Function Interface (FFI).ExternalGenerator
is hosted either locally or remotely. See Bring Your Own Model for details.GenericGenerator
can be anything that implements thegenerate
function in theTextToText
typeclass.
Encoders must implement TextToVec
:
class TextToVec (τ : Type) where
encode : τ → String → IO FloatArray
input
is the input stringencode
should return a vector embedding produced by the model.
Similar to generators, we have NativeEncoder
, ExternalEncoder
, and GenericEncoder
.
Bring Your Own Model
In principle, it is possible to run any model using Lean Copilot through ExternalGenerator
or ExternalEncoder
(examples in ModelAPIs.lean). To use a model, you need to wrap it properly to expose the APIs in external_model_api.yaml. As an example, we provide a Python API server and use it to run a few models, including llmstep-mathlib4-pythia2.8b.
Caveats
- Lean may occasionally crash when restarting or editing a file. Restarting the file again should fix the problem.
select_premises
always retrieves the original form of a premise. For example,Nat.add_left_comm
is a result of the theorem below. In this case,select_premises
retrievesNat.mul_left_comm
instead ofNat.add_left_comm
.
@[to_additive]
theorem mul_left_comm : ∀ a b c : G, a * (b * c) = b * (a * c)
- In some cases,
search_proof
produces an erroneous proof with error messages likefail to show termination for ...
. A temporary workaround is changing the theorem's name before applyingsearch_proof
. You can change it back aftersearch_proof
completes.
Getting in Touch
- For general questions and discussions, please use GitHub Discussions.
- To report a potential bug, please open an issue. In the issue, please include your OS information and the exact steps to reproduce the error. The more details you provide, the better we will be able to help you.
- Feature requests and other suggestions are extremely welcome. Please feel free to start a discussion!
Acknowledgements
- We thank Scott Morrison for suggestions on simplifying Lean Copilot's installation and Mac Malone for helping implement it. Both Scott and Mac work for the Lean FRO.
- We thank Jannis Limperg for supporting our LLM-generated tactics in Aesop (https://github.com/leanprover-community/aesop/pull/70).
Citation
If you find our work useful, please consider citing our paper:
@misc{song2024largelanguagemodelscopilots,
title={Towards Large Language Models as Copilots for Theorem Proving in Lean},
author={Peiyang Song and Kaiyu Yang and Anima Anandkumar},
year={2024},
eprint={2404.12534},
archivePrefix={arXiv},
primaryClass={cs.AI},
url={https://arxiv.org/abs/2404.12534},
}