[{"data":1,"prerenderedAt":4},["ShallowReactive",2],{"bZvpU7gM4K":3},"# EVM-Smith\n\n**A framework for AI systems to write EVM bytecode and prove it safe.**\n\n> ⚠️ **Experimental research codebase.** Not audited, not production-ready.\n> Don't deploy code based on this repo to a live chain. The proofs ship as\n> a research artifact demonstrating the workflow; the deployments themselves\n> are demos.\n\nThe goal is to experiment with AI-generated smart contracts bypassing compilers\nentirely: an AI writes a contract directly in EVM assembly and, in the same\nworkflow, writes Lean 4 proofs about the contract's behavior against the\nofficial EVM semantics. This repo is the scaffolding — a thin Lean framework\nthat makes the \"write + prove\" loop ergonomic enough to automate.\n\nThe EVM semantics come from\n[`NethermindEth/EVMYulLean`](https://github.com/NethermindEth/EVMYulLean),\na Lean 4 formalization of the Ethereum Yellow Paper. EVM-Smith is a\nconsumer of that formalization.\n\nThis repo's `EVMYulLean/` is a submodule pointing at the\n[`leonardoalt/EVMYulLean`](https://github.com/leonardoalt/EVMYulLean)\nfork, which carries the **Frame library** — cross-transaction\npreservation infrastructure that lifts a single-contract bytecode\nwalk into a per-account inductive invariant that survives the entire\n`Υ` driver, including arbitrary reentrancy, nested CREATE / CREATE2,\nand SELFDESTRUCT. The library supports balance lower bounds,\nrelational solvency-style bounds (`Σ storage ≤ balance`),\naccount-presence preservation, code-identity preservation, and other\nper-account state-shape invariants — see\n[`FRAME_LIBRARY.md`](https://github.com/leonardoalt/EVMYulLean/blob/main/FRAME_LIBRARY.md)\nfor the full surface. 
The two worked examples (Register's balance\nmonotonicity and WETH's solvency) are entirely consumers of this\nlibrary.\n\n## Proof status\n\n* **0 sorries** anywhere in the codebase or its dependency\n  (EVMYulLean, including the Frame library).\n* **2 practical axioms beyond Lean's standard foundations** in the\n  entire trust base: `precompile_preserves_accountMap` (T2:\n  precompile purity, provable by case inspection) and\n  `lambda_derived_address_ne_C` (T5: Keccak collision-resistance,\n  the standard cryptographic ground assumption every Ethereum\n  security argument relies on). Both are documented in\n  [`AXIOMS.md`](./AXIOMS.md). Verify with `#print axioms \u003Ctheorem>`.\n* Every claim about a contract's bytecode behaviour is a **proved theorem**.\n\nThe proofs are conditional on a small, explicit set of structural\nhypotheses spelled out per demo (e.g. WETH's 5-field\n`WethAssumptions` bundle). See [`TRUST_ASSUMPTIONS.md`](./TRUST_ASSUMPTIONS.md)\nfor the broader picture.\n\n## How it's meant to be used\n\n1. Write a program as a `Program` value — a list of `(opcode, optional\n   push-arg)` pairs. Optionally emit the raw bytecode as a\n   `ByteArray`.\n2. Run it against an `EVM.State` (`runSeq`) to get empirical behavior.\n3. 
State safety properties (functional correctness, invariants,\n   error-freeness) as Lean theorems and prove them with the upstream\n   semantics as ground truth.\n\n## Project layout\n\n```\nEvmSmith/\n├── Framework.lean            # mkState, withCalldata, runOp, runOpFull, runSeq, Program\n├── Lemmas.lean               # Per-opcode step lemmas + runSeq fusion — reusable across programs\n├── Demos/                    # Worked examples — see Demos/README.md\n│   ├── Main.lean             # lake exe entrypoint: runs all demos\n│   ├── Demos.lean            # IO demos\n│   ├── DemoProofs.lean       # Single-opcode safety theorems\n│   ├── Add3/                 # Arithmetic correctness\n│   ├── Register/             # Storage + reentrancy + balance monotonicity\n│   ├── Weth/                 # Solvency invariant\n│   └── Tests.lean            # #guard assertions evaluated at elaboration time\n```\n\nWhen an AI adds a new contract, the natural place is\n`EvmSmith/Demos/MyContract/Program.lean` (the program) and\n`EvmSmith/Demos/MyContract/Proofs.lean` (its safety theorems).\n`Framework.lean` is the runtime surface; `Lemmas.lean` is the\nproof-time surface — extend it with one `runOp_\u003Copcode>` lemma per new\nopcode your program uses.\n\n## Demos, proofs, tests\n\nPer-demo specifics — what's already proved, how to run each demo,\nhow to check the proofs, how to run the unit tests, how to run the\nFoundry tests, how to regenerate bytecode artifacts, how to write\nyour own program + proof — all live in\n[**`EvmSmith/Demos/README.md`**](./EvmSmith/Demos/README.md).\n\n## Assumptions\n\nThe proofs in this repo are conditional on a small, explicit set of\nassumptions. Two documents spell them out:\n\n- [**`AXIOMS.md`**](./AXIOMS.md) — the two explicit `axiom`\n  declarations in the EVMYulLean framework (T2: precompile purity;\n  T5: Keccak collision resistance). 
What each says, why it's stated\n  as an axiom, and the path to discharging it.\n\n- [**`TRUST_ASSUMPTIONS.md`**](./TRUST_ASSUMPTIONS.md) — the broader\n  trust picture: Lean's logical foundations, EVMYulLean's modeling\n  fidelity (definitional faithfulness, pre-Cancun SELFDESTRUCT\n  semantics, gas accounting, partial-correctness framing), and the\n  per-contract structural-fact pattern (`DeployedAtC`, SD-exclusion,\n  liveness at dispatch, boundary conditions, chain-state bounds).\n\nPer-demo specifics on which structural facts each proof requires are\nin the demo's own report (e.g.\n[`EvmSmith/Demos/Weth/REPORT_WETH.md`](./EvmSmith/Demos/Weth/REPORT_WETH.md)).\n\n## Requirements\n\n- [`elan`](https://github.com/leanprover/elan) (Lean version manager).\n  The toolchain pinned in `lean-toolchain` (currently\n  `leanprover/lean4:v4.22.0`) downloads automatically on first build.\n- A working C compiler (`cc` on `PATH`) — the upstream needs it for\n  keccak / SHA256 / elliptic-curve FFI.\n- Network access on first build (to fetch Mathlib, `amosnier/sha-2`,\n  `brainhub/SHA3IUF`).\n- ~2 GB free disk (most of it Mathlib).\n\n## Building\n\nClone with submodules:\n\n```bash\ngit clone --recursive https://github.com/leonardoalt/evm-smith.git\ncd evm-smith\nlake build\n```\n\n(If you forgot `--recursive`, run `git submodule update --init --recursive`\ninside the repo to fetch them.)\n\nSubmodules pulled:\n- `EVMYulLean/` — the [`leonardoalt/EVMYulLean`](https://github.com/leonardoalt/EVMYulLean) fork carrying the Frame library\n  ([`FRAME_LIBRARY.md`](https://github.com/leonardoalt/EVMYulLean/blob/main/FRAME_LIBRARY.md)).\n  The NethermindEth upstream alone won't satisfy the imports.\n- Each demo's `foundry/lib/forge-std/` — for the Foundry test suites.\n\nFirst build is cold: toolchain download (~200 MB), Mathlib build, C\ncrypto compile. 
Budget 10–30 minutes depending on network and CPU.\nIncremental builds take seconds.\n\n## Using the framework\n\nMinimal example — run `ADD` on a two-element stack and inspect the\ntop:\n\n```lean\nimport EvmSmith.Framework\nopen EvmSmith EvmYul\n\n-- `example` is a reserved keyword in Lean 4, so the definition needs another name.\ndef addDemo : Option UInt256 :=\n  topOf \u003C| runOp .ADD (mkState [UInt256.ofNat 10, UInt256.ofNat 32])\n\n#eval addDemo  -- some 42\n```\n\nTwo runners are available:\n\n- `runOp` — uses the pure `EvmYul.step`. No fuel, no gas, no\n  `execLength` bump. Preferred for proofs because the post-state\n  stays minimal.\n- `runOpFull` — uses the production `EVM.step` with `fuel := 1`,\n  `gasCost := 0`. Agrees with `runOp` on `stack` and `pc`. Use this\n  to confirm parity with the full driver.\n\nBoth return `Except EVM.ExecutionException EVM.State`. Sequence\nmultiple opcodes with `runSeq : Program → EVM.State →\nExcept _ EVM.State`.\n\n## Limitations\n\n- **Bytes-level round-trips** (e.g. `MSTORE` → `RETURN` producing the\n  bytes of `a + b + c`) go through `ffi.ByteArray.zeroes`, which is\n  `opaque`. Proofs that need it would require an axiomatized\n  round-trip lemma.\n- **Upstream Batteries gaps for storage-slot reasoning** — the\n  derived `Ord UInt256` doesn't carry `LawfulOrd`, and Batteries has\n  no `find?_erase_*` lemmas on `RBMap`. We register the needed\n  `OrientedCmp`/`TransCmp`/`ReflCmp` instances locally\n  (`EvmSmith/Lemmas/UInt256Order.lean`) and prove\n  `find?_erase_ne` plus a list-level erase characterisation\n  directly through `RBNode.del`\n  (`EvmSmith/Lemmas/BalanceOf.lean`, `Lemmas/RBMapSum.lean`,\n  `EVMYulLean/EvmYul/Frame/StorageSum.lean`). Storage-sum reasoning\n  works (the WETH solvency proof depends on it). 
See\n  `.claude/batteries-wishlist.md` for the upstream PRs that would\n  let us delete these workarounds.\n- **Partial correctness, not termination.** Theorems claim safety on\n  successful runs (`Υ` returns `.ok`); failure paths (out-of-gas,\n  REVERT, invalid opcode) leave the conclusion vacuous\n  (`.error _ => True`). Gas is fully tracked in EVMYulLean (Yellow\n  Paper Appendix G), so the EVM's error semantics are accurately\n  modeled — we just don't claim termination.\n- **`unfold; rfl` depends on reducibility.** Most demo proofs close\n  by `unfold EvmYul.step; rfl`. An upstream `@[irreducible]`\n  annotation on any of `step`, `execBinOp`, `Stack.pop*`, etc. would\n  break every proof simultaneously — at that point, proofs would\n  need to go through named characterization lemmas instead.\n\n## License\n\nThis project is licensed under either of\n\n\u003C!-- markdown-link-check-disable -->\n- [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0) ([`LICENSE-APACHE`](LICENSE-APACHE))\n- [MIT license](https://opensource.org/licenses/MIT) ([`LICENSE-MIT`](LICENSE-MIT))\n\u003C!-- markdown-link-check-enable -->\n\nat your option.\n",1780846763210]