LeanSerde

Type-safe serialization for Lean

Lean version last update license

Key Features

  • Type Safety: Compile-time guarantees with automatic deriving for custom types
  • Deduplication: Efficient serialization of shared structures
  • Multiple Formats: Serialize to ByteArray (CBOR), Lean.Json, or String

Usage

Add LeanSerde as a dependency to your lakefile.toml:

[[require]]
name = "LeanSerde"
git = "https://github.com/oOo0oOo/LeanSerde.git"
rev = "main"

Use it as follows:

import LeanSerde
import LeanSerde.SnapshotTypes

-- Custom types by deriving `LeanSerde.Serializable`
structure FileNode where
  name : String
  children : Array FileNode  -- Recursive!
  validated : Option Bool
  deriving LeanSerde.Serializable, BEq, Inhabited  -- Inhabited is only needed for describeFormat()

def main : IO Unit := do
  -- Create a recursive file structure
  let file1 : FileNode := { name := "file1.txt", children := #[], validated := some false }
  let file2 : FileNode := { name := "file2.bin", children := #[file1], validated := true }
  let file3 : FileNode := { name := "file3.txt", children := #[file1, file1], validated := some true }

  -- Serialize to different formats
  let bytes : ByteArray := ← LeanSerde.serialize file1 -- Binary format (CBOR)
  let _json : Lean.Json := ← LeanSerde.serialize file2
  let _string : String := ← LeanSerde.serialize file3

  match (← LeanSerde.deserialize bytes) with
  | .ok (node: FileNode) =>
    if node == file1 then
      IO.println "Roundtrip successful!"
    else
      IO.println "Roundtrip failed!"
  | .error msg => IO.println s!"Error: {msg}"

  -- Serialize directly to/from file
  LeanSerde.serializeToFile file3 "serialized.cbor"
  LeanSerde.serializeToJsonFile file3 "serialized.json"

  match (← LeanSerde.deserializeFromFile "serialized.cbor") with
  | .ok node =>
    if node == file3 then
      IO.println "File roundtrip successful!"
  | .error msg => IO.println s!"Error loading file: {msg}"

  -- Supports variety of types
  let _ : ByteArray := ← LeanSerde.serialize [3, 1, 4]
  let _ : ByteArray := ← LeanSerde.serialize #[1.1, 2.2, 3.2]
  let _ : ByteArray := ← LeanSerde.serialize (Sum.inl 42 : Sum Nat String)
  let _ : ByteArray := ← LeanSerde.serialize (.ok "success" : Except String String)
  let _ : ByteArray := ← LeanSerde.serialize (("key", 123), ("value", 456))
  let _ : ByteArray := ← LeanSerde.serialize [true, false, true]
  let _ : ByteArray := ← LeanSerde.serialize [[1, 2], [3, 4], []]
  let _ : ByteArray := ← LeanSerde.serialize (System.FilePath.mk "/tmp/test.txt")
  let _ : ByteArray := ← LeanSerde.serialize (Std.Time.PlainDateTime.ofDaysSinceUNIXEpoch 10000, 0, 0, 0⟩)
  let _ : ByteArray := ← LeanSerde.serialize (some (some (some (42 : Nat))))

  -- Describe the serialization format
  match ← LeanSerde.describeFormat FileNode with
  | .ok json => IO.println s!"Format description: {json}"
  | .error msg => IO.println s!"Error describing format: {msg}"

  -- Serializable snapshots of Lean elaboration state
  let s ← LeanSerde.LeanSnapshot.create ["Init"]

  let s' ← s.command "theorem test : 1 + 1 = 2 := by sorry"
  IO.println s!"Goals: {← s'.goals}"

  let s'' ← s'.tactic "rfl"

  let serialized: ByteArray := ← LeanSerde.serialize s''
  let s''' ← LeanSerde.deserialize serialized
  match s''' with
  | .ok (s''': LeanSerde.LeanSnapshot) =>
    IO.println s!"Complete: {s'''.complete?}"
  | .error msg => IO.println s!"Deserialization failed: {msg}"

Supported Types

Type NameStatus
Nat, UInt8, UInt16, UInt32, UInt64ok
Int, Int8, Int16, Int32, Int64ok
Floatok
String, Char, Bool, Unitok
FilePath, Substring, String.Posok
Option, Exceptok
List, Array, Subarray, ByteArray, Finok
Product (α × β), Sum (α ⊕ β), Thunkok
HashMap, HashSet, RBMap, RBTreeok
PersistentHashMap, PersistentArrayok
Std.Format.FlattenBehaviorok
Lean.Jsonok
Lean.Positionok
Std.Time.Year.Era, Year.Offsetok
Std.Time.Month.Ordinal, Week.Ordinalok
Std.Time.Weekday, Day.Ordinalok
Std.Time.Hour.Ordinal, Minute.Ordinalok
Std.Time.Second.Ordinal, Millisecond.Ordinal, Nanosecond.Ordinalok
Std.Time.Timestampok
Std.Time.PlainDate, PlainTime, PlainDateTimeok
Std.Time.TimeZone, TimeZone.ZoneRulesok
Std.Time.ZonedDateTimeok

Meta Types

Use import LeanSerde.MetaTypes to access meta types:

Type NameStatus
Lean.Name, LevelMVarId, Levelok
Lean.Syntax.Preresolved, Syntaxok
Lean.FVarId, MVarId, BinderInfo, Literal, DataValueok
Lean.KVMap, Optionsok
Lean.Exprok
Lean.LocalDeclKind, LocalDecl, LocalContext, MetavarKind, LocalInstance, MetavarDecl, DelayedMetavarAssignment, MetavarContextok
Lean.ConstantVal, AxiomVal, ReducibilityHints, DefinitionSafety, TheoremVal, OpaqueVal, QuotKind, QuotVal, ConstructorVal, InductiveVal, DefinitionVal, RecursorRule, RecursorVal, ConstantInfook
Lean.Import, ModuleIdx, CompactedRegionok
Lean.ModuleData, EnvironmentHeader, EnvironmentData, Environmentpartial
Lean.NameSet, NameMap String, SMapok
Kernel.Diagnosticsok
StateM σ α, Dynamicplaceholder
DeclarationRange, DeclarationLocation, MacroExpansionInfo, FieldInfo, OptionInfo, FieldRedeclInfo, FVarAliasInfo, ElabInfo, ChoiceInfo, PartialTermInfo, TermInfo, CompletionInfo, DelabTermInfo, TacticInfo, CommandInfo, NameGenerator, OpenDecl, FileMapok
Lean.CommandContextInfo, PartialContextInfopartial
Lean.Widget.WidgetInstance, Elab.UserWidgetInfo, Elab.CustomInfo, Lean.Elab.Infopartial
Lean.Elab.InfoTreepartial
TSyntax ks, TSyntaxArray ksok
Lean.FVarIdMap α, MVarIdMap α, FVarIdSet, MVarIdSetok

Legend:

  • ok: Type-safe and serialization/deserialization feasible.
  • partial: While type-safe, some field values or children are not serialized/deserialized.
  • placeholder: Only type info or dummy value is encoded, not real data.

LeanSnapshot

Serializable snapshots of Lean elaboration state for proof checkpointing, interactive theorem proving, and distributed search.

It is currently experimental and does not support all Lean features.

import LeanSerde.SnapshotTypes

-- Create using .create [List String], .empty or .fromEnv Environment
let s ← LeanSerde.LeanSnapshot.create ["Init"]

let s' ← s.command "theorem test : 1 + 1 = 2 := by sorry"
let s'' ← s'.tactic "rfl"

let goals ← s'.goals  -- ["⊢ 1 + 1 = 2"]
let complete := s''.complete?  -- true

Environment and LeanSnapshot require supportInterpreter = true in lakefile.toml:

[[lean_exe]]
name = "main"
root = "Main"
supportInterpreter = true

Note: LeanSnapshot currently only supports tactic sorries, no term sorries.

Custom Types without deriving

If deriving fails, you can manually implement Serializable for your types:

namespace LeanSerde

structure MyArray (α : Type) where
  elems : Array α

instance [Serializable α] : Serializable (MyArray α) where
  encode xs := do
    let encodedElems ← xs.elems.mapM encode
    return .compound "MyArray" encodedElems
  decode sv := do
    let args ← decodeCompound "MyArray" sv
    let decoded ← args.mapM decode
    return ⟨decoded⟩

Serialization Format

LeanSerde automatically chooses between direct serialization or deduplication based on repeated values.

SerialValue Types

  • str: String values
  • nat: Natural numbers
  • bool: Boolean values
  • compound: Named constructors with child values (e.g., ["FileNode", [name, children, validated]])
  • ref: References to shared objects by consecutive ID starting from 0 (e.g., {"ref": 0})

Formats

Simple Format: Direct serialization without repeating values.

Graph Format: Deduplicating format when values are shared. Creates a GraphData structure with:

  • root: The main SerialValue with refs to shared objects
  • objects: Array of shared SerialValues

This intermediate representation then gets encoded to the final output format (JSON, CBOR, String).

Deserialization automatically handles all output formats.

Examples

let file1 : FileNode := { name := "file1.txt", children := #[], validated := some false }
let file2 : FileNode := { name := "file2.bin", children := #[file1], validated := some true }
let file3 : FileNode := { name := "file3.txt", children := #[file1, file1], validated := some false }

Simple format (no sharing):

// file1
["FileNode", ["file1.txt", ["Array", []], ["Option.some", [false]]]]

// file2
["FileNode",
 ["file2.bin",
  ["Array",
   [["FileNode", ["file1.txt", ["Array", []], ["Option.some", [false]]]]]],
  ["Option.some", [true]]]]

Graph format (file1 appears twice in file3):

// file3
{"root":
 ["FileNode",
  ["file3.txt", ["Array", [{"ref": 0}, {"ref": 0}]], ["Option.some", [true]]]],
 "objects":
 [["FileNode", ["file1.txt", ["Array", []], ["Option.some", [false]]]]]}

Automatic Format Description

LeanSerde provides a describeFormat function to get a human-readable description of the serialization format for any type. This is severely limited and does not include all details. Only types with Inhabited and Serializable instances can be described.

match LeanSerde.describeFormat FileNode with
  | .ok json => IO.println s!"Format description: {json}"
  | .error msg => IO.println s!"Error describing format: {msg}"
-- Format description: ["FileNode", ["<String>", ["Array", ["<?>"]], ["Option.none", ["<?>"]]]]

Note that this is not a real namespace. You will have to call LeanSerde.describeFormat even if you open LeanSerde.

Testing

Clone the repository and run:

lake test

License

MIT