JSON-Schema Lean

Description

An implementation of JSON Schema Draft 7 in Lean (GSoC 2024)

Current Status: 94.9% Draft 7 compliance (241 of 254 test cases passing)

Build

lake build

Docker Build

Build docker image:

docker build -f Dockerfile -t localhost/lean-jsonschema:latest .
# Or with Podman
podman build -f Dockerfile -t localhost/lean-jsonschema:latest .

Testing

There are some tests in lake test, but most tests rely on bowtie:

Install bowtie.

Run tests:

# Run core passing tests
./test.sh

# Test specific keyword
bowtie suite -i localhost/lean-jsonschema:latest \
  https://github.com/json-schema-org/JSON-Schema-Test-Suite/blob/main/tests/draft7/const.json \
  | bowtie summary --show failures

# Run full Draft 7 test suite
bowtie suite -i localhost/lean-jsonschema:latest \
  https://github.com/json-schema-org/JSON-Schema-Test-Suite/tree/main/tests/draft7 \
  | bowtie summary

Design

For integration of this implementation with Bowtie, the project is (currently) divided into two main parts:

  • Main : The entry point of the harness module, which invokes the harness repl.
  • Harness : Command reader and dispatcher, handles stdin and stdout.
  • Implementation : JSON Schema validation implementation in Lean.

Implementation

Project Structure:

├── Dockerfile            # Dockerfile for Bowtie image
├── Harness/              # Bowtie interface
│   ├── Command.lean
│   └── Harness.lean
├── JsonSchema/
│   ├── Compiler.lean     # JSON Schema to Type (TODO)
│   ├── Error.lean        # Error types
│   ├── Format.lean       # Format validators (not yet integrated)
│   ├── Loader.lean       # Remote schema loading (TODO)
│   ├── PointerFragment.lean  # RFC 6901 JSON Pointer navigation
│   ├── Resolving.lean    # $ref/$id resolution and loop detection
│   ├── Schema.lean       # Schema data structures
│   ├── SchemaPointer.lean # Schema pointer utilities
│   └── Validation.lean   # Core validation logic
├── JsonSchemaTesting/    # Compile-time Testing
├── Main.lean             # Entry point for bowtie
├── TestRunner.lean       # Minimal lake test location
├── test.sh               # Test runner script
├── lakefile.toml
└── lean-toolchain
Core Modules

Harness: Handles interaction between validator and Bowtie test harness

  • Implements IHOP protocol (start, dialect, run, stop commands)
  • Reads JSON commands from stdin, dispatches to validator, returns results to stdout

JsonSchema/Schema.lean: Schema data structure definitions

  • Schema: Either Boolean or Object SchemaObject
  • SchemaObject: Contains all JSON Schema keywords
  • JSON parsing and serialization

JsonSchema/Validation.lean: Core validation logic

  • Individual validate* functions for each keyword
  • validateObject: Orchestrates all validations
  • validateWithResolver: Main entry point with fuel-based recursion limiting

JsonSchema/Resolving.lean: Reference resolution

  • Resolver: Registry of schemas by URI
  • Handles $ref, $id, and definitions
  • Loop detection via dependency graph analysis

JsonSchema/PointerFragment.lean: JSON Pointer (RFC 6901)

  • Parses pointer strings like /definitions/foo
  • Navigates schema structures for fragment resolution

Usage Examples

Minimal Example

import JsonSchema.Validation
import Lean

open Lean
open JsonSchema

-- Create a simple schema from JSON
def minimalSchema : Schema :=
  (fromJson? (Json.mkObj [("type", Json.str "string")])).toOption.get!

-- Validate data against the schema
#eval validate minimalSchema (Json.str "hello")  -- Except.ok ()
#eval validate minimalSchema (Json.num 42)       -- Error: wrong type

Example with Resolver and $ref

import JsonSchema.Validation
import JsonSchema.Resolving
import Lean

open Lean
open JsonSchema

-- Schema with definitions and references
def schemaWithRefsJson : Json := Json.mkObj [
  ("$id", Json.str "https://example.com/person.json"),
  ("definitions", Json.mkObj [
    ("address", Json.mkObj [
      ("type", Json.str "object"),
      ("properties", Json.mkObj [
        ("street", Json.mkObj [("type", Json.str "string")]),
        ("city", Json.mkObj [("type", Json.str "string")])
      ]),
      ("required", Json.arr #[Json.str "street", Json.str "city"])
    ])
  ]),
  ("type", Json.str "object"),
  ("properties", Json.mkObj [
    ("name", Json.mkObj [("type", Json.str "string")]),
    ("home", Json.mkObj [("$ref", Json.str "#/definitions/address")]),
    ("work", Json.mkObj [("$ref", Json.str "#/definitions/address")])
  ]),
  ("required", Json.arr #[Json.str "name"])
]

def schemaWithRefs : Schema :=
  (fromJson? schemaWithRefsJson).toOption.get!

-- Create a resolver and register the schema
def resolver : Resolver :=
  Resolver.addSchema {} schemaWithRefs (LeanUri.URI.encode "https" "example.com" "/person.json")

-- Valid person with addresses
def validPersonData : Json := Json.mkObj [
  ("name", Json.str "Alice"),
  ("home", Json.mkObj [
    ("street", Json.str "123 Main St"),
    ("city", Json.str "Springfield")
  ]),
  ("work", Json.mkObj [
    ("street", Json.str "456 Office Blvd"),
    ("city", Json.str "Shelbyville")
  ])
]

-- Validate using the resolver
#eval validateWithResolver resolver default schemaWithRefs validPersonData
-- Success: Except.ok ()

See JsonSchemaTesting/Examples.lean for more examples.

TODO List

  • Containerize (Docker Image)
  • Separated Json Schema validator and Harness module
  • Integrate with Bowtie
    • Basic (run / start / dialect / stop) command dispatcher
    • Read and run tests, return results
  • Add basic supports for JSON-Schema validation
  • Complete keywords (Draft-7)
    • type
    • enum
    • const
    • minLength
    • maxLength
    • pattern
    • minimum
    • maximum
    • exclusiveMinimum
    • exclusiveMaximum
    • required
    • uniqueItems
    • multipleOf
    • allOf
    • anyOf
    • oneOf
    • not
    • items
    • contains
    • maxItems
    • minItems
    • maxProperties
    • minProperties
    • properties
    • patternProperties
    • additionalItems
    • additionalProperties
    • propertyNames
    • dependencies
    • if / then / else
    • id / definitions
  • Load schema (and refs) from a file with file:// URI resolution
  • Remote reference loading (i.e. http, https should load either beforehand or "on-demand")
  • Draft 2019-09 support
  • Draft 2020-12 support
  • Docs

Extra goodies

  • Proofs of termination/correctness
  • Compile JSON Schema to Lean types like datamodel-code-generator
  • Create JSON Schema from Lean types