# Dashed Monoids and Categorical Groups

A Lean4 Formalization

Welcome!! This repository contains a Lean4 formalization of concepts from my PhD thesis on the Symmetrization of Categorical Groups, focusing on the categorical analogue of group abelianization.

The project is written to be accessible to both mathematicians and computer scientists, especially those interested in rewriting systems, algebraic structures, and formal verification.

For questions or feedback, feel free to reach out at parab.7@osu.edu.


Project Overview

A Game of Words, Parentheses, and Cancellation

Think of the following as a small rewriting game.

You start with a fully parenthesized word, for example:

B(A((N A)(N A)))

You are allowed to perform the following rewrite moves.


Associativity: Re-bracketing the Word

You may reassociate parentheses using the rule:

(XY)Z   ↔   X(YZ)

where X, Y, Z can be:

  • a single letter, or
  • a fully parenthesized subexpression.
Examples
((B A) N) (A (N A))   ↔   (B (A N)) (A (N A))
((B A) N) (A (N A))   ↔   (B A) (N (A (N A)))
((B A) N) (A (N A))   ↔   ((B A) N) ((A N) A)

The goal is to understand when two words are reachable via a sequence of re-write moves. In the setting of monoidal categories (the moves are labled and satisfy coherence axioms); are any two composition of sequences between two words equal?

A classical result of Mac Lane shows that as long as the underlying word is the same (for example BANANA), any bracketing can be transformed into any other. In the setting of monoidal categories, any two such transformation sequences are equal: a coherence theorem.


Enter Categorical Groups: Adding Dashes

In categorical groups, we play the same game; but now letters and parenthesized blocks may carry dashes, representing formal inverses:

B''(A((N A)'(N A)))

In addition to reassociating parentheses, we now allow cancellation.


Cancellation Rules

For any expression X (a letter or a fully parenthesized block), we may cancel:

X' X   ↔   ∅
X X'   ↔   ∅

where ∅ denotes the empty word.

These rules encode group-like behavior inside the rewriting system.


The Core Questions

With associativity and cancellation in place, we ask:

  1. Reachability
    When can one dashed word be transformed into another using the rules?

  2. Coherence
    In the contex of categorical groups (the moves have labels and satisfy the coherence axioms) whether the multiple valid rewrite sequences between two words equal?

Examples
B''(A((N A)'(N A)))  →  B''A
B'' A                →  B'' ((B' B) A)
B'' ((B' B) A)       →  (B'' (B' B)) A
(B'' (B' B)) A       →  ((B'' B') B) A
((B'' B') B) A       →  B A 

In my thesis, I give a precise criterion characterizing when one dashed word can be transformed into another. Moreover, within the axioms of categorical groups, I prove a coherence theorem: any two valid rewrite sequences are equal.


Dashed Monoids (DMon S)

I construct the free dashed monoids (monoids with a unary operation) to model the above game. The free dashed monoid is precisely the set of all possible dashed words with using the given set of letters. The multiplication is given by concatation and the unary operation is simply adding an outer dash. To do any meaningful mathematics with this construction, we need to show that the free dashed monoids satisfy the appropriate universal property. In my thesis, I give the precise recipe to achieve this and I also provide the formal proof of universal property in Lean4.


Formal Structure in Lean4

Dashed words generated by a given set of letters (S) are defined as an inductive type in Lean4.
The construction preserves associativity while introducing a dash operation compatible with cancellation.

The definition uses the following constructors:

  1. Empty list
  2. Inclusion of an element from S
  3. Multiplication of two or more elements from S
  4. Dashing of an element from S
  5. Dashing of composite expressions
  6. Multiplication of dashed expressions

This design mirrors the dashed monoid basis from the mathematical development and allows precise control over rewriting and equivalence.


Code Structure

  • Dashed_Lists/as_inductive.lean
    Introduces the algebraic structure of dashed monoids and provides illustrative examples.

  • FDMon.Definition
    Defines dashed lists formally as FDMon.

  • FDMon_is_FreeDMon
    Proves that this construction satisfies the universal property of a free dashed monoid.

  • DashedMonoids.pdf
    Contains the underlying mathematical development.
    The Lean code closely follows Definition 3.2.6 and related results.


Status

  • Core inductive structure formalized
  • Universal property proved in Lean
  • Equivalence criterion formalized
  • Full coherence proof in progress

Why This Matters

This project sits at the intersection of:

  • algebraic rewriting systems
  • categorical coherence
  • formal verification with proof assistants

The techniques developed here are relevant to any setting where correctness depends on identifying when different rewrite paths are semantically the same.

See Appendix A for a quick introduction to Lean4.

  1. Formalizing Semi-Strict and Strict Categorical Groups

The second part of this project delves into categorical algebra. Specifically, it involves:

Defining semi-strict categorical groups and strict categorical groups in Lean4 and proving their categorical equivalence. This is a work in progress, more details will follow.

Appendix A: What is Lean4

Lean is a dependent-typed programming language and proof assistant. It’s designed for writing mathematical proofs and formally verifying them with the help of type theory. Think of it as a tool where programming meets logic and mathematics.

If you’re familiar with languages like Python, Java, or C++, Lean might feel different because:

  1. It’s rooted in Type Theory, not object-oriented. In Lean, a type can represent a mathematical proposition, and its elements are proofs of that proposition.
  2. The focus is on correctness and proof rather than just running code. Writing a function in Lean often corresponds to proving something.

Reading Lean Code

Lean code has a declarative style, often involving theorems, definitions, and tactics:

Definitions (def): Define functions or values, like in most programming languages. Theorems/Propositions (theorem or lemma): Statements you want to prove. Tactics : Step-by-step tools to guide Lean in constructing a proof.

Example

The main result we want to prove is as follows: The dashed lists that we constructed is indeed the data structure we wish to create. FDMon S is the name given to dashed lists and the properties that we want it to satisfy are stored at FreeDMon. Thus, we have following theorem.

theorem FDMon_is_FreeDMon {S:Type _}:FreeDMon (incS:(S→ FDMon S)):= by
constructor
case exist=>
    intro N stN f
    use FDMon.induce f
    ---

case unique=>
    intro N stN f g hf hg hyp
    ext x
    apply induce_unique f g hf hg
    intro a
    rw[← @Function.comp_apply (FDMon S) N S f incS a]
    rw[hyp]
    simp only [Function.comp_apply]

In above, everything after by is the proof of the theorem. Constructor divides the proof into two cases: Exists and Unique. We use tactics given to us and developed over the course to prove these cases. If Lean does not give any error message then it means that the proof is correct.