Formalization of "Analysis I" by Terence Tao
The goal of this project is to formalize the book "Analysis I" by Terence Tao, including the main text and the exercises. This is a work in progress. The formalization is done in Lean 4. The repository pins a Mathlib version in lakefile.lean to anchor the toolchain, but no module inside Lean4AnalysisTao/ imports Mathlib; the development rebuilds every structure it needs from the Peano axioms upward, so that the Lean proofs can mirror Tao's axiomatic development.
Note. Tao has since started his own Lean companion, which is more advanced. This repo is a personal learning exercise kept as an alternative take, not a competitor, and it diverges in three ways: (1) no Mathlib anywhere, not just the early chapters; (2) the end-of-section exercises are formalized too, not left as
sorry; (3) a strict uniform proof style so the corpus reads as structured data.
Progress
(Updated on 2026-04-14)
Main text
- 2. Starting at the Beginning: The Natural Numbers
- 2.1. The Peano axioms
- 2.2. Addition
- 2.3. Multiplication
- 3. Set Theory
- 3.1. Fundamentals
- 3.2. Russell's Paradox
- 3.3. Functions
- 3.4. Images and Inverse Images
- 3.5. Cartesian Products
- 3.6. Cardinality of Sets
- 4. Integers and Rationals
- 5. The Real Numbers
- 6. Limits of Sequences
- 7. Series
- 8. Infinite Sets
- 9. Continuous Functions on R
- 10. Differentiation of Functions
- 11. The Riemann Integral
- Appendix A: The Basics of Mathematical Logic
- Appendix B: The Decimal System
Solutions to exercises
- 2. Starting at the Beginning: The Natural Numbers
- 2.2. Addition
- 2.3. Multiplication
- 3. Set Theory
- 3.1. Fundamentals
- 3.2. Russell's Paradox
- 3.3. Functions
- 3.4. Images and Inverse Images
- 3.5. Cartesian Products
- 3.6. Cardinality of Sets
- 4. Integers and Rationals
- 5. The Real Numbers
- 6. Limits of Sequences
- 7. Series
- 8. Infinite Sets
- 9. Continuous Functions on R
- 10. Differentiation of Functions
- 11. The Riemann Integral
- Appendix A: The Basics of Mathematical Logic
- Appendix B: The Decimal System
Style
Proofs decompose into a small uniform set of discrete named inference steps, so the corpus reads as structured training data. Five concerns organise the rules: fidelity, explicitness, uniformity, articulation, layout.
Fidelity
- No
import Mathlib.*underLean4AnalysisTao/; re-derive any missing lemma from current-chapter axioms. - Main-text
S*.leanfiles contain only what Tao states, in his order. Scaffolding (bridging lemmas, strengthened axioms, type-hierarchy facts) goes inS*_Extras.leanor the chapter'sUtil.lean. - Main-text proofs mirror Tao's argument. Exercise solutions (
Solutions_S*) are free, but should reach first for earlier in-chapter lemmas. - Prefix every numbered book item with a
-- <Kind> X.Y.Zanchor (-- Definition 3.1.1,-- Lemma 2.2.3,-- Axiom 3.9,-- Proposition 2.2.4,-- Exercise 3.2.2) for cross-reference.
Explicitness
Every binder, value, and identifier is spelled out; nothing is inferred silently.
- Only
{α : Type}/{α : Sort u}stay implicit. All other arguments become explicit(x : T)binders;[Inst]stays in bracket form. - Lift leading
∀and→into binders on every named binding (axiom/theorem/def/lemma/example/instance/have/let):∀ x, P x → Q xbecomes(x : T) (hp : P x) : Q x. Exception: keep∀in the conclusion of higher-order principles whose consumers unify against the∀-shape (induction schemes, choice, pointwise-equality lemmas fed torw/congrArg). - Always annotate
havewith a type:have hfoo : T := …. - No
_in applied or type-hole positions: spellcongrArg P h,Exists.intro a h,Eq.trans h₁ h₂. Reach for@only when unification genuinely can't resolve a type implicit; prefer dropping@and passing arguments positionally._is allowed only to discard names in binding patterns;?_only as arefinehole. - Numeric literals carry type ascriptions (
(3 : MyNat)). - Fully-qualified names only:
Eq.symm h,And.left h,Iff.mp h,Or.elim h f g,Or.inl x,And.intro hp hq,Exists.intro a h,MyFun.eval f x hx. No dot-projection (h.symm,f.eval), no leading-dot constructors (.inl x), no anonymous⟨…⟩in term position — not at the top of a term, not nested as an argument (Or.inr ⟨h1, h2⟩becomesOr.inr (And.intro h1 h2)). Destructor-pattern⟨…⟩inintro/rcases/let/funis fine. - Prefix proof hypotheses with
h:hle,hlt,hpos,hne,hxA,hfdom. Value binders keep their natural name (x,n,A,f). - No
show T from e; use(e : T)ascription.
Uniformity
Each inference move has one canonical form.
- Tactic whitelist. Every
byblock may use only:intro,have,let,exact,refine,rw,rcases,by_contra,by_cases,use,constructor,dsimp only [defs](theonlyform is mandatory),rfl. Everything else is banned, includingsimp,simp only,omega,decide,tauto,linarith,aesop,norm_num,ring,field_simp,push_neg,trivial,assumption,apply,show,exfalso,left,right,cases,obtain,rintro,match,change,subst,suffices,contradiction,ext,funext,unfold,<;>.refinesubsumesapply/left/right/exfalso;rcasessubsumescases/rintro/obtain. - No
▸: userworEq.mpr (congrArg P h) x.▸hides direction. - Macro bodies that implement a whitelisted tactic may use term-position
⟨…⟩andfirstwhere genuinely required for the expansion, but nothing else from the banned list. - Term-mode
match,let,funext,fun, and destructor-pattern⟨…⟩are not tactics and are unrestricted.
Articulation
Every logical joint is named and visible at the top level of its block.
- Name every hypothesis and intermediate fact (
hle,hlt,hpos, …); no anonymousthis. by_contraandby_casesalways name their hypothesis:by_contra hne,by_cases hx : x ∈ A.- No inline
byblocks for proofs and no proof-producingfunin applied position; lift to a namedhave hfoo : T := by …above. Single exception: afunpassed as a non-lifted argument that produces a value, a predicate, or a structure field (e.g.prop := fun x y => by …inside a struct literal, or thepargument tochoose_spec). Thebyblock inside such afuninherits the exception. - One tactic per line, one lemma per tactic:
rw [h1]thenrw [h2]on separate lines, notrw [h1, h2]; same fordsimp only. No;chaining. - Every subgoal opens with
·.
Layout
- Cap line length at ~100 characters; long lines usually mean a step is doing too much.
- Term RHS breaks:
:=ends the line, term on next indented line. Tactic RHS stays inline as:= by, tactics on subsequent lines. - Canonical signature layout for every named binding: name on line 1; one binder group per line indented 4, last binder line ends
:; conclusion on its own line indented 4, then:= byor:=. No inlinename : concl := body. - Canonical
matchlayout (term-mode):matchand its scrutinees on line 1,withat the end of that line (or on its own line if the scrutinee list wraps); each arm on its own line starting with|, indented to match thematchkeyword; arm RHS inline after=>for short terms, or broken to the next indented line for long ones.
License
The project is licensed under the MIT License. See LICENSE for details.
Acknowledgements
- The book itself: Terence Tao, Analysis I.
- The Lean and Mathlib communities for the language, the tooling, and for showing what a modern proof assistant can look like.
- AI assistance from GitHub Copilot, Google Gemini, and Anthropic Claude during formalization.