Lean Formalization of Generalization Error Bound by Rademacher Complexity and Dudley's Entropy Integral
Abstract
Understanding and certifying the generalization performance of machine learning algorithms---i.e. obtaining theoretical estimates of the test error from a finite training sample---is a central theme of statistical learning theory. Among the many complexity measures used to derive such guarantees, Rademacher complexity yields sharp, data-dependent bounds that apply well beyond classical
Major updated:
(2026 Jan) We have formalized Dudley's entropy integral bound for Rademacher complexity for the first time.
(2026 Feb) We have formalized Lasso, or
How to Run
- Open a terminal. Run the following commands.
git clone https://github.com/auto-res/lean-rademacher.git cd lean-rademacher # get Mathlib4 cache lake exe cache get - Launch VS code,
- open the folder
lean-rademacher, - select the file
FoML/Main.lean, and - push
Restart Filebutton to rebuild the project.
Contents (selected)
Key theorems (resp. definitions) are gathered in Main.lean (resp. Defs.lean), e.g.
FoML.Main.uniform_deviation_tail_bound_separable- (Main Theorem) Generalization error bound using Rademacher complexity
FoML.Defs.empiricalRademacherComplexityet al.- Definition(s) of Rademacher complexity
FoML.Main.uniform_deviation_mcdiarmid_tail- McDiarmid inequality (for deviations)
FoML.Main.linear_predictor_l2_bound- Example: Generalization error bound for ridge regression (linear regression with
-regularization)
- Example: Generalization error bound for ridge regression (linear regression with
FoML.Main.linear_predictor_l1_bound- Example: Generalization error bound for lasso regression (linear regression with
-regularization)
- Example: Generalization error bound for lasso regression (linear regression with
FoML.Main.dudley_entropy_integral_bound- Dudley's entropy integral bound for Rademacher complexity
Future plans
Contributors are always welcome! (Contact: Discord)
- Examples of generalization error bounds such as
- for RKHS
- Examples of covering numbers
(of a function sets w.r.t. sup-norm or empirical-norm to instantiate Dudley's entropy bound) such as - the unit ball of Lipschitz-continuous functions on a compact set
- neural networks with bounded weights
- the unit ball of Lipschitz-continuous functions on a compact set
- Brushing-up key definitions/inequalies such as Rademacher complexity, Dudley's entropy bound, Azuma-Hoeffding, McDiarmid, ...
Contributors
Kei Tsukamoto, Kazumi Kasaura, Naoto Onda, Yuma Mizuno, Sho Sonoda