Lesson 1 — Random Forests
Set up Jupyter + the fastai library, load the Blue Book for Bulldozers Kaggle dataset, and train your first random forest. The first taste of a working ML pipeline.
Lesson 2 — Random Forest Deep Dive
How a random forest actually works under the hood, plus the validation strategies that prevent you from fooling yourself.
Lesson 3 — Performance, Validation, and Model Interpretation
How to read what the model is telling you about your data — feature importance, partial dependence, the things that matter once a model works.
Lesson 4 — Feature Importance, Tree Interpreter
Going deeper into model interpretation. Confidence intervals on predictions. The tree interpreter for explaining individual rows.
Lesson 5 — Extrapolation and RF from Scratch
Where random forests fail (extrapolation), and how to build one yourself from NumPy primitives so you understand it in your bones.
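The extrapolation failure mentioned above can be seen in a few lines. What follows is a minimal sketch, not the course's implementation: it uses plain Python instead of NumPy, and the names `fit_tree`, `fit_forest`, and `predict_forest` as well as the toy data are assumptions made for illustration. Each leaf predicts the mean of the training targets that landed in it, so no prediction can exceed the largest target seen in training.

```python
import random

def fit_tree(xs, ys, min_leaf=2):
    """Fit a 1-D regression tree. Returns a float (leaf) or (threshold, left, right)."""
    if len(xs) < 2 * min_leaf:
        return sum(ys) / len(ys)
    # Sort the points by x so every candidate split is a cut between neighbours.
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    xs = [xs[i] for i in order]
    ys = [ys[i] for i in order]
    best = None
    for i in range(min_leaf, len(xs) - min_leaf + 1):
        if xs[i] == xs[i - 1]:
            continue  # cannot split between identical x values
        left, right = ys[:i], ys[i:]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = sum((y - left_mean) ** 2 for y in left) + sum((y - right_mean) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, i)
    if best is None:
        return sum(ys) / len(ys)
    i = best[1]
    threshold = (xs[i - 1] + xs[i]) / 2
    return (threshold, fit_tree(xs[:i], ys[:i], min_leaf), fit_tree(xs[i:], ys[i:], min_leaf))

def predict(tree, x):
    """Walk down the tree until a leaf value is reached."""
    while isinstance(tree, tuple):
        threshold, left, right = tree
        tree = left if x <= threshold else right
    return tree

def fit_forest(xs, ys, n_trees=10, seed=0):
    """Train each tree on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    trees = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(xs)) for _ in xs]
        trees.append(fit_tree([xs[i] for i in idx], [ys[i] for i in idx]))
    return trees

def predict_forest(trees, x):
    """Average the predictions of all trees."""
    return sum(predict(t, x) for t in trees) / len(trees)

# Toy data (assumed): a perfectly linear trend, y = 2x, observed for x in 0..19.
train_x = [float(i) for i in range(20)]
train_y = [2.0 * x for x in train_x]
trees = fit_forest(train_x, train_y)

inside = predict_forest(trees, 10.0)    # within the training range
outside = predict_forest(trees, 100.0)  # far beyond it; the true value would be 200
print(inside, outside, max(train_y))
```

However far to the right the query point moves, the forest keeps returning the same value, capped by the largest training target, which is why a trend that continues past the training data is missed.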
Lesson 6 — Data Products and Live Coding
From a model in a notebook to a data product in production. Live coding with the Rossmann dataset.
Lesson 7 — RF from Scratch + Gradient Descent
Finishing the from-scratch random forest, then pivoting to the engine behind deep learning: gradient descent.
Lesson 8 — Gradient Descent and Logistic Regression
Logistic regression as a one-layer neural net. SGD, learning rates, the actual mechanics of training.
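To make "logistic regression as a one-layer neural net" concrete, here is a minimal sketch in plain Python, not the course's code. The function name `train_logreg`, the toy data, and the hyperparameters (`lr=0.5`, `epochs=200`) are assumptions chosen for illustration. The model is a single linear layer followed by a sigmoid, and SGD updates the weights one example at a time.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(w, b, x):
    """One linear layer plus a sigmoid: the whole 'network'."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def train_logreg(data, lr=0.5, epochs=200, seed=0):
    """Fit weights with stochastic gradient descent on the log loss."""
    rng = random.Random(seed)
    w = [0.0] * len(data[0][0])
    b = 0.0
    data = list(data)
    for _ in range(epochs):
        rng.shuffle(data)  # visit the examples in a fresh random order each epoch
        for x, y in data:
            # For log loss with a sigmoid output, the gradient with respect
            # to the pre-activation is simply (prediction - label).
            err = forward(w, b, x) - y
            w = [wi - lr * err * xi for wi, xi in zip(w, x)]
            b -= lr * err
    return w, b

# Toy, linearly separable data (assumed): label 1 when both features are large.
data = [
    ((0.0, 0.0), 0), ((0.2, 0.3), 0), ((0.1, 0.4), 0),
    ((1.0, 1.0), 1), ((0.9, 0.7), 1), ((0.8, 1.0), 1),
]
w, b = train_logreg(data)
predictions = [int(forward(w, b, x) > 0.5) for x, _ in data]
print(w, b, predictions)
```

The learning rate `lr` scales every step: too small and training crawls, too large and the updates overshoot. That trade-off is the mechanic the lesson description refers to.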
Lesson 9 — Regularization, Learning Rates, and NLP
Weight decay, the magic of finding a good learning rate, and the leap into natural-language processing.
Lesson 10 — More NLP, Columnar Data
Continuing the NLP thread alongside techniques for columnar data — the bread and butter of most real ML work.
Lesson 11 — Embeddings
What embeddings actually are, how they're trained, and why they're among the most important ideas in modern ML.
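One common way to describe what an embedding is: a row lookup in a weight matrix, which gives the same result as multiplying a one-hot vector by that matrix. The sketch below illustrates that equivalence in plain Python. The matrix values and the helper names `one_hot` and `vec_times_matrix` are made up for illustration and are not taken from the course.

```python
# Hypothetical embedding matrix: 3 categories, each mapped to a 2-dimensional vector.
# In a real model these numbers are weights learned by gradient descent.
embedding = [
    [0.1, 0.2],
    [0.3, 0.4],
    [0.5, 0.6],
]

def one_hot(index, size):
    """Vector of zeros with a single 1.0 at position `index`."""
    return [1.0 if j == index else 0.0 for j in range(size)]

def vec_times_matrix(vec, matrix):
    """Multiply a row vector by a matrix (list of rows)."""
    n_cols = len(matrix[0])
    return [sum(v * row[c] for v, row in zip(vec, matrix)) for c in range(n_cols)]

category = 1
by_lookup = embedding[category]                                           # direct row lookup
by_matmul = vec_times_matrix(one_hot(category, len(embedding)), embedding)  # one-hot product
print(by_lookup, by_matmul)
```

Because the two routes agree, an embedding layer can be trained like any other linear layer while being computed as a cheap index operation.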
Lesson 12 — Complete Rossmann, Ethical Issues
Putting the full Rossmann competition pipeline together, then closing on ethics — bias, fairness, what models can quietly do to the world.