What It Is
TL;DR: Understand how LLMs really work by building one from scratch, based on nanochat, over 3 weeks - in a small group of technical founders, guided by an ML expert.
Nanochat, created by Andrej Karpathy, is an educational LLM codebase designed to teach the fundamentals by doing. In this sprint, you’ll build and train your own GPT-2–level model from scratch, with live guidance from an ML expert at every step.
You’ll work through the full stack of LLM development - from data preparation and tokenization to pretraining, fine-tuning, evals, inference, and a chat UI - while learning the theory behind each stage. By the end, you won’t just have run an LLM pipeline; you’ll understand it from first principles.
Who It's For
Technical founders and operators who want to:
- Understand the foundations of LLMs hands-on, not just in theory
- Get a taste of the core machine learning concepts needed to train a model
- Hack with peers on Andrej Karpathy’s capstone project
- Commit ~5 hrs/week for 3 weeks
What You'll Walk Away With
By the end of the sprint, participants will:
- Deepen understanding of LLM foundations by building one from scratch
- Explain and rerun every stage of a modern LLM training stack: data preparation → tokenization → pretraining → fine-tuning → inference
- Understand the real constraints: data, hardware, cost, model parameters, and optimization tradeoffs
- Understand how training works in detail
- Make informed decisions about when to fine-tune, when to use RAG, and when to combine both
- Deploy a working chat UI connected to weights you trained
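To preview what “understanding how training works” means at its core: pretraining is next-token prediction at scale. The sketch below is purely illustrative (it is not nanochat code): it trains a counts-based bigram model over characters, the simplest possible next-token predictor. Nanochat does the same kind of prediction, but with a Transformer over BPE tokens.

```python
from collections import defaultdict

def train_bigram(text):
    """Count character transitions: a minimal stand-in for
    next-token prediction, the objective behind LLM pretraining."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(text, text[1:]):
        counts[a][b] += 1
    return counts

def predict_next(counts, ch):
    """Greedy decoding: return the most frequent next character."""
    options = counts.get(ch)
    if not options:
        return None
    return max(options, key=options.get)

corpus = "hello hello hello world"
model = train_bigram(corpus)
```

A real LLM replaces the count table with a neural network trained by gradient descent, but the objective - predict what comes next - is the same.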
Instructor
From a Data Founders interview with Fabian in 2020
Fabian Blaicher-Brown
- Background in Computer Science, Robotics, Information Systems, and Computer Vision Research.
- Co-founder and CTO of AI startup Shipamax (YC W17, acquired by WiseTech Global).
- Most recently Global Head of Data Science, AI, and ML at WiseTech, leading a team of 70+.
Why Nanochat?
We chose nanochat as the artifact to build because:
- Who does not love Andrej Karpathy’s content?
- It provides a complete pipeline to train a small, functional ChatGPT-style model - covering tokenization, pretraining, fine-tuning, and evaluation - on GPUs in the cloud (~$100 in compute)
Key Aspects of Nanochat:
- Purpose: Acts as a "recipe" for building and training your own LLM, offering a hands-on, educational approach to understand the full stack, including architecture and cost.
- Scope: It covers the entire training lifecycle: tokenization, pretraining on data, supervised fine-tuning (SFT), evaluation on benchmarks, and an inference engine with a UI.
- Capabilities: It enables training a GPT-2–style model (around 26 layers) in roughly 3 hours on an 8xH100 GPU node, making it an affordable way to understand LLM training.
- Repo
- Docs
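As a taste of the first stage in that pipeline, tokenization: nanochat trains a byte-pair-encoding (BPE) tokenizer, which starts from raw bytes and repeatedly merges the most frequent adjacent pair of tokens into a new token. A minimal, self-contained sketch of the merge loop (illustrative only; function names like `most_common_pair` are ours, not nanochat’s API):

```python
from collections import Counter

def most_common_pair(ids):
    """Count adjacent token pairs and return the most frequent one."""
    pairs = Counter(zip(ids, ids[1:]))
    return max(pairs, key=pairs.get)

def merge(ids, pair, new_id):
    """Replace every occurrence of `pair` in `ids` with `new_id`."""
    out, i = [], 0
    while i < len(ids):
        if i < len(ids) - 1 and (ids[i], ids[i + 1]) == pair:
            out.append(new_id)
            i += 2
        else:
            out.append(ids[i])
            i += 1
    return out

# Start from raw UTF-8 bytes and perform a few merges.
text = "aaabdaaabac"
ids = list(text.encode("utf-8"))
vocab_size = 256  # byte values 0-255 are the initial vocabulary
for _ in range(3):
    pair = most_common_pair(ids)
    ids = merge(ids, pair, vocab_size)
    vocab_size += 1
```

Each merge grows the vocabulary by one token and shortens the sequence; a real tokenizer also records the merge order so that new text can be encoded with the same rules.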
Format
- 3 weeks
- 3 live lectures (1 hour each)
- 3 office hours (1 hour each)
- Cohort size: 20-30 participants
- Async homework between calls: assignments and optional reading
- WhatsApp chat for Q&A and troubleshooting
Syllabus
Week 1: Getting Started with Nanochat – Data Preparation and Hardware (~5hrs)
Week 2: The Meat – Let's Train Our Model (~5hrs)
Week 3: Finalizing – Inference (~5hrs)
How to Sign Up
Fill in a quick form and we’ll be in touch in ~24h.