Why This Book Exists
You've used ChatGPT. But do you really understand how it works?
Most Transformer tutorials are either too academic (piling up formulas without building intuition) or too superficial (teaching you to call APIs without explaining the underlying principles).
The LLM Transformer Book bridges that gap.
What You'll Learn
First Principles
- What is GPT — The history of LLMs and core concepts
- Tokenization — How text becomes numbers (see the sketch after this list)
- Embeddings — The geometric meaning of word vectors
- Positional Encoding — Why sequence order matters
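To make those first three steps concrete, here is a minimal PyTorch sketch of the text-to-vectors pipeline. It is an illustrative toy, not code from the book: raw UTF-8 bytes stand in for a real tokenizer (actual GPTs use learned BPE vocabularies), and the embedding is randomly initialized rather than trained.

```python
import math
import torch

text = "hello"
ids = torch.tensor([list(text.encode("utf-8"))])   # (1, 5): text as token ids

d_model = 16
emb = torch.nn.Embedding(256, d_model)             # one learnable vector per byte
tok_vectors = emb(ids)                             # (1, 5, 16)

# Sinusoidal positional encoding from "Attention Is All You Need":
# even dimensions get sin, odd get cos, at geometrically spaced frequencies.
pos = torch.arange(ids.shape[1]).unsqueeze(1)      # (5, 1) positions 0..4
freq = torch.exp(torch.arange(0, d_model, 2) * (-math.log(10000.0) / d_model))
pe = torch.zeros(ids.shape[1], d_model)
pe[:, 0::2] = torch.sin(pos * freq)
pe[:, 1::2] = torch.cos(pos * freq)

x = tok_vectors + pe    # token identity + position: what the first layer sees
print(x.shape)          # torch.Size([1, 5, 16])
```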
Deep Dive into Attention
- Self-Attention — The core innovation of Transformers (sketched in code after this list)
- Query, Key, Value — Geometric intuition for attention
- Multi-Head Attention — Why multiple perspectives matter
- Masked Attention — The secret behind autoregressive generation
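As a taste of what these chapters build up to, here is a minimal single-head causal self-attention in PyTorch. It is a sketch, not the book's implementation: the projection matrices are random placeholders, and real models add multiple heads and an output projection.

```python
import math
import torch
import torch.nn.functional as F

def causal_self_attention(x, w_q, w_k, w_v):
    """Single-head masked self-attention; x is (batch, seq, d_model)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v               # project into Q, K, V
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
    # Causal mask: position i may only attend to positions <= i.
    seq = x.shape[1]
    mask = torch.triu(torch.ones(seq, seq, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v              # weighted average of values

d_model = 16
x = torch.randn(1, 5, d_model)
w_q, w_k, w_v = (torch.randn(d_model, d_model) / math.sqrt(d_model) for _ in range(3))
print(causal_self_attention(x, w_q, w_k, w_v).shape)  # torch.Size([1, 5, 16])
```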
Complete Architecture
- Layer Normalization — The key to stable training
- Feed-Forward Networks — The underrated component
- Residual Connections — Making depth possible
- Full Forward Pass — From input to output, step by step (see the block sketch below)
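Putting those four pieces together, a minimal pre-norm Transformer block might look like the sketch below. It is an illustrative single-head toy (the class name `Block` and all sizes are arbitrary), not the book's code, but it shows how normalization, attention, the feed-forward network, and residual connections compose in one forward pass.

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class Block(nn.Module):
    """Pre-norm block: x = x + Attn(LN(x)); then x = x + FFN(LN(x))."""
    def __init__(self, d_model=64):
        super().__init__()
        self.ln1 = nn.LayerNorm(d_model)
        self.qkv = nn.Linear(d_model, 3 * d_model)   # fused Q, K, V projection
        self.proj = nn.Linear(d_model, d_model)
        self.ln2 = nn.LayerNorm(d_model)
        self.ffn = nn.Sequential(                    # position-wise feed-forward
            nn.Linear(d_model, 4 * d_model),
            nn.GELU(),
            nn.Linear(4 * d_model, d_model),
        )

    def attention(self, x):                          # single head, causal
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        scores = q @ k.transpose(-2, -1) / math.sqrt(q.shape[-1])
        seq = x.shape[1]
        mask = torch.triu(torch.ones(seq, seq, dtype=torch.bool), diagonal=1)
        weights = F.softmax(scores.masked_fill(mask, float("-inf")), dim=-1)
        return self.proj(weights @ v)

    def forward(self, x):
        x = x + self.attention(self.ln1(x))          # residual connection 1
        x = x + self.ffn(self.ln2(x))                # residual connection 2
        return x

print(Block()(torch.randn(1, 5, 64)).shape)          # torch.Size([1, 5, 64])
```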
Production Optimization
- Flash Attention — IO-aware attention computation
- KV Cache — Core technique for inference speedup (illustrated below)
- Quantization — Making models smaller and faster
- Distributed Training — Breaking single-GPU limits
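To illustrate the KV cache item above: during autoregressive decoding, each token's keys and values are computed once and stored, so every new step only needs the newest token's query instead of re-running attention over the whole prefix. A toy single-head sketch with random weights (illustrative only; real caches are per-layer, per-head tensors):

```python
import math
import torch
import torch.nn.functional as F

d = 16
w_q, w_k, w_v = (torch.randn(d, d) / math.sqrt(d) for _ in range(3))

k_cache, v_cache = [], []
for step in range(5):                     # stand-in for the decoding loop
    x_new = torch.randn(1, 1, d)          # embedding of the newest token only
    k_cache.append(x_new @ w_k)           # compute K and V once, reuse forever
    v_cache.append(x_new @ w_v)
    k = torch.cat(k_cache, dim=1)         # (1, step + 1, d)
    v = torch.cat(v_cache, dim=1)
    q = x_new @ w_q                       # only one query per step
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)
    out = F.softmax(scores, dim=-1) @ v   # no mask needed: cache holds only the past
print(out.shape)                          # torch.Size([1, 1, 16])
```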
2024-2025 Frontiers
- RLHF — Reinforcement Learning from Human Feedback
- Mixture of Experts — The sparse activation revolution (see the toy router below)
- Reasoning Models — o1/o3 and DeepSeek R1
- Post-Transformer Architectures — Mamba and State Space Models
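For a flavor of sparse activation, here is a toy top-2 mixture-of-experts layer. It is a sketch under simplifying assumptions (`TinyMoE` and every size are made up; production routers add load-balancing losses, capacity limits, and fused expert dispatch):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyMoE(nn.Module):
    """Each token is routed to its top-k experts; outputs are gate-weighted."""
    def __init__(self, d_model=16, n_experts=4, k=2):
        super().__init__()
        self.k = k
        self.gate = nn.Linear(d_model, n_experts)    # learned router
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                            # x: (tokens, d_model)
        weights, idx = self.gate(x).topk(self.k, dim=-1)
        weights = F.softmax(weights, dim=-1)         # mix over the chosen k
        out = torch.zeros_like(x)
        for slot in range(self.k):
            for e, expert in enumerate(self.experts):
                hit = idx[:, slot] == e              # tokens routed to expert e
                if hit.any():                        # only those tokens run it
                    out[hit] += weights[hit, slot, None] * expert(x[hit])
        return out

print(TinyMoE()(torch.randn(8, 16)).shape)           # torch.Size([8, 16])
```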
What Makes This Book Different
Intuition First, Formulas Second
Every chapter builds intuition before showing equations. Once you have the intuition, formulas are just precise descriptions.
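For example, once you see attention as a similarity-weighted average of values, the canonical formula from "Attention Is All You Need" reads as exactly that description:

```latex
\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\!\left(\frac{QK^{\top}}{\sqrt{d_k}}\right)V
```

Here QK^T scores how similar each query is to each key, the sqrt(d_k) factor keeps those scores in a numerically friendly range, and the softmax turns them into the averaging weights applied to V.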
Runnable Code
Hand-written implementations from scratch—not just calling nn.MultiheadAttention. Code you can write yourself is code you truly understand.
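For contrast, here is the one-liner the book deliberately does not stop at, shown with PyTorch's built-in layer and arbitrary toy sizes; rebuilding what this single call hides is the point of the chapters above.

```python
import torch
import torch.nn as nn

# QKV projections, multi-head splitting, softmax, masking hooks, and the
# output projection all live behind this one call.
mha = nn.MultiheadAttention(embed_dim=64, num_heads=4, batch_first=True)
x = torch.randn(1, 5, 64)
out, _ = mha(x, x, x)
print(out.shape)   # torch.Size([1, 5, 64])
```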
Continuously Updated
Covers the latest developments from 2024-2025, including OpenAI o1/o3, DeepSeek R1, and Flash Attention 2/3.
Who Should Read This
- Developers who've used ChatGPT and want to understand the internals
- Anyone who's read Transformer introductions but still feels confused
- Practitioners who want to implement GPT from scratch
- ML engineers who need a quick-reference guide
Start Reading
This book is completely free to read online:
Read the LLM Transformer Book →
This book originated from my Transformer video series on Bilibili, reorganized with additional details, corrections, and coverage of 2024-2025 developments.