Tokenization — The Step Before the First Gradient Update
Tokenization is the invisible foundation of every LLM. This post traces how it evolved from simple word splitting to BPE, WordPiece, Unigram, and SentencePiece — and why the choice of tokenizer shapes everything downstream, from vocabulary size and sequence length to multilingual coverage and inference cost.

