Bootstrap your knowledge of LLMs as quickly as possible.
Start with these neural-network resources before moving on to transformers.
- https://www.youtube.com/watch?v=aircAruvnKk&list=PLZHQObOWTQDNU6R1_67000Dx_ZCJB-3pi
- https://www.3blue1brown.com/topics/neural-networks
- http://neuralnetworksanddeeplearning.com/
- https://distill.pub/
- Andrej Karpathy The spelled-out intro to language modeling: building makemore: a basic bigram name-generator model, built first by counting and then with a neural network, using PyTorch.
- Andrej Karpathy Building makemore Part 2: MLP
- Andrej Karpathy Building makemore Part 3: Activations & Gradients, BatchNorm
- Andrej Karpathy Building makemore Part 4: Becoming a Backprop Ninja
- Hedu AI Visual Guide to Transformer Neural Networks - (Episode 1) Position Embeddings: Tokens are embedded into a semantic space; sine/cosine positional encoding is explained very well.
- Hedu AI Visual Guide to Transformer Neural Networks - (Episode 2) Multi-Head & Self-Attention: Clear overview of multi-head attention.
- Hedu AI Visual Guide to Transformer Neural Networks - (Episode 3) Decoder’s Masked Attention: Further details on the transformer architecture.
- Andrej Karpathy Let's build GPT: from scratch, in code, spelled out: builds a Shakespeare GPT-2-like model from scratch, starting with a bigram model and adding features one by one. PyTorch.
- Chris Olah CS25 I Stanford Seminar - Transformer Circuits, Induction Heads, In-Context Learning: Interpretability. A deep look into the mechanics of induction heads. Companion article.
- Jay Alammar The Illustrated Word2vec - A Gentle Intro to Word Embeddings in Machine Learning
- Jay Alammar How GPT3 Works - Easily Explained with Animations: an extremely high-level overview.
- Jay Alammar The Narrated Transformer Language Model: a much deeper look at the architecture; goes into detail. Companion article.
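The count-based bigram model from the first makemore video can be sketched in a few lines of PyTorch. This is a minimal illustration, not Karpathy's exact code; the tiny `words` list is a hypothetical stand-in for his `names.txt` dataset.

```python
import torch

# Hypothetical toy corpus standing in for Karpathy's names.txt.
words = ["emma", "olivia", "ava", "isabella", "sophia"]

chars = sorted(set("".join(words)))
stoi = {c: i + 1 for i, c in enumerate(chars)}
stoi["."] = 0  # '.' marks both the start and the end of a name
itos = {i: c for c, i in stoi.items()}

# Count how often each character follows each other character.
N = torch.zeros((len(stoi), len(stoi)), dtype=torch.int32)
for w in words:
    cs = ["."] + list(w) + ["."]
    for c1, c2 in zip(cs, cs[1:]):
        N[stoi[c1], stoi[c2]] += 1

# Row-normalize counts into next-character probabilities (add-one smoothing).
P = (N + 1).float()
P /= P.sum(dim=1, keepdim=True)

# Sample a new name by walking the bigram chain until '.' recurs.
g = torch.Generator().manual_seed(42)
ix = 0
out = []
while True:
    ix = torch.multinomial(P[ix], num_samples=1, generator=g).item()
    if ix == 0:
        break
    out.append(itos[ix])
print("".join(out))
```

The video then replaces the count table with a one-layer neural network trained by gradient descent, which converges to the same probabilities.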
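The sine/cosine positional encoding covered in the Hedu AI episode on position embeddings can be sketched as follows; this is a straightforward PyTorch rendering of the standard formula, with the sequence length and model width chosen arbitrarily for illustration.

```python
import torch

def sinusoidal_positions(seq_len: int, d_model: int) -> torch.Tensor:
    """Standard sinusoidal positional encodings:
    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    """
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)   # (seq_len, 1)
    i = torch.arange(0, d_model, 2, dtype=torch.float32)            # even dims
    angles = pos / (10000 ** (i / d_model))                         # (seq_len, d_model/2)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(angles)  # even columns get sine
    pe[:, 1::2] = torch.cos(angles)  # odd columns get cosine
    return pe

pe = sinusoidal_positions(seq_len=16, d_model=8)
```

These vectors are simply added to the token embeddings, giving each position a unique, smoothly varying signature across frequencies.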
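The masked (causal) self-attention described in Hedu AI's decoder episode and built step by step in Karpathy's GPT video reduces to a few tensor operations. A minimal single-head sketch, with shapes and random weights chosen only for illustration:

```python
import math
import torch

def causal_self_attention(x, Wq, Wk, Wv):
    """Single-head scaled dot-product attention with a causal mask.
    x: (T, d) token embeddings; Wq/Wk/Wv: (d, d_head) projections."""
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    T = x.shape[0]
    scores = (q @ k.T) / math.sqrt(k.shape[1])  # scaled dot products
    # Mask strictly-upper-triangular entries so position i cannot see j > i.
    mask = torch.triu(torch.ones(T, T, dtype=torch.bool), diagonal=1)
    scores = scores.masked_fill(mask, float("-inf"))
    weights = torch.softmax(scores, dim=-1)     # rows sum to 1
    return weights @ v, weights

torch.manual_seed(0)
d, d_head, T = 8, 4, 5
x = torch.randn(T, d)
out, w = causal_self_attention(
    x, torch.randn(d, d_head), torch.randn(d, d_head), torch.randn(d, d_head)
)
```

Multi-head attention runs several of these in parallel with independent projections and concatenates the results; the mask is what makes decoder-style (GPT) training autoregressive.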