Week 6: The Transformer Revolution

Understanding the architecture that powers GPT, BERT, and modern LLMs

Time Estimate: 3-4 hours

Topics Covered

Featured Speaker

Andrej Karpathy

Founding member, OpenAI

Learn from industry leaders who are building the future of AI infrastructure and applications.

Video Resources

Videos include keynotes, technical talks, and tutorials from industry leaders.

Reading Materials

Research papers, blog posts, and technical documentation.

🛠️ Hands-On Lab

Build a Transformer from Scratch

Level: Intermediate · Time: 4 hours

Objective

Implement multi-head attention and train a character-level language model to deepen your understanding of transformer internals.

Prerequisites

  • PyTorch fundamentals
  • Understanding of attention mechanisms
  • Python 3.8+
  • Linear algebra basics
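As a quick refresher on the attention prerequisite, here is a minimal scaled dot-product attention in NumPy. The shapes and function names are illustrative, not the lab's required API:

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max before exponentiating for numerical stability
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    # Q, K, V: (seq_len, d_k); attention weights: (seq_len, seq_len)
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)   # similarity of each query to each key
    weights = softmax(scores, axis=-1)  # rows sum to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
Q = rng.standard_normal((4, 8))
K = rng.standard_normal((4, 8))
V = rng.standard_normal((4, 8))
out, w = scaled_dot_product_attention(Q, K, V)
print(out.shape)  # (4, 8)
```

If each line of this sketch is familiar, you have the background the lab assumes.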

Setup Instructions

  1. Open Google Colab with GPU runtime
  2. Install dependencies: pip install torch numpy bertviz
  3. Download Shakespeare dataset: wget https://raw.githubusercontent.com/karpathy/char-rnn/master/data/tinyshakespeare/input.txt
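Once input.txt is downloaded, the first step is usually a character-level codec. A sketch, with a stand-in string so it runs anywhere (the variable names are assumptions, not required by the lab):

```python
# Stand-in corpus; in the lab: text = open('input.txt').read()
text = "First Citizen: Before we proceed any further, hear me speak."

chars = sorted(set(text))                      # vocabulary: unique characters
stoi = {ch: i for i, ch in enumerate(chars)}   # char -> integer id
itos = {i: ch for ch, i in stoi.items()}       # integer id -> char
encode = lambda s: [stoi[c] for c in s]
decode = lambda ids: ''.join(itos[i] for i in ids)
```

The full Shakespeare corpus yields a vocabulary of a few dozen characters, which keeps the model's embedding and output layers small.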

Tasks

  1. Implement multi-head self-attention from scratch
  2. Build a complete transformer decoder block
  3. Train on Shakespeare text (character-level)
  4. Visualize attention patterns with BertViz
  5. Generate text samples and analyze quality
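For task 1, one possible shape of a causal multi-head self-attention module in PyTorch. This is a sketch under illustrative hyperparameters, not the lab's reference solution:

```python
import math
import torch
import torch.nn as nn
import torch.nn.functional as F

class MultiHeadSelfAttention(nn.Module):
    def __init__(self, d_model=64, n_heads=4, block_size=32):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.qkv = nn.Linear(d_model, 3 * d_model)  # fused Q, K, V projection
        self.proj = nn.Linear(d_model, d_model)     # output projection
        # Causal mask: position i may only attend to positions <= i
        mask = torch.tril(torch.ones(block_size, block_size)).bool()
        self.register_buffer("mask", mask)

    def forward(self, x):
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        # Reshape each to (B, n_heads, T, d_head)
        q, k, v = (t.view(B, T, self.n_heads, self.d_head).transpose(1, 2)
                   for t in (q, k, v))
        att = (q @ k.transpose(-2, -1)) / math.sqrt(self.d_head)
        att = att.masked_fill(~self.mask[:T, :T], float("-inf"))
        att = F.softmax(att, dim=-1)
        out = (att @ v).transpose(1, 2).contiguous().view(B, T, C)
        return self.proj(out)

x = torch.randn(2, 10, 64)              # (batch, sequence, d_model)
y = MultiHeadSelfAttention()(x)
print(y.shape)  # torch.Size([2, 10, 64])
```

A decoder block for task 2 wraps this module with residual connections, layer norms, and a feed-forward sublayer; the causal mask is what lets the trained model generate text left to right.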

Resources