Week 5: Neural Network Foundations

Building blocks of modern deep learning

Time Estimate: 3-4 hours

Topics Covered

Featured Speaker

Andrej Karpathy

Co-founder, OpenAI

Learn from industry leaders who are building the future of AI infrastructure and applications.

Video Resources

📹 Video content to be added.

Videos include keynotes, technical talks, and tutorials from industry leaders.

Reading Materials

📚 Reading list to be added.

Research papers, blog posts, and technical documentation.

🛠️ Hands-On Lab

Hardware-Aware Model Design

Difficulty: Advanced · Estimated time: 3 hours

Objective

Analyze transformer memory usage, implement memory-efficient attention, and understand hardware-software co-design.
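Before profiling, it helps to have a back-of-envelope number for why attention dominates activation memory: the score matrix grows quadratically with sequence length. A minimal sketch, assuming fp16 activations and GPT-2 small's 12 heads (the helper name is ours, not from any library):

```python
def attention_score_bytes(seq_len: int, n_heads: int, batch: int = 1,
                          bytes_per_elem: int = 2) -> int:
    """Bytes for one layer's full (batch, heads, seq, seq) attention-score
    matrix, assuming half precision (2 bytes per element)."""
    return batch * n_heads * seq_len * seq_len * bytes_per_elem

# GPT-2 small has 12 heads. At 1,024 tokens the scores alone take
# 12 * 1024^2 * 2 bytes = 24 MiB per layer; at 8,192 tokens, 1.5 GiB.
print(attention_score_bytes(1024, 12) / 2**20)   # 24.0 (MiB)
print(attention_score_bytes(8192, 12) / 2**30)   # 1.5 (GiB)
```

Numbers like these explain why the tiled-attention task below pays off at long sequence lengths.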

Prerequisites

  • PyTorch experience
  • Transformer architecture knowledge
  • Understanding of GPU memory hierarchy
  • Python profiling tools

Setup Instructions

  1. Use Google Colab Pro for A100 GPU (recommended)
  2. Install dependencies: pip install torch transformers memory_profiler
  3. Clone repo: git clone https://github.com/stanford-cs153/hw-aware-models

Tasks

  1. Profile memory usage of a transformer forward pass (GPT-2)
  2. Implement tiled attention to reduce memory footprint
  3. Compare inference speed across GPU generations (T4, V100, A100)
  4. Use roofline analysis to identify bottlenecks
  5. Optimize a model layer for your target hardware
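For task 2, one simple form of tiling processes queries in blocks so only a (tile, n) slice of the score matrix is live at once, cutting peak activation memory by roughly n / tile while remaining numerically exact. This NumPy sketch is ours, not the lab's reference solution, and it tiles only over queries; FlashAttention-style kernels additionally tile over keys with an online softmax:

```python
import numpy as np

def full_attention(q, k, v):
    """Reference attention: materializes the full (n, n) score matrix."""
    s = q @ k.T / np.sqrt(q.shape[-1])
    p = np.exp(s - s.max(axis=-1, keepdims=True))  # stable softmax
    p /= p.sum(axis=-1, keepdims=True)
    return p @ v

def tiled_attention(q, k, v, tile: int = 32):
    """Same result, but only a (tile, n) score block exists at a time."""
    out = np.empty_like(q)
    for i in range(0, q.shape[0], tile):
        s = q[i:i + tile] @ k.T / np.sqrt(q.shape[-1])
        p = np.exp(s - s.max(axis=-1, keepdims=True))
        p /= p.sum(axis=-1, keepdims=True)
        out[i:i + tile] = p @ v
    return out
```

Because each query row's softmax is independent, the tiled output matches the reference to floating-point precision; the design trade-off is extra kernel launches per tile in exchange for a smaller live working set.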
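For task 4, the roofline model classifies a kernel by comparing its arithmetic intensity (FLOPs per byte of memory traffic) against the machine balance (peak FLOPs divided by peak bandwidth). A minimal sketch, with hypothetical helper names and illustrative hardware numbers you would replace with your target GPU's specs:

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """FLOPs performed per byte of DRAM traffic."""
    return flops / bytes_moved

def roofline_bound(flops: float, bytes_moved: float,
                   peak_flops: float, peak_bw: float) -> str:
    """Return 'compute' or 'memory' depending on which roof limits the kernel."""
    ridge = peak_flops / peak_bw  # machine balance, in FLOPs/byte
    return "compute" if arithmetic_intensity(flops, bytes_moved) >= ridge else "memory"

# Illustrative: a decode-step GEMV moves every fp16 weight once and does
# ~2 FLOPs per weight, so its intensity is ~2 FLOPs/byte -- far below a
# modern GPU's ridge point of well over 100 FLOPs/byte, hence memory-bound.
print(roofline_bound(flops=2e9, bytes_moved=1e9,
                     peak_flops=3e14, peak_bw=1.5e12))  # memory
```

Kernels that land on the memory side of the ridge benefit from reducing traffic (fusion, tiling, lower precision) rather than from more math throughput.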

Resources