Week 2: Silicon Architecture – GPUs, TPUs, and Custom Chips

Understanding the hardware that powers AI at scale

Time Estimate: 3-4 hours

Topics Covered

Featured Speaker

Jensen Huang

CEO & Founder, NVIDIA

Learn from industry leaders who are building the future of AI infrastructure and applications.

Video Resources

📹 Video content to be added.

Videos include keynotes, technical talks, and tutorials from industry leaders.

Reading Materials

📚 Reading list to be added.

Research papers, blog posts, and technical documentation.

🛠️ Hands-On Lab

Design a GPU Cluster Network

Difficulty: Intermediate
Time Estimate: 3 hours

Objective

Design a Clos network topology for a 256-GPU cluster and understand datacenter networking fundamentals.

Prerequisites

  • Basic networking knowledge (IP, bandwidth)
  • Understanding of network topologies
  • Spreadsheet or diagramming tool

Setup Instructions

  1. Install draw.io or use Lucidchart
  2. Review Clos network topology basics
  3. Open the provided network calculator spreadsheet

Tasks

  1. Design a 3-tier Clos network for 256 GPUs with 8 GPUs per server (32 servers)
  2. Calculate bisection bandwidth requirements
  3. Compare InfiniBand (400 Gbps) vs RoCE (100 Gbps) costs
  4. Estimate switch count and cabling requirements
  5. Calculate total network cost and identify bottlenecks
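
The sizing arithmetic behind tasks 1, 2, 4, and 5 can be sketched in a few lines of Python. This is a rough starting point rather than a reference design, and it rests on several assumptions that are not part of the lab text: one network port per GPU, a non-blocking folded Clos (fat tree) built from identical switches, half of each leaf and aggregation switch's ports facing down and half facing up, and uplinks that spread evenly across the core layer. The switch radix of 16, the names `ClosDesign`, `size_clos`, and `network_cost`, and any prices passed in are all hypothetical; substitute the values from your own spreadsheet.

```python
import math
from dataclasses import dataclass


@dataclass
class ClosDesign:
    servers: int
    leaf: int            # tier 1 switches (connect to GPU ports)
    agg: int             # tier 2 switches
    core: int            # tier 3 switches
    cables: int          # GPU-leaf + leaf-agg + agg-core links
    bisection_tbps: float

    @property
    def switches(self) -> int:
        return self.leaf + self.agg + self.core


def size_clos(gpus: int, gpus_per_server: int, radix: int, link_gbps: float) -> ClosDesign:
    """Estimate a non-blocking 3-tier fat tree; assumes one port per GPU."""
    if radix < 4 or radix % 2:
        raise ValueError("radix must be an even number >= 4")
    if gpus > radix ** 3 // 4:
        raise ValueError("too many endpoints for a 3-tier fat tree of this radix")
    half = radix // 2                           # ports facing down (and up) per switch
    servers = math.ceil(gpus / gpus_per_server)
    leaf = math.ceil(gpus / half)               # each leaf serves `half` GPU ports
    pods = math.ceil(leaf / half)               # each pod holds `half` leaf switches
    agg = pods * half                           # and `half` aggregation switches
    core = math.ceil(agg * half / radix)        # core switches use all ports downward
    cables = gpus + leaf * half + agg * half
    # Non-blocking fabric: half of the endpoints can send to the other half at line rate.
    bisection_tbps = (gpus / 2) * link_gbps / 1000
    return ClosDesign(servers, leaf, agg, core, cables, bisection_tbps)


def network_cost(design: ClosDesign, switch_price: float, cable_price: float) -> float:
    """Switch and cable cost only; NICs and optics are left out of this sketch."""
    return design.switches * switch_price + design.cables * cable_price


# Illustrative comparison of the two link speeds from task 3 (radix 16 is hypothetical).
for name, gbps in [("InfiniBand", 400), ("RoCE", 100)]:
    d = size_clos(gpus=256, gpus_per_server=8, radix=16, link_gbps=gbps)
    print(name, d.servers, "servers,", d.switches, "switches,",
          d.cables, "cables,", d.bisection_tbps, "Tbps bisection")
```

Because the switch and cable counts depend only on the port count and radix, the two options differ here only in bisection bandwidth; the cost gap in task 3 comes entirely from the per-switch and per-cable prices you supply to `network_cost`. Comparing the result against a design with oversubscribed uplinks is one way to start on the bottleneck question in task 5.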

Resources