WEEK 3: Deep Learning Fundamentals & Tokenization

This week we dive into the foundations of modern NLP. On Monday we'll explore how neural networks learn through backpropagation and gradient descent. Tuesday's discussion gets you hands-on with PyTorch. Then on Wednesday we'll see how text is split into tokens - a critical step that affects everything downstream.

This week's checklist (due Friday 2/6)

  • Attend Lecture 3 (Mon, Feb 3): Deep learning fundamentals
  • Attend Discussion Section (Tue, Feb 4): PyTorch hands-on
  • Attend Lecture 4 (Wed, Feb 5): Tokenization
  • Complete Week 3 Reflection and Lab, pushed to GitHub

This week's learning objectives

After Lecture 3 (Mon 2/3) students will be able to...

Neural Networks:

  • Explain how neural networks transform inputs through layers of weighted sums and activations
  • Understand backpropagation as efficient application of the chain rule
  • Describe gradient descent and how it minimizes loss functions
  • Recognize why depth matters: hierarchical feature learning
  • Identify the sequence modeling challenge for feed-forward networks
  • Discuss the computational and environmental costs of training large models

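To make gradient descent and the chain rule concrete, here is a tiny illustrative sketch (not course-provided code): a one-parameter model fit with squared-error loss, where the gradient is derived by hand.

```python
# Gradient descent on a one-parameter model y_hat = w * x,
# fitting points that lie on y = 2x with squared-error loss.
# The gradient comes from the chain rule:
#   L = (w*x - y)^2  =>  dL/dw = 2 * (w*x - y) * x

def loss(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def grad(w, data):
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # points on y = 2x
w = 0.0    # start far from the true weight
lr = 0.05  # learning rate (step size)

for step in range(100):
    w -= lr * grad(w, data)  # step downhill on the loss surface

print(round(w, 3))  # converges toward the true weight, 2.0
```

Backpropagation is this same chain-rule bookkeeping applied layer by layer through a deep network, which is what makes training many weights at once tractable.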
After Lecture 4 (Wed 2/5) students will be able to...

Tokenization:

  • Explain why tokenization choices affect model behavior (e.g., why LLMs struggle to count letters)
  • Describe historical approaches: word-level, stemming, lemmatization
  • Explain how subword tokenization (BPE, WordPiece) handles vocabulary challenges
  • Understand the role of special tokens in chat models (system, user, assistant)
  • Use tokenizer tools to see how models "see" text
  • Discuss fairness implications of tokenization across languages
  • Preview: understand that tokens become embeddings (vectors that capture meaning)
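As a preview of how subword tokenizers are built, here is a heavily simplified sketch of BPE training: start from characters and repeatedly merge the most frequent adjacent pair. Real implementations (tiktoken, Hugging Face tokenizers) differ in many details; this is only to show the core idea.

```python
from collections import Counter

def most_frequent_pair(tokens):
    """Count adjacent token pairs and return the most frequent one."""
    pairs = Counter(zip(tokens, tokens[1:]))
    return max(pairs, key=pairs.get)

def merge_pair(tokens, pair):
    """Replace every occurrence of `pair` with one merged token."""
    merged, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == pair:
            merged.append(tokens[i] + tokens[i + 1])
            i += 2
        else:
            merged.append(tokens[i])
            i += 1
    return merged

# Start from characters and apply a few merges, as BPE training does.
tokens = list("low lower lowest")
for _ in range(3):
    tokens = merge_pair(tokens, most_frequent_pair(tokens))
print(tokens)  # frequent fragments like "low" fuse into single tokens
```

After a few merges, the common stem "low" becomes one token while rarer suffixes stay split - exactly the behavior that lets subword vocabularies cover unseen words.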

Discussion Section (Tue 2/4): PyTorch Hands-On

Please bring your laptop to discussion! You will be coding during the class.

What you'll do:

  1. Get PyTorch installed and running (if not already)
  2. Build a simple neural network from scratch
  3. Train it on a toy task (e.g., simple classification)
  4. Experiment with different architectures: more layers, different activations
  5. See backpropagation in action with loss.backward()
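Steps 2-5 above might look roughly like the following sketch (assuming a working PyTorch install; your discussion notebook will differ in details):

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # reproducible weights

# Toy task: learn XOR, which a single linear layer cannot solve.
X = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

# A small feed-forward network: weighted sums + a nonlinearity.
model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1))
loss_fn = nn.MSELoss()
opt = torch.optim.SGD(model.parameters(), lr=0.5)

initial_loss = loss_fn(model(X), y).item()
for _ in range(2000):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()  # backpropagation: gradients via the chain rule
    opt.step()       # gradient descent: update weights downhill

final_loss = loss_fn(model(X), y).item()
print(initial_loss, final_loss)  # loss should drop substantially
```

Try swapping `nn.ReLU()` for `nn.Tanh()`, adding layers, or changing `lr` to see how each choice affects the loss curve.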

Week 3 Reflection Prompts

Write 300-500 words reflecting on this week's content, or the area in general. Some prompts to consider:

  • If you are new to deep learning, what clicked for you, and what questions do you still have? If you've studied the subject before, did you learn something or gain a new perspective?
  • After learning about tokenization, were you surprised by how LLMs "see" text? What implications does this have?
  • What do you think about the tokenization fairness discussion? Should companies address language efficiency differences? How?
  • What connections do you see between tokenization choices and model capabilities?
  • We discussed the environmental and financial costs of training large models. Who should bear these costs? Should there be regulations?

Remember to write in your own voice, without AI assistance. These reflections are graded on completion only and help me understand what's working for you.

Lab 2: Neural Networks and/or Tokenization Exploration

Due: Friday, Feb 6 by 11:59pm

Choose your focus (or do both, or something else!):

Option A: Neural Network Exploration

  • Build a simple neural network in PyTorch from scratch
  • Train it on a task (XOR, MNIST digits, simple classification)
  • Experiment: What happens as you add layers? Change activation functions? Adjust learning rate?
  • Visualize the loss curve - can you see gradient descent working?

Option B: Tokenization Exploration

  • Experiment with tokenizers from different providers (e.g., OpenAI's via tiktoken, Anthropic's Claude tokenizer)
  • Compare token counts: code vs prose, English vs other languages, emojis
  • Investigate the "strawberry" problem - why can't LLMs count letters?
  • Explore fairness: same content in different languages, how do token counts differ?
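A starting point for the "strawberry" investigation: once text is tokenized, the model receives opaque token IDs, not characters, so letter counts aren't directly visible. The split and IDs below are made up for illustration, not a real tokenizer's output.

```python
# Hypothetical subword split of "strawberry"; real tokenizers vary.
tokens = ["str", "aw", "berry"]

# A human counting characters sees every 'r':
char_count = "strawberry".count("r")  # 3

# A model sees token IDs; the letters inside each token are hidden.
vocab = {"str": 496, "aw": 675, "berry": 19772}  # made-up IDs
ids = [vocab[t] for t in tokens]

print(char_count, ids)
```

Compare this with what a real tokenizer playground shows for the same word - the boundaries rarely line up with where the letters you want to count actually sit.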

Option C: Connect the Two

  • Tokenize some text, convert to simple numerical representations
  • Feed through a neural network for a simple task
  • See the full pipeline: text, tokens, numbers, neural network, predictions
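For Option C, the full pipeline can be sketched end to end in a few lines. This toy version uses a naive whitespace tokenizer and untrained random embeddings (all names and numbers here are illustrative, not from a real model) just to show how text becomes numbers becomes a prediction.

```python
import random

random.seed(0)

# 1. Text -> tokens (naive whitespace split; real systems use subwords)
def tokenize(text):
    return text.lower().split()

# 2. Tokens -> numbers (integer IDs from a growing vocabulary)
vocab = {}
def token_ids(tokens):
    return [vocab.setdefault(t, len(vocab)) for t in tokens]

# 3. Numbers -> vectors -> prediction (random embedding per ID,
#    averaged, then a weighted sum; untrained, shapes only)
DIM = 4
embeddings = {}
def embed(i):
    if i not in embeddings:
        embeddings[i] = [random.uniform(-1, 1) for _ in range(DIM)]
    return embeddings[i]

def predict(text):
    ids = token_ids(tokenize(text))
    vecs = [embed(i) for i in ids]
    avg = [sum(v[d] for v in vecs) / len(vecs) for d in range(DIM)]
    weights = [0.5] * DIM  # stand-in for learned weights
    return sum(w * a for w, a in zip(weights, avg))

score = predict("the quick brown fox")
print(len(vocab), score)
```

Swapping in a real tokenizer and a trained PyTorch network turns this skeleton into the actual pipeline the lab asks about.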

Resources for further learning

On neural networks

On tokenization

Tutorials