WEEK 8: The LLM Landscape and Fine-Tuning Strategies

Welcome back from spring break! This week we zoom out to survey the model landscape and then zoom back in to ask: once you have a model, how do you adapt it? Monday covers the ecosystem of available LLMs: how to read model cards, compare open vs. closed models, and pick the right tool for a task. Wednesday gets practical with fine-tuning: the adaptation spectrum from simple prompting all the way to full fine-tuning, with a focus on parameter-efficient methods like LoRA that make fine-tuning accessible.

This week's checklist

  • Attend your oral exam time, if applicable
  • Attend Lecture 11 (Mon, Mar 16): The LLM Landscape
  • Attend discussion section (Tue, Mar 17): Loading and fine-tuning open models
  • Attend Lecture 12 (Wed, Mar 18): Fine-Tuning Strategies
  • Submit Week 8 Lab (due Sun, Mar 22 by 11:59pm)

This week's learning objectives

After Lecture 11 (Mon Mar 16) students will be able to...

  • Navigate the major model families: GPT series, Claude, Gemini, Llama, Mistral, Falcon
  • Compare proprietary and open-source models: cost, capability, customization, privacy
  • Read and interpret model cards: what information should a model card provide, and what's missing?
  • Make informed model selection decisions for specific use cases
  • Explain the foundation model paradigm: pre-train once, adapt for many tasks

After Lecture 12 (Wed Mar 18) students will be able to...

  • Navigate the adaptation spectrum: API calls, prompting, PEFT, full fine-tuning, pre-training from scratch
  • Explain why full fine-tuning can be expensive and impractical at scale
  • Describe LoRA (Low-Rank Adaptation): freeze the base model, train small adapter matrices
  • Identify when to use LoRA vs. full fine-tuning vs. just prompting
  • Explain catastrophic forgetting and how training choices can mitigate it
  • Recognize that fine-tuning can degrade safety training, and why that matters
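The core LoRA idea from the objectives above can be sketched in a few lines. This is a toy illustration with made-up dimensions, not a real implementation: the pretrained weight W is frozen, and only two small matrices A and B (whose product forms a low-rank update to W) would be trained.

```python
import numpy as np

# Toy LoRA sketch: freeze the pretrained weight, train only a
# low-rank update B @ A. Dimensions here are illustrative.
d_in, d_out, rank = 512, 512, 8

rng = np.random.default_rng(0)
W = rng.standard_normal((d_in, d_out))         # frozen pretrained weight
A = rng.standard_normal((rank, d_out)) * 0.01  # trainable, rank x d_out
B = np.zeros((d_in, rank))                     # trainable, d_in x rank (zero init)

def lora_forward(x):
    # Adapted layer: original output plus the low-rank correction.
    # Because B starts at zero, the adapter changes nothing at init.
    return x @ W + x @ B @ A

full_params = W.size           # 262,144 params to train with full fine-tuning
lora_params = A.size + B.size  # 8,192 params to train with rank-8 LoRA
print(full_params, lora_params, lora_params / full_params)
```

With rank 8, the trainable parameter count drops to about 3% of the full matrix, which is why LoRA makes fine-tuning feasible on modest hardware.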

Discussion Section (Tue Mar 17): Loading and Fine-Tuning Open Models

Hands-on practice with open-source model weights in Python.

Activities:

  1. Load a model from HuggingFace: Use transformers to download and run a small open model (e.g., Llama 3.2 1B or Qwen2.5-1.5B). Run a few generations and inspect the output.
  2. Fine-tune on a small dataset: Use the transformers Trainer or a PEFT/LoRA setup to fine-tune on a toy dataset. Observe how loss changes and compare outputs before and after.
  3. Discuss tradeoffs: What did you need to get this running? What would break at larger scale? What would you do differently for a real project?
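The before/after comparison in activity 2 can be previewed without any libraries. The sketch below is a stand-in with invented toy data, not the transformers/PEFT setup you will use in section: it "fine-tunes" a tiny linear model by gradient descent and records how the loss falls.

```python
import numpy as np

# Library-free stand-in for activity 2: fit a toy dataset and
# watch the training loss drop. All data here is synthetic.
rng = np.random.default_rng(42)
X = rng.standard_normal((64, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_w + 0.1 * rng.standard_normal(64)

w = np.zeros(4)   # starting weights (stands in for "before fine-tuning")
lr = 0.05
losses = []
for step in range(200):
    pred = X @ w
    losses.append(np.mean((pred - y) ** 2))   # mean squared error
    grad = 2 * X.T @ (pred - y) / len(y)
    w -= lr * grad                            # plain gradient descent

print(f"loss before: {losses[0]:.3f}, after: {losses[-1]:.3f}")
```

The same habit carries over to the real exercise: log the loss at step 0 and at the end, and compare generations from the checkpoint before and after training.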

Week 8 Lab: Exploring the LLM Landscape

Due: Sunday, March 22 by 11:59pm

What you'll do:

  • Choose a task (e.g., summarization, code generation, question answering)
  • Run the same prompts through 2-3 different models (mix open and proprietary if possible)
  • Document the differences: quality, style, refusals, speed, cost per token
  • Read and evaluate one model card critically: what's documented well? What's missing?
  • Reflection: Based on your experiments, which model would you use for a real project, and why?
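For the cost-per-token column of your comparison, a small helper keeps the arithmetic honest. The model names and prices below are hypothetical placeholders; look up current pricing for the models you actually test.

```python
# Hypothetical (input_usd, output_usd) prices per million tokens.
# Replace with real pricing for the models in your comparison.
PRICE_PER_1M_TOKENS = {
    "proprietary-large": (3.00, 15.00),
    "proprietary-small": (0.15, 0.60),
    "open-model-api":    (0.10, 0.10),
}

def task_cost(model, input_tokens, output_tokens):
    """Estimated USD cost of one request for a given model."""
    in_price, out_price = PRICE_PER_1M_TOKENS[model]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000

# Example: a summarization request with ~2,000 input and ~300 output tokens.
for model in PRICE_PER_1M_TOKENS:
    print(f"{model}: ${task_cost(model, 2000, 300):.5f} per request")
```

Multiplying the per-request cost by your expected request volume is usually the fastest way to see when a cheaper or self-hosted model pays off.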

Deliverable: Push your notebook to GitHub (fully merged) and submit your repo link on Gradescope.

Note: Use the free tiers of APIs (OpenAI, Anthropic, together.ai, or HuggingFace Inference API) and small open models to keep costs low.

Resources for further learning

LLM landscape

Fine-tuning and PEFT

Staying current

Frameworks