WEEK 2: AI-Assisted Development & NLP Intro

This week we have just one lecture due to the snow day cancellation on Monday. We'll focus on how to effectively use AI tools for coding, then introduce the foundations of classical NLP.

This week's checklist (due Friday 1/30)

  • (Note: Monday 1/26 class is cancelled due to weather)
  • Attend Discussion Section (Tue, Jan 27): Getting started with Google Colab, GitHub Classroom, and using Python for classical NLP
  • Attend Lecture 2 (Wed, Jan 28): AI-assisted development + Classical NLP
  • Complete Week 2 Reflection, pushed to GitHub
  • Complete Lab 1, pushed to GitHub

This week's learning objectives

After Lecture 2 (Wed 1/28) students will be able to...

AI-Assisted Development:

  • Identify appropriate AI coding tools for different development tasks (brainstorming, writing, debugging, understanding)
  • Distinguish between AI coding interfaces (chat, edit mode, agentic) and when to use each
  • Apply best practices for AI-assisted coding (verification, security awareness, understanding before shipping)
  • Recognize common AI coding failures and when to be skeptical

Classical NLP:

  • Explain the classical NLP pipeline: text to numbers to predictions
  • Represent text documents using bag-of-words vectors
  • Identify common preprocessing steps (lowercasing, stop words, stemming, etc.)
  • Implement n-gram models for simple text generation
  • Recognize the limitations of counting-based approaches (no context, no word meaning)
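
To make the "text to numbers" step concrete, here is a minimal sketch of a bag-of-words representation in plain Python. The stop-word list, the tokenizer regex, and the function names are illustrative assumptions, not the course's reference implementation.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real lists are much longer).
STOP_WORDS = {"the", "a", "an", "is", "and", "of", "on"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, and drop stop words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return [t for t in tokens if t not in STOP_WORDS]

def build_vocabulary(docs):
    """Map each distinct token to a column index, in sorted order."""
    vocab = sorted({tok for doc in docs for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(vocab)}

def bag_of_words(doc, vocab):
    """Return a count vector for doc; words outside vocab are ignored."""
    counts = Counter(tokenize(doc))
    return [counts.get(tok, 0) for tok in vocab]

docs = ["The cat sat on the mat.", "The dog chased the cat!"]
vocab = build_vocabulary(docs)   # columns: cat, chased, dog, mat, sat
vectors = [bag_of_words(d, vocab) for d in docs]
print(vectors)                   # [[1, 0, 0, 1, 1], [1, 1, 1, 0, 0]]

# Word order is lost: both sentences map to the same count vector.
order_vocab = build_vocabulary(["dog bites man"])
same_vector = (bag_of_words("dog bites man", order_vocab)
               == bag_of_words("man bites dog", order_vocab))
print(same_vector)               # True
```

The last check illustrates the "no context" limitation: counting alone cannot tell "dog bites man" from "man bites dog".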

Discussion Section (Tue 1/27): Getting Started and Classical NLP

Note: This week's discussion happens before the lecture, so we'll use it as hands-on exploration rather than reinforcement.

Please bring your laptop to discussion! You will be coding during the class.

What you'll do:

  1. Learn about Google Colab and walk through setup
  2. Briefly review Git and GitHub, and troubleshoot any issues that came up with Lab 0
  3. Start building on a template repo, using a bag-of-words and TF-IDF approach to solve a text classification problem
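
As a preview of the TF-IDF idea, the sketch below weights word counts by a smoothed inverse document frequency and classifies a new sentence by its most similar training example. Everything here is an assumption for illustration: the toy reviews, the function names, and the 1-nearest-neighbor classifier are hypothetical and may differ from what the template repo uses.

```python
import math
from collections import Counter

def tokenize(text):
    return text.lower().split()

def fit_idf(train_docs):
    """Smoothed inverse document frequency: rarer words get larger weights."""
    n = len(train_docs)
    df = Counter(tok for doc in train_docs for tok in set(tokenize(doc)))
    return {tok: math.log((1 + n) / (1 + count)) + 1 for tok, count in df.items()}

def tfidf(doc, idf):
    """Sparse TF-IDF vector as a dict; words unseen in training are dropped."""
    counts = Counter(tok for tok in tokenize(doc) if tok in idf)
    return {tok: c * idf[tok] for tok, c in counts.items()}

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(tok, 0.0) for tok, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def predict(doc, train_vectors, train_labels, idf):
    """Return the label of the most similar training document."""
    query = tfidf(doc, idf)
    scores = [cosine(query, vec) for vec in train_vectors]
    return train_labels[scores.index(max(scores))]

# Toy training data (made up for this example).
train_docs = [
    "great movie loved the acting",
    "wonderful film great story",
    "terrible movie hated the plot",
    "boring film awful acting",
]
train_labels = ["pos", "pos", "neg", "neg"]

idf = fit_idf(train_docs)
train_vectors = [tfidf(d, idf) for d in train_docs]
prediction = predict("loved the wonderful story", train_vectors, train_labels, idf)
print(prediction)
```

A word such as "the", which appears in several training documents, receives a lower weight than a word such as "loved", which appears in only one. That down-weighting of common words is the main difference from raw bag-of-words counts.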

Week 2 Reflection Prompts

Write 300-500 words reflecting on this week's content, or on the field in general. Some prompts to consider:

  • What has your experience been using AI tools for coding so far? What works well? What doesn't?
  • After learning about bag-of-words and n-grams, what surprised you about these simple approaches? What can they do well?
  • How do you think about the tradeoff between using AI tools to move fast vs. understanding what the code does?
  • What questions do you have about AI-assisted development and classical NLP that we didn't cover?

Remember to write in your own voice, without AI assistance. These reflections are graded on completion only and help me understand what's working for you.

Lab 1: Text Processing Basics

Due: Friday, Jan 30 by 11:59pm

Suggested explorations

  • Build on the bag-of-words and TF-IDF work you began during discussion. What can you do to make the model better? Explore the impact of vocabulary size, data cleaning decisions, training set size, or the type of classifier used.
  • Experiment with n-gram text generation. Try 3-grams, 4-grams, and beyond. Is there a relationship between the size of the input dataset and the ideal n-gram length? Can you formulate a way to vary the n-gram length (sometimes 1-grams, sometimes 2-grams) depending on the word or word pair? What kinds of pairs are important to preserve?
  • Find an interesting dataset to try these techniques on. Can you predict Amazon product star ratings from the review text? Can you generate jokes, or poetry with a certain structure, using n-grams and a little cleverness?
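
As a starting point for the n-gram exploration, here is a minimal sketch of a generator that uses the longest context it has seen and falls back to shorter ones otherwise (a simple backoff scheme). This is only one possible way to vary the n-gram length; the toy corpus and the names `build_model`, `next_token`, and `generate` are assumptions made for the example.

```python
import random
from collections import defaultdict

def build_model(tokens, max_n):
    """Map each context of 0 to max_n - 1 tokens to the tokens that followed it."""
    model = defaultdict(list)
    for i, tok in enumerate(tokens):
        for k in range(max_n):
            if i - k >= 0:
                model[tuple(tokens[i - k:i])].append(tok)
    return model

def next_token(model, history, max_n, rng):
    """Sample from the longest context seen in training, backing off if needed."""
    for k in range(max_n - 1, -1, -1):
        context = tuple(history[-k:]) if k else ()
        if context in model:
            return rng.choice(model[context])
    raise ValueError("model is empty")

def generate(model, seed, length, max_n, rng):
    """Extend seed one token at a time until it has `length` tokens."""
    out = list(seed)
    while len(out) < length:
        out.append(next_token(model, out, max_n, rng))
    return out

# Toy corpus (made up for this example).
corpus = "the cat sat on the mat and the dog sat on the rug".split()
model = build_model(corpus, max_n=3)

rng = random.Random(0)  # fixed seed so runs are repeatable
text = generate(model, ["the"], length=8, max_n=3, rng=rng)
print(" ".join(text))
```

With a corpus this small, long contexts usually have only one continuation, so the output tends to copy the training text. Comparing `max_n=2` against `max_n=4` on a larger dataset is one way to probe the relationship between dataset size and n-gram length.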

Resources for further learning

On AI coding tools

Videos

Tutorials