Lecture 2 - AI-Assisted Development & Classical NLP

Welcome back!

Last time: We explored what LLMs are, their history, and how we'll work together this term

Today:

  • AI-assisted development
  • A bit of classical NLP (BoW and ngrams)

Highlights from the syllabus activity and other logistics

  • The "cite your friends" question

Concerns:

  • Lack of deep learning background
  • Strict late policy
  • Exams (percentage, no notes)
  • Grading fairly given tool access, collaborators
  • Project open-endedness (choosing one, grading given that)
  • Lots of deliverables
  • "Ethical concern"?

You liked:

  • Project-based structure
  • Clear expectations
  • No blackboard?
  • AI use allowed
  • No final
  • Coffee chats!

Questions

  • Time commitment
  • iPads / how to take notes
  • How labs and portfolio pieces work
  • Suggestions for books / other resources
  • Forming teams

Both liked and disliked:

  • No laptop policy
  • Oral exam redo option

Logistics:

  • Swapping L3 and L4
  • Renaming / numbering labs and reflections (see schedule)

How to Report a Problem (Life Skill!)

When you message us (or a future coworker/manager) about a technical issue, include:

1. What you did - Be specific!

  • What tool/command/interface?
  • What did you click or type?
  • Any other context (network connections, previous actions)

2. What you expected - What should have happened?

3. What actually happened - Error messages, screenshots, exact text

4. What you've tried yourself - Steps you've taken to debug that have failed

Bad: "Torch isn't working for me"

Good: "I ran python train.py in VS Code's terminal on my Mac. I expected it to start training, but instead I got:

Traceback (most recent call last):
  File "train.py", line 1, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'

I installed PyTorch yesterday using pip install torch. I'm using Python 3.11 and I'm not using a virtual environment. When I run which python I get /usr/bin/python3. I tried running pip install torch again and it says 'Requirement already satisfied.' I also tried pip3 install torch with the same result."

Bad: "I can't push to GitHub"

Good: "I clicked 'Push' in GitHub Desktop last night. I expected my commits to appear on github.com, but instead I got:

Updates were rejected because the remote contains work that you do not have locally.

I'm the only one working on this repo and I haven't made changes from another computer. When I run git status I see Your branch and 'origin/main' have diverged, and have 1 and 1 different commits each. I'm not sure how the remote got a different commit since I haven't pushed from anywhere else."

Ice-breaker

Question: What's one thing you used an AI tool for in the last week?

Share with a neighbor, then we'll hear a few examples.

Part 1: AI-Assisted Development

How AI Can Help You Code

AI tools can assist at many different stages of development:

Brainstorming and planning

  • "What's a good architecture for a web scraper?"
  • "What libraries should I use for text processing in Python?"

Writing code

  • Autocomplete, generating functions, boilerplate

Debugging and fixing errors

  • "Why am I getting this error?" with the stack trace

Understanding unfamiliar code

  • "Explain what this function does" when joining a new project

Writing tests and documentation

  • "Write unit tests for this function"
  • "Add docstrings to these methods"

The Tools Landscape

There are two things to understand: the interface (how you interact) and the model (the AI doing the work).

Interfaces / IDEs:

  • Cursor - AI-native IDE (fork of VS Code), $20/month or free tier
  • VS Code + Extensions - Claude extension, GitHub Copilot extension
  • Chat interfaces - ChatGPT, Claude.ai, Gemini

Underlying Models:

  • Anthropic's Claude 4.5 - Opus, Sonnet, Haiku
  • OpenAI's GPT-5.2 (Thinking/Pro/Instant/Codex)
  • Google's Gemini 3 (Pro/Flash)
  • xAI's Grok 4 (Reasoning/Non-reasoning/Code/Mini)
  • Open source: Llama, Mistral, DeepSeek

NOTE that the interface and model are separable!

Free vs Paid Options

Free or free for students:

  • Claude in VS Code - The smaller Claude models (Haiku) work without an account in agent mode
  • GitHub Copilot - Free for students with GitHub Education pack
  • ChatGPT - Free tier available
  • Claude.ai - Free tier with usage limits
  • Google Colab AI - Free tier available
  • Cursor - Free tier with limited requests

Paid options:

  • Claude Pro ($20/month) - Access to larger models, more usage
  • ChatGPT Plus ($20/month) - Latest model, plugins, more features
  • Cursor Pro ($20/month) - More AI requests, better models

Modes of AI-Assisted Coding

Modern AI coding tools have different modes for different tasks:

Chat / Ask mode

  • You ask questions, get answers
  • Good for: understanding concepts, explaining errors, brainstorming

Edit mode

  • AI modifies specific code you highlight
  • Good for: refactoring, fixing bugs in specific places

Agent / Composer mode

  • AI autonomously makes changes across multiple files
  • Good for: larger features, multi-file refactors
  • More powerful but needs more oversight!

Pro tip: Help the AI help you

  • Most tools support project-level instructions (.cursorrules, CLAUDE.md, etc.)
  • Use these to specify coding style, conventions, preferred libraries
  • Point the agent to your README or docs: "Read README.md first to understand the project structure"
  • The more context you provide upfront, the less you'll need to correct later

A Workflow for AI-Assisted Coding

When working with AI on non-trivial tasks:

Step 1: Propose

  • Present your goal with context
  • Ask AI to suggest approaches and raise concerns
  • Don't start coding yet!

Step 2: Refine

  • Answer questions, discuss edge cases
  • Clarify ambiguities before implementation
  • Don't start coding yet!

Step 3: Execute

  • Define clear success criteria ("all tests pass", "API returns 200")
  • Give permission to proceed

Step 4: Supervise

  • Make sure the output is as expected
  • Understand the code generated - if you don't, ask the AI to explain!

"Treat the AI like a slightly dopey intern": "Write a function that..." is okay. "Write a function that does X, without using external dependencies, returning a dict with keys a, b, c" is better. Vague prompts produce vague results.

The Cup of Tea Test

Can you define success criteria clearly enough that you could walk away while the AI iterates?

Good success criteria:

  • "All tests pass"
  • "API returns 200 with valid JSON"
  • "Script runs without errors and produces output.csv"

Vague criteria (harder for AI to iterate on):

  • "Make it work"
  • "Clean this up"
  • "Fix the bug"

Write tests first, then tell the AI "make these tests pass without changing them."
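As a concrete (hypothetical) example, suppose you want a slugify helper. You'd write the tests before any implementation exists; the function body below is just a stand-in so the sketch runs, since in the real workflow the AI writes that part:

```python
# test_slugify.py - written BEFORE any implementation exists.
# Prompt: "Implement slugify() in slugify.py so these tests pass,
# without modifying the tests."

def slugify(text):
    # Stand-in implementation so this sketch runs; in the real
    # workflow the AI writes this part against the tests below.
    return "-".join(text.lower().split())

def test_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

def test_collapses_repeated_spaces():
    assert slugify("a   b") == "a-b"

test_lowercases_and_hyphenates()
test_collapses_repeated_spaces()
print("all tests pass")  # the success criterion
```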

Why Git Matters Even More Now

When AI can make sweeping changes to your code, version control becomes critical.

Git is your safety net:

  • You can always roll back if AI breaks something
  • You can see exactly what changed
  • You can experiment fearlessly

Good habits:

  • Commit before asking AI to make big changes
  • Review diffs carefully before committing AI-generated code
  • Use branches for experimental AI-assisted features

The undo button for AI mistakes = git restore, git checkout, or git revert

When to Be Skeptical

AI coding tools are powerful, but they have blind spots.

Be extra careful with:

  • Security-sensitive code (authentication, encryption, input validation)
  • Database operations (SQL injection is common in AI-generated code)
  • API keys and credentials (AI sometimes hardcodes these!)
  • Dependencies (AI can "hallucinate" packages that don't exist)
  • Anything you don't understand (if you can't explain it, you can't debug it)

What Does the Research Say?

A 2025 study of open-source projects using AI coding assistants found:

Initial velocity gains:

  • 281% increase in lines of code added in the first month
  • But only 28.6% sustained increase after two months

Quality concerns:

  • 30% increase in static analysis warnings
  • 41% increase in code complexity
  • Quality declines persisted even after velocity gains faded

Discussion: What might explain these patterns? What does "more code" actually mean for a project? How might this affect how teams should adopt AI tools?

Red Flags During AI Sessions

Stop and reassess if you notice:

  • Very long conversations - AI loses context over extended chats
  • Unexplained deletion of tests or code - AI may "simplify" things you need
  • AI forgetting your original goals - Context drift is real
  • Circular problem-solving - Same approaches failing repeatedly

Recovery tactics:

  • Revert and try with adjusted prompts
  • Ask AI: "What's going wrong here? What are you trying to accomplish?"
  • Start a fresh conversation with a summary of what you need
  • Use git commits like video game save points - checkpoint frequently!

Remember: Studies show AI-generated code tends to be more complex and harder to maintain. If the AI's solution feels convoluted, it probably is. Simpler is usually better.

Real Failures: The Tea App Breach (July 2025)

A women's dating advice app called Tea announced they had been "hacked."

72,000 images were exposed, including 13,000 government IDs from user verification.

What actually happened?

Nobody hacked them. The Firebase storage was left completely open with default settings. The AI-generated code didn't include any authorization policies.

The developers were "vibe-coding" - trusting AI to handle implementation without understanding security fundamentals.

More Cautionary Tales

The Replit Database Deletion: An AI agent was told to help develop a project. It decided the database "needed cleanup" and deleted it - violating a direct instruction prohibiting modifications.

Hallucinated Packages: AI sometimes invents package names that don't exist. Attackers have registered these fake package names with malicious code. If you blindly pip install what AI suggests...

The Statistics: A 2025 study found that 45% of AI-generated code contains security flaws. When given a choice between secure and insecure approaches, LLMs choose the insecure path nearly half the time.

The lesson: AI is a powerful assistant, not a replacement for understanding what your code does.

Activity: Build Something Fun with AI (10 min)

Pair up and use an AI tool to build something small and interactive in Python.

Process:

  1. Open a new notebook or Python script and an AI tool
  2. Decide what you want to build
  3. Prompt the AI and iterate
  4. We'll discuss - What went well? What didn't?

Ideas:

  • A magic 8-ball that answers questions
  • A text-based choose-your-own-adventure
  • A fortune cookie generator
  • A simple game (trivia, rock-paper-scissors, mad libs)
  • A password generator
  • A maze generator / solver

Debrief: What did you notice?






The Bottom Line on AI Dev Tools

Use them! They're incredibly powerful and will be part of your professional toolkit.

Stay critical. Verify everything, especially security-sensitive code.

Focus on understanding. If you can't explain the code, you don't own it.

Git is your friend. Commit often, review diffs, don't be afraid to revert.

You are responsible for the code you submit, regardless of who (or what) wrote it.

For blog posts with frameworks and prompt examples, see the "Week 2 Guide"

Part 2: A Taste of Classical NLP

The Landscape of NLP Tasks

NLP is a broad field. Here are some classic problems:

Classification - Is this email spam? Is this review positive or negative?

Sequence labeling - What part of speech is each word? Which words are names/places? (Historically solved with Hidden Markov Models)

Sequence-to-sequence - Translate English to French. Summarize this article.

Generation - Write the next word, sentence, or paragraph.

Today we'll focus on classification and generation - the two ends of the spectrum.

The Simplest Idea: Just Count Words

Bag of Words (BoW): Represent a document by which words appear and how often.

Document: "I love NLP. I love machine learning."

Vocabulary: [I, love, NLP, machine, learning]
Vector:     [2,   2,    1,      1,       1]

That's it. Count the words, ignore the order.
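This counting can be sketched in a few lines of plain Python (the tokenizer below just strips periods and splits on whitespace, which is deliberately naive):

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word appears in the text."""
    # Naive tokenizer: drop periods, split on whitespace
    words = text.replace(".", "").split()
    counts = Counter(words)
    return [counts[word] for word in vocabulary]

vocab = ["I", "love", "NLP", "machine", "learning"]
vector = bag_of_words("I love NLP. I love machine learning.", vocab)
print(vector)  # [2, 2, 1, 1, 1]
```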

Why "Bag" of Words?

Because we throw the words in a bag and shake it up. Order is lost!

"Dog bites man" -> {dog: 1, bites: 1, man: 1}
"Man bites dog" -> {dog: 1, bites: 1, man: 1}

Same representation. Very different meanings.

This is a huge limitation. But BoW is fast, simple, and works surprisingly well for some tasks.

What Can You Do With BoW?

Once you have word counts, you have numbers. Now you can use any classifier!

Naive Bayes - The classic choice for text. Fast, simple, works surprisingly well for spam detection.

Logistic regression, SVM, random forests... - All work with BoW features.

Remember: BoW is NOT a model; it's just feature engineering. You're turning text into a table of numbers. After that, you can use whatever machine learning method you like.

Before You Count: Data Cleaning

Raw text is messy. Before building a BoW representation, what might we need to do?

Common Preprocessing Steps

Lowercasing - "The" and "the" should be the same word

Punctuation removal - "learning." and "learning" are the same

Stop word removal - "the", "a", "is" don't tell us much

Stemming - "running", "runs", "ran" all become "run"

Lemmatization - Like stemming but smarter ("better" becomes "good")

Which ones matter depends on your task!
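A sketch of the first three steps in plain Python; the stop-word list here is a tiny made-up one, and stemming or lemmatization would typically come from a library such as NLTK or spaCy:

```python
import string

STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of"}  # tiny illustrative list

def preprocess(text):
    """Lowercase, strip punctuation, and drop stop words (no stemming here)."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [w for w in text.split() if w not in STOP_WORDS]

print(preprocess("The cat is running to the store."))
# ['cat', 'running', 'store']
```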

TF-IDF to address word rarity

With raw word counts, common words dominate. "The" appears in almost every document but tells us nothing about the topic.

TF-IDF (Term Frequency–Inverse Document Frequency) is one way to address this:

TF-IDF(t, d) = TF(t, d) × log(N / DF(t))

Where:

  • TF(t, d) = how often term t appears in document d
  • DF(t) = how many documents contain term t
  • N = total number of documents

Words that appear frequently in one document but rarely across all documents get high scores.
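With the definitions of TF, DF, and N above, one common variant can be implemented directly (the toy documents below are made up for illustration):

```python
import math

def tf_idf(term, doc, docs):
    """One common TF-IDF variant: TF(t, d) * log(N / DF(t))."""
    tf = doc.count(term)                     # how often t appears in d
    df = sum(1 for d in docs if term in d)   # how many documents contain t
    return tf * math.log(len(docs) / df)

docs = [["the", "cat", "sat"], ["the", "dog", "ran"], ["the", "cat", "ran"]]
print(tf_idf("cat", docs[0], docs))  # ~0.405: "cat" is in only 2 of 3 docs
print(tf_idf("the", docs[0], docs))  # 0.0: "the" is in every doc
```

Note that "the", which appears in every document, scores exactly zero: log(N/N) = 0.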

BoW in Practice
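
In practice, scikit-learn's CountVectorizer plus a Naive Bayes classifier gives an end-to-end pipeline in a few lines. A minimal sketch (the tiny four-document spam/ham dataset here is made up for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["win free money now", "meeting at noon tomorrow",
         "free prize claim now", "lunch tomorrow at noon?"]
labels = ["spam", "ham", "spam", "ham"]

vectorizer = CountVectorizer()        # text -> matrix of word counts
X = vectorizer.fit_transform(texts)
clf = MultinomialNB().fit(X, labels)  # any classifier works on BoW features

print(clf.predict(vectorizer.transform(["free money prize"])))  # ['spam']
```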

From Counting to Generating: N-grams

Step 1: Count transitions

Go through your training text and count: "After word X, what word Y appeared?" (These are bigrams; we can also use trigrams or higher.)

Step 2: Convert to probabilities

If "love" appeared 10 times, and it was followed by "NLP" 3 times and "machine" 7 times:

  • P(NLP | love) = 3/10 = 30%
  • P(machine | love) = 7/10 = 70%

Step 3: Generate

Start with a word. Roll the dice based on probabilities. Repeat!

Let's Build One!

Training text: "I love NLP. I love machine learning."

Bigram counts:

  • After "I": "love" appears 2 times (100%)
  • After "love": "NLP" (1 time, 50%), "machine" (1 time, 50%)
  • After "machine": "learning" (1 time, 100%)

To generate: Start with "I", then pick the next word based on probabilities.
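The worked example above can be coded directly. Storing each word's followers in a list means random.choice automatically samples in proportion to the counts:

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """For each word, record every word that followed it (a bigram table)."""
    table = defaultdict(list)
    for sentence in text.split("."):
        words = sentence.split()
        for w1, w2 in zip(words, words[1:]):
            table[w1].append(w2)
    return table

def generate(table, word, length=4):
    """Start with a word, repeatedly sample the next word from the table."""
    out = [word]
    for _ in range(length):
        if word not in table:
            break  # no known followers (e.g. end of a sentence)
        word = random.choice(table[word])  # samples in proportion to counts
        out.append(word)
    return " ".join(out)

table = train_bigrams("I love NLP. I love machine learning.")
print(table["I"])     # ['love', 'love'] -> P(love | I) = 100%
print(table["love"])  # ['NLP', 'machine'] -> 50% each
print(generate(table, "I"))
```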

Demo: N-gram Text Generation

Let's see this in action with a Python demo.

What to watch for:

  • How do the probabilities come from the training text?
  • What kinds of sentences does it generate?
  • Do you recognize any fragments?

Activity: Talk to ELIZA (5 min)

ELIZA was created in 1966 - one of the first "chatbots." It was convincing enough that some users believed they were talking to a real therapist.

Try it: Go to njit.edu/~ronkowit/eliza.html or search "ELIZA chatbot online"

As you chat, think about:

  • What patterns do you notice in ELIZA's responses?
  • How do you think it works? (Hint: modern neural networks didn't exist in 1966!)
  • What tricks make it seem more intelligent than it is?

What N-grams Can't Do

"The trophy would not fit in the suitcase because it was too large."

What is "it"? The trophy? The suitcase?

N-grams struggle with:

  • Long-range dependencies (the Markov assumption is too limiting)
  • Generating novel combinations (only what we've seen)
  • Understanding meaning (no semantics, just statistics)

LLMs solve these problems. We'll see how later in the course.

Why Learn This Old Stuff?

  1. The limitations motivate the innovations - When we study attention and transformers, you'll see they directly solve the problems n-grams couldn't: long-range dependencies, semantic understanding, generalization beyond training data.

  2. Simplicity - sometimes simple methods are good enough (and faster!). Not every problem needs GPT-5.

  3. Building blocks - ideas like tokenization, probability distributions over sequences, and context windows carry directly into modern architectures.

  4. Debugging intuition - understanding why models fail helps you prompt better and catch errors.

  5. Interview questions - you'd be surprised how often these come up

What We Learned Today

AI-assisted development:

  • Different phases: brainstorming, coding, debugging, understanding
  • Free and paid tools available
  • Different modes: chat, edit, agent
  • Git as your safety net
  • When to be skeptical (security, credentials, hallucinated packages)

Classical NLP:

  • Bag of Words: count words, ignore order
  • N-grams and Markov chains: predict next word from recent history
  • Limitations that motivate modern methods

Looking Ahead

Lab 1 due Friday: Explore text classification and n-gram generation

Monday: Deep learning fundamentals

  • How neural networks learn
  • Backpropagation and gradient descent
  • If you're new - check out the Week 3 guide for resources to view before class

Wednesday: Tokenization

  • How do LLMs split text into pieces?
  • Subword tokenization (BPE)
  • Why tokenization affects what models can and can't do