Lecture 2 - AI-Assisted Development & Classical NLP

Welcome back!

Last time: We explored what LLMs are, their history, and how we'll work together this term

Today:

  • AI-assisted development
  • A bit of classical NLP (BoW and ngrams)

Highlights from the syllabus activity and other logistics

  • The "cite your friends" question

Concerns:

  • Lack of deep learning background
  • Strict late policy
  • Exams (percentage, no notes)
  • Grading fairly given tool access, collaborators
  • Project open-endedness (choosing one, grading given that)
  • Lots of deliverables
  • "Ethical concern"?

You liked:

  • Project-based structure
  • Clear expectations
  • No blackboard?
  • AI use allowed
  • No final
  • Coffee chats!

Questions

  • Time commitment
  • iPads / how to take notes
  • How labs and portfolio pieces work
  • Suggestions for books / other resources
  • Forming teams

Both liked and disliked:

  • No laptop policy
  • Oral exam redo option

Logistics:

  • Swapping L3 and L4
  • Renaming / numbering labs and reflections (see schedule)

How to Report a Problem (Life Skill!)

When you message us (or a future coworker/manager) about a technical issue, include:

1. What you did - Be specific!

  • What tool/command/interface?
  • What did you click or type?
  • Any other context (network connections, previous actions)

2. What you expected - What should have happened?

3. What actually happened - Error messages, screenshots, exact text

4. What you've tried yourself - Steps you've taken to debug that have failed

Bad: "Torch isn't working for me"

Good: "I ran python train.py in VS Code's terminal on my Mac. I expected it to start training, but instead I got:

Traceback (most recent call last):
  File "train.py", line 1, in <module>
    import torch
ModuleNotFoundError: No module named 'torch'

I installed PyTorch yesterday using pip install torch. I'm using Python 3.11 and I'm not using a virtual environment. When I run which python I get /usr/bin/python3. I tried running pip install torch again and it says 'Requirement already satisfied.' I also tried pip3 install torch with the same result."

Bad: "I can't push to GitHub"

Good: "I clicked 'Push' in GitHub Desktop last night. I expected my commits to appear on github.com, but instead I got:

Updates were rejected because the remote contains work that you do not have locally.

I'm the only one working on this repo and I haven't made changes from another computer. When I run git status I see Your branch and 'origin/main' have diverged, and have 1 and 1 different commits each. I'm not sure how the remote got a different commit since I haven't pushed from anywhere else."

Ice-breaker

Question: What's one thing you used an AI tool for in the last week?

Share with a neighbor, then we'll hear a few examples.

Part 1: AI-Assisted Development

How AI Can Help You Code

AI tools can assist at many different stages of development:

Brainstorming and planning

  • "What's a good architecture for a web scraper?"
  • "What libraries should I use for text processing in Python?"

Writing code

  • Autocomplete, generating functions, boilerplate

Debugging and fixing errors

  • "Why am I getting this error?" with the stack trace

Understanding unfamiliar code

  • "Explain what this function does" when joining a new project

Writing tests and documentation

  • "Write unit tests for this function"
  • "Add docstrings to these methods"

The Tools Landscape

There are two things to understand: the interface (how you interact) and the model (the AI doing the work).

Interfaces / IDEs:

  • Cursor - AI-native IDE (fork of VS Code), $20/month or free tier
  • VS Code + Extensions - Claude extension, GitHub Copilot extension
  • Chat interfaces - ChatGPT, Claude.ai, Gemini

Underlying Models:

  • Anthropic's Claude 4.5 - Opus, Sonnet, Haiku
  • OpenAI's GPT-5.2 (Thinking/Pro/Instant/Codex)
  • Google's Gemini 3 (Pro/Flash)
  • xAI's Grok 4 (Reasoning/Non-reasoning/Code/Mini)
  • Open source: Llama, Mistral, DeepSeek

NOTE that the interface and model are separable!

Free vs Paid Options

Free or free for students:

  • Claude in VS Code - The smaller Claude models (Haiku) work without an account in agent mode
  • GitHub Copilot - Free for students with GitHub Education pack
  • ChatGPT - Free tier available
  • Claude.ai - Free tier with usage limits
  • Google Colab AI - Free tier available
  • Cursor - Free tier with limited requests

Paid options:

  • Claude Pro ($20/month) - Access to larger models, more usage
  • ChatGPT Plus ($20/month) - Latest model, plugins, more features
  • Cursor Pro ($20/month) - More AI requests, better models

Modes of AI-Assisted Coding

Modern AI coding tools have different modes for different tasks:

Chat / Ask mode

  • You ask questions, get answers
  • Good for: understanding concepts, explaining errors, brainstorming

Edit mode

  • AI modifies specific code you highlight
  • Good for: refactoring, fixing bugs in specific places

Agent / Composer mode

  • AI autonomously makes changes across multiple files
  • Good for: larger features, multi-file refactors
  • More powerful but needs more oversight!

Pro tip: Help the AI help you

  • Most tools support project-level instructions (.cursorrules, CLAUDE.md, etc.)
  • Use these to specify coding style, conventions, preferred libraries
  • Point the agent to your README or docs: "Read README.md first to understand the project structure"
  • The more context you provide upfront, the less you'll need to correct later

A Workflow for AI-Assisted Coding

When working with AI on non-trivial tasks:

Step 1: Propose

  • Present your goal with context
  • Ask AI to suggest approaches and raise concerns
  • Don't start coding yet!

Step 2: Refine

  • Answer questions, discuss edge cases
  • Clarify ambiguities before implementation
  • Don't start coding yet!

Step 3: Execute

  • Define clear success criteria ("all tests pass", "API returns 200")
  • Give permission to proceed

Step 4: Supervise

  • Make sure the output is as expected
  • Understand the code generated - if you don't, ask the AI to explain!

"Treat the AI like a slightly dopey intern": "Write a function that..." is okay. "Write a function that does X, without using external dependencies, returning a dict with keys a, b, c" is better. Vague prompts produce vague results.

The Cup of Tea Test

Can you define success criteria clearly enough that you could walk away while the AI iterates?

Good success criteria:

  • "All tests pass"
  • "API returns 200 with valid JSON"
  • "Script runs without errors and produces output.csv"

Vague criteria (harder for AI to iterate on):

  • "Make it work"
  • "Clean this up"
  • "Fix the bug"

Write tests first, then tell the AI "make these tests pass without changing them."
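As a concrete (hypothetical) example, suppose you want a slugify helper. You'd write the tests before any implementation exists; the function body below is just a stand-in so the sketch runs, since in the real workflow the AI writes that part:

```python
# test_slugify.py - written BEFORE any implementation exists.
# Prompt: "Implement slugify() in slugify.py so these tests pass,
# without modifying the tests."

def slugify(text):
    # Stand-in implementation so this sketch runs; in the real
    # workflow the AI writes this part against the tests below.
    return "-".join(text.lower().split())

def test_lowercases_and_hyphenates():
    assert slugify("Hello World") == "hello-world"

def test_collapses_repeated_spaces():
    assert slugify("a   b") == "a-b"

test_lowercases_and_hyphenates()
test_collapses_repeated_spaces()
print("all tests pass")  # the success criterion
```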

Why Git Matters Even More Now

When AI can make sweeping changes to your code, version control becomes critical.

Git is your safety net:

  • You can always roll back if AI breaks something
  • You can see exactly what changed
  • You can experiment fearlessly

Good habits:

  • Commit before asking AI to make big changes
  • Review diffs carefully before committing AI-generated code
  • Use branches for experimental AI-assisted features

The undo button for AI mistakes = git restore, git checkout, or git revert

When to Be Skeptical

AI coding tools are powerful, but they have blind spots.

Be extra careful with:

  • Security-sensitive code (authentication, encryption, input validation)
  • Database operations (SQL injection is common in AI-generated code)
  • API keys and credentials (AI sometimes hardcodes these!)
  • Dependencies (AI can "hallucinate" packages that don't exist)
  • Anything you don't understand (if you can't explain it, you can't debug it)

What Does the Research Say?

A 2025 study of open-source projects using AI coding assistants found:

Initial velocity gains:

  • 281% increase in lines of code added in the first month
  • But only 28.6% sustained increase after two months

Quality concerns:

  • 30% increase in static analysis warnings
  • 41% increase in code complexity
  • Quality declines persisted even after velocity gains faded

Discussion: What might explain these patterns? What does "more code" actually mean for a project? How might this affect how teams should adopt AI tools?

Red Flags During AI Sessions

Stop and reassess if you notice:

  • Very long conversations - AI loses context over extended chats
  • Unexplained deletion of tests or code - AI may "simplify" things you need
  • AI forgetting your original goals - Context drift is real
  • Circular problem-solving - Same approaches failing repeatedly

Recovery tactics:

  • Revert and try with adjusted prompts
  • Ask AI: "What's going wrong here? What are you trying to accomplish?"
  • Start a fresh conversation with a summary of what you need
  • Use git commits like video game save points - checkpoint frequently!

Remember: Studies show AI-generated code tends to be more complex and harder to maintain. If the AI's solution feels convoluted, it probably is. Simpler is usually better.

Real Failures: The Tea App Breach (July 2025)

A women's dating advice app called Tea announced they had been "hacked."

72,000 images were exposed, including 13,000 government IDs from user verification.

What actually happened?

Nobody hacked them. The Firebase storage was left completely open with default settings. The AI-generated code didn't include any authorization policies.

The developers were "vibe-coding" - trusting AI to handle implementation without understanding security fundamentals.

More Cautionary Tales

The Replit Database Deletion: An AI agent was told to help develop a project. It decided the database "needed cleanup" and deleted it - violating a direct instruction prohibiting modifications.

Hallucinated Packages: AI sometimes invents package names that don't exist. Attackers have registered these fake package names with malicious code. If you blindly pip install what AI suggests...

The Statistics: A 2025 study found that 45% of AI-generated code contains security flaws. When given a choice between secure and insecure approaches, LLMs choose the insecure path nearly half the time.

The lesson: AI is a powerful assistant, not a replacement for understanding what your code does.

Activity: Build Something Fun with AI (10 min)

Pair up and use an AI tool to build something small and interactive in Python.

Process:

  1. Open a new notebook or Python script and an AI tool
  2. Decide what you want to build
  3. Prompt the AI and iterate
  4. We'll discuss - What went well? What didn't?

Ideas:

  • A magic 8-ball that answers questions
  • A text-based choose-your-own-adventure
  • A fortune cookie generator
  • A simple game (trivia, rock-paper-scissors, mad libs)
  • A password generator
  • A maze generator / solver

Debrief: What did you notice?






The Bottom Line on AI Dev Tools

Use them! They're incredibly powerful and will be part of your professional toolkit.

Stay critical. Verify everything, especially security-sensitive code.

Focus on understanding. If you can't explain the code, you don't own it.

Git is your friend. Commit often, review diffs, don't be afraid to revert.

You are responsible for the code you submit, regardless of who (or what) wrote it.

For blog posts with frameworks and prompt examples, see the "Week 2 Guide"

Part 2: A Taste of Classical NLP

The Landscape of NLP Tasks

NLP is a broad field. Here are some classic problems:

Classification - Is this email spam? Is this review positive or negative?

Sequence labeling - What part of speech is each word? Which words are names/places? (Historically solved with Hidden Markov Models)

Sequence-to-sequence - Translate English to French. Summarize this article.

Generation - Write the next word, sentence, or paragraph.

Today we'll focus on classification and generation - the two ends of the spectrum.

The Simplest Idea: Just Count Words

Bag of Words (BoW): Represent a document by which words appear and how often.

Document: "I love NLP. I love machine learning."

Vocabulary: [I, love, NLP, machine, learning]
Vector:     [2,   2,    1,      1,       1]

That's it. Count the words, ignore the order.
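This counting can be sketched in a few lines of plain Python (the tokenizer below just strips periods and splits on whitespace, which is deliberately naive):

```python
from collections import Counter

def bag_of_words(text, vocabulary):
    """Count how often each vocabulary word appears in the text."""
    # Naive tokenizer: drop periods, split on whitespace
    words = text.replace(".", "").split()
    counts = Counter(words)
    return [counts[word] for word in vocabulary]

vocab = ["I", "love", "NLP", "machine", "learning"]
vector = bag_of_words("I love NLP. I love machine learning.", vocab)
print(vector)  # [2, 2, 1, 1, 1]
```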

Why "Bag" of Words?

Because we throw the words in a bag and shake it up. Order is lost!

"Dog bites man" -> {dog: 1, bites: 1, man: 1}
"Man bites dog" -> {dog: 1, bites: 1, man: 1}

Same representation. Very different meanings.

This is a huge limitation. But BoW is fast, simple, and works surprisingly well for some tasks.

What Can You Do With BoW?

Once you have word counts, you have numbers. Now you can use any classifier!

Naive Bayes - The classic choice for text. Fast, simple, works surprisingly well for spam detection.

Logistic regression, SVM, random forests... - All work with BoW features.

Remember: BoW is NOT a model; it's just feature engineering. You're turning text into a table of numbers. After that, you can use whatever machine learning method you like.

Before You Count: Data Cleaning

Raw text is messy. Before building a BoW representation, what might we need to do?

Common Preprocessing Steps

Lowercasing - "The" and "the" should be the same word

Punctuation removal - "learning." and "learning" are the same

Stop word removal - "the", "a", "is" don't tell us much

Stemming - "running", "runs", "ran" all become "run"

Lemmatization - Like stemming but smarter ("better" becomes "good")

Which ones matter depends on your task!
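A sketch of the first three steps in plain Python; the stop-word list here is a tiny made-up one, and stemming or lemmatization would typically come from a library such as NLTK or spaCy:

```python
import string

STOP_WORDS = {"the", "a", "an", "is", "are", "to", "of"}  # tiny illustrative list

def preprocess(text):
    """Lowercase, strip punctuation, and drop stop words (no stemming here)."""
    text = text.lower()
    text = text.translate(str.maketrans("", "", string.punctuation))
    return [w for w in text.split() if w not in STOP_WORDS]

print(preprocess("The cat is running to the store."))
# ['cat', 'running', 'store']
```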

TF-IDF to address word rarity

With raw word counts, common words dominate. "The" appears in almost every document but tells us nothing about the topic.

TF-IDF (Term Frequency–Inverse Document Frequency) is one way to address this:

TF-IDF(t, d) = TF(t, d) × log(N / DF(t))

Where:

  • TF(t, d) = how often term t appears in document d
  • DF(t) = how many documents contain term t
  • N = total number of documents

Words that appear frequently in one document but rarely across all documents get high scores.
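With the definitions of TF, DF, and N above, one common variant can be implemented directly (the toy documents below are made up for illustration):

```python
import math

def tf_idf(term, doc, docs):
    """One common TF-IDF variant: TF(t, d) * log(N / DF(t))."""
    tf = doc.count(term)                     # how often t appears in d
    df = sum(1 for d in docs if term in d)   # how many documents contain t
    return tf * math.log(len(docs) / df)

docs = [["the", "cat", "sat"], ["the", "dog", "ran"], ["the", "cat", "ran"]]
print(tf_idf("cat", docs[0], docs))  # ~0.405: "cat" is in only 2 of 3 docs
print(tf_idf("the", docs[0], docs))  # 0.0: "the" is in every doc
```

Note that "the", which appears in every document, scores exactly zero: log(N/N) = 0.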

BoW in Practice
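
In practice, scikit-learn's CountVectorizer plus a Naive Bayes classifier gives an end-to-end pipeline in a few lines. A minimal sketch (the tiny four-document spam/ham dataset here is made up for illustration):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["win free money now", "meeting at noon tomorrow",
         "free prize claim now", "lunch tomorrow at noon?"]
labels = ["spam", "ham", "spam", "ham"]

vectorizer = CountVectorizer()        # text -> matrix of word counts
X = vectorizer.fit_transform(texts)
clf = MultinomialNB().fit(X, labels)  # any classifier works on BoW features

print(clf.predict(vectorizer.transform(["free money prize"])))  # ['spam']
```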

From Counting to Generating: N-grams

Step 1: Count transitions

Go through your training text and count: "After word X, what word Y appeared?" (These are bigrams; we can also use trigrams or higher.)

Step 2: Convert to probabilities

If "love" appeared 10 times, and it was followed by "NLP" 3 times and "machine" 7 times:

  • P(NLP | love) = 3/10 = 30%
  • P(machine | love) = 7/10 = 70%

Step 3: Generate

Start with a word. Roll the dice based on probabilities. Repeat!

Let's Build One!

Training text: "I love NLP. I love machine learning."

Bigram counts:

  • After "I": "love" appears 2 times (100%)
  • After "love": "NLP" (1 time, 50%), "machine" (1 time, 50%)
  • After "machine": "learning" (1 time, 100%)

To generate: Start with "I", then pick the next word based on probabilities.
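The worked example above can be coded directly. Storing each word's followers in a list means random.choice automatically samples in proportion to the counts:

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """For each word, record every word that followed it (a bigram table)."""
    table = defaultdict(list)
    for sentence in text.split("."):
        words = sentence.split()
        for w1, w2 in zip(words, words[1:]):
            table[w1].append(w2)
    return table

def generate(table, word, length=4):
    """Start with a word, repeatedly sample the next word from the table."""
    out = [word]
    for _ in range(length):
        if word not in table:
            break  # no known followers (e.g. end of a sentence)
        word = random.choice(table[word])  # samples in proportion to counts
        out.append(word)
    return " ".join(out)

table = train_bigrams("I love NLP. I love machine learning.")
print(table["I"])     # ['love', 'love'] -> P(love | I) = 100%
print(table["love"])  # ['NLP', 'machine'] -> 50% each
print(generate(table, "I"))
```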

Demo: N-gram Text Generation

Let's see this in action with a Python demo.

What to watch for:

  • How do the probabilities come from the training text?
  • What kinds of sentences does it generate?
  • Do you recognize any fragments?

Activity: Talk to ELIZA (5 min)

ELIZA was created in 1966 - one of the first "chatbots." It was convincing enough that some users believed they were talking to a real therapist.

Try it: Go to njit.edu/~ronkowit/eliza.html or search "ELIZA chatbot online"

As you chat, think about:

  • What patterns do you notice in ELIZA's responses?
  • How do you think it works? (Hint: modern neural networks didn't exist in 1966!)
  • What tricks make it seem more intelligent than it is?

What N-grams Can't Do

"The trophy would not fit in the suitcase because it was too large."

What is "it"? The trophy? The suitcase?

N-grams struggle with:

  • Long-range dependencies (the Markov assumption is too limiting)
  • Generating novel combinations (only what we've seen)
  • Understanding meaning (no semantics, just statistics)

LLMs solve these problems. We'll see how later in the course.

Why Learn This Old Stuff?

  1. The limitations motivate the innovations - When we study attention and transformers, you'll see they directly solve the problems n-grams couldn't: long-range dependencies, semantic understanding, generalization beyond training data.

  2. Simplicity - sometimes simple methods are good enough (and faster!). Not every problem needs GPT-5.

  3. Building blocks - ideas like tokenization, probability distributions over sequences, and context windows carry directly into modern architectures.

  4. Debugging intuition - understanding why models fail helps you prompt better and catch errors.

  5. Interview questions - you'd be surprised how often these come up

What We Learned Today

AI-assisted development:

  • Different phases: brainstorming, coding, debugging, understanding
  • Free and paid tools available
  • Different modes: chat, edit, agent
  • Git as your safety net
  • When to be skeptical (security, credentials, hallucinated packages)

Classical NLP:

  • Bag of Words: count words, ignore order
  • N-grams and Markov chains: predict next word from recent history
  • Limitations that motivate modern methods

Looking Ahead

Lab 1 due Friday: Explore text classification and n-gram generation

Monday: Deep learning fundamentals

  • How neural networks learn
  • Backpropagation and gradient descent
  • If you're new - check out the Week 3 guide for resources to view before class

Wednesday: Tokenization

  • How do LLMs split text into pieces?
  • Subword tokenization (BPE)
  • Why tokenization affects what models can and can't do