Check your inbox right now. Or rather, don’t. Because there’s a good chance you received 50 emails today and only saw 15 of them. Your spam filter handled the rest. It caught the phishing attempts, the fake invoices, the “congratulations you’ve won” messages. All without you writing a single rule about what spam looks like.
Nobody else wrote those rules either. No programmer sat down and coded “if email contains ‘free money,’ mark as spam.” The filter learned. It looked at billions of emails, spam and not-spam, and figured out the patterns on its own.
That word, “learned,” is doing a lot of heavy lifting. This module unpacks what it actually means.
In Module 1, we established that the AI tools you use daily aren’t search engines or databases. They’re pattern recognition systems trained on massive amounts of data. This module goes deeper into what “trained” actually means. That understanding is the foundation for everything that follows in this guide.
Rules vs Patterns — The Fundamental Shift
Imagine you’re building a spam filter the traditional way. You start simple: if an email contains “free money,” flag it. Works for a week. Then spammers write “fr3e m0ney.” Your rule is useless. So you add another rule. And another. Spammers adapt again. You’re in an arms race, and you’re losing, because humans can invent new ways to say “free money” faster than you can write rules to catch them.
This is the wall that traditional programming hits. For every situation, a human has to anticipate the scenario and write a specific rule. It works when the problem is well-defined (calculate tax on this invoice), but it breaks down the moment the problem involves variety, ambiguity, or adversaries.
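The arms race is easy to see in code. Here's a minimal sketch of a rule-based filter (the rules and example emails are invented for illustration): every rule is a human guess, and every spammer workaround forces yet another entry in the list.

```python
# Hand-written spam rules: a human anticipated each phrase in advance.
# The list only ever grows, and spammers only need one variant it misses.
SPAM_RULES = ["free money", "fr3e m0ney", "you've won"]

def is_spam_rules(email: str) -> bool:
    """Flag an email if it contains any known spam phrase."""
    text = email.lower()
    return any(rule in text for rule in SPAM_RULES)

print(is_spam_rules("Claim your FREE MONEY now"))      # True: caught
print(is_spam_rules("Claim your f r e e m o n e y"))   # False: trivially evaded
```

One extra space per letter and the rule is blind. That fragility, multiplied across every phrasing humans can invent, is the wall the next paragraphs describe machine learning climbing over.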
Machine learning takes a completely different approach. Instead of writing rules, you show the system millions of examples. Here are 10 million emails. These ones are spam. These ones aren’t. Figure out the pattern.
And it does. The system discovers signals no human would think to code: combinations of sender reputation, email formatting, link structure, timing, and language patterns that, taken together, separate spam from legitimate mail with pretty impressive accuracy. Google’s Gmail filter hits 99.9% using this approach, up from 99.5% before they added neural networks. That 0.4% gap doesn’t sound like much until you consider Gmail processes billions of emails daily.
The shift in one sentence:
Traditional programming: humans write rules, computers follow them. Machine learning: humans provide examples, computers discover the rules.
That distinction goes a long way toward explaining how modern AI works. Around 72% of companies are now building on it in some form, according to industry surveys.
Key Term: Machine Learning (ML) — A subset of AI where systems learn patterns from data instead of following hand-written rules. Rather than programming “if X then Y,” you show the system millions of examples and let it discover the patterns.
Three Types of Learning — Three Different Questions
Machine learning isn’t one technique. There are different flavours, and the easiest way to make sense of them is as three different questions you can ask of data.
Supervised learning: “Here are examples with answers — learn the pattern.”
Think of it like training a new team member. You hand them 10,000 customer support tickets, each one already tagged as “billing issue,” “technical problem,” or “feature request.” They read through them, build intuition, and eventually can classify new tickets on their own. That’s supervised learning. The “supervision” is the labels. Someone already sorted the examples.
This is the type behind most AI tools knowledge workers use. Email classification, medical image diagnosis, fraud detection, document categorisation. All supervised learning. The model was shown millions of labelled examples and learned the patterns that distinguish one category from another.
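To make the idea concrete, here's a toy supervised classifier in the spirit of the spam example — a crude word-frequency scorer, not any production algorithm. The six "emails" are invented; real systems learn from millions of labelled examples, but the mechanics are the same: labels in, patterns out.

```python
from collections import Counter

# Labelled training examples: the "supervision" is the spam/ham sorting.
spam_examples = ["win free money now", "free prize claim now", "money money free"]
ham_examples  = ["meeting notes attached", "lunch tomorrow?", "project status update"]

def word_counts(examples):
    """Count how often each word appears across a set of examples."""
    counts = Counter()
    for text in examples:
        counts.update(text.lower().split())
    return counts

spam_counts = word_counts(spam_examples)
ham_counts = word_counts(ham_examples)

def classify(email: str) -> str:
    # Score a new email by which class used its words more during training.
    words = email.lower().split()
    spam_score = sum(spam_counts[w] for w in words)  # missing words count 0
    ham_score = sum(ham_counts[w] for w in words)
    return "spam" if spam_score > ham_score else "ham"

print(classify("claim your free money"))          # spam
print(classify("status update for the project"))  # ham
```

Nobody wrote a rule containing "free money". The classifier inferred that those words signal spam purely from how they were distributed across the labelled examples.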
Unsupervised learning: “Here’s data with no labels — find the structure.”
Now imagine handing that same team member your entire customer database with no categories at all. No labels, no tags, no sorting. Just raw data. And you say: “Find the patterns.”
They might come back and tell you there are five natural groupings in your customer base: budget-conscious buyers, premium-loyal customers, seasonal-only purchasers, and two more you’d never identified. That’s unsupervised learning. Nobody told the system what to look for. It found structure you didn’t know was there. Used for customer segmentation, anomaly detection, and spotting fraud patterns that don’t fit any existing category.
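The classic algorithm for this kind of grouping is k-means clustering. Here's a one-dimensional sketch on invented annual-spend figures: we tell it how many groups to look for, but never what the groups mean — the structure emerges from the data.

```python
# Unlabelled data: annual spend per customer (illustrative numbers only).
spend = [120, 150, 130, 900, 950, 880]

def kmeans_1d(data, centres, rounds=10):
    """Toy 1-D k-means: alternate assigning points to the nearest centre
    and moving each centre to the mean of its assigned points."""
    for _ in range(rounds):
        groups = {c: [] for c in centres}
        for x in data:
            nearest = min(centres, key=lambda c: abs(x - c))
            groups[nearest].append(x)
        # A centre with no points is dropped in this simplified sketch.
        centres = [sum(g) / len(g) for g in groups.values() if g]
    return sorted(centres)

print(kmeans_1d(spend, centres=[0, 1000]))  # two clusters: ~133 and ~910
```

Two natural groupings — low spenders and high spenders — fall out without any labels. A human then supplies the interpretation ("budget-conscious" vs "premium"), which is the part the algorithm can't do.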
Reinforcement learning: “Try things — I’ll tell you when you’re getting warmer.”
This one works differently. Imagine training a robot to navigate a warehouse. You don’t give it a map. You let it try paths. When it finds an efficient route, it gets a reward signal. When it bumps into a shelf, it gets a penalty. Over thousands of attempts, it gets pretty good at navigating. Not because anyone programmed the optimal path, but because it discovered one through trial and error.
This is how AlphaGo learned to beat the world’s best Go players. It’s how Amazon trains warehouse robots. And it’s how recommendation engines adapt to your behaviour. Every click, skip, and lingering pause is a signal that shapes what gets shown next.
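A stripped-down version of the warehouse idea fits in a few lines: Q-learning on a five-cell corridor, with a reward at the far end and a small cost per step. Everything here (states, rewards, learning rate) is invented for illustration; AlphaGo and real robots use vastly more sophisticated versions of the same feedback loop.

```python
import random

random.seed(0)  # make the trial-and-error runs repeatable
N_STATES, GOAL = 5, 4
# Q-table: the learned value of each move (-1 = left, +1 = right) in each state.
q = {(s, a): 0.0 for s in range(N_STATES) for a in (-1, +1)}

for _ in range(500):  # 500 episodes of trial and error
    s = 0
    while s != GOAL:
        # Mostly pick the best-known move; occasionally explore at random.
        if random.random() < 0.1:
            a = random.choice((-1, +1))
        else:
            a = max((-1, +1), key=lambda m: q[(s, m)])
        s2 = min(max(s + a, 0), GOAL)
        reward = 1.0 if s2 == GOAL else -0.01  # prize at the goal, cost per step
        # Nudge the estimate toward reward plus the value of the best next move.
        q[(s, a)] += 0.5 * (reward + 0.9 * max(q[(s2, -1)], q[(s2, +1)]) - q[(s, a)])
        s = s2

# The learned greedy policy: the best move from each non-goal state.
policy = [max((-1, +1), key=lambda m: q[(s, m)]) for s in range(GOAL)]
print(policy)  # [1, 1, 1, 1] — always step right
```

Nobody told the agent "go right". It bumped around, collected penalties and rewards, and the optimal route emerged from the feedback — the same shape of learning, at toy scale, that trains game-playing systems and recommendation engines.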
Try This: Think about a decision you make at work that involves pattern recognition. Evaluating a CV, assessing a report, spotting a risk in a proposal. How would you teach someone new to do it? You wouldn’t hand them a 50-page rulebook. You’d show them examples. “Here’s a strong CV. Here’s a weak one. Here’s why.” You’d let them build intuition from data. That’s supervised learning. You already think like a machine learning system; you just do it slower.
Training Data — Where AI’s Knowledge (and Biases) Come From
Ever asked an AI tool about a topic specific to your country or industry and got a response that felt generically American? That’s not the model being opinionated. It’s the training data.
What goes in determines what comes out. An ML model learns from the data it’s trained on. If that data over-represents certain perspectives, the model will too. This isn’t a moral judgment. It’s a mechanical reality, as predictable as a mirror reflecting whatever’s placed in front of it.
The consequences are well-documented and specific:
Amazon’s recruiting tool was trained on 10 years of the company’s hiring data. Because that history skewed male, the system learned to downgrade resumes containing the word “women’s” (as in “women’s chess club captain”). Amazon scrapped the tool entirely.
Facial recognition systems studied by MIT researcher Joy Buolamwini misclassified gender for around 1% of white men but up to 35% of Black women. The training datasets contained far more white male faces. The system worked brilliantly for the people who looked like the training data and failed for everyone else.
A UNESCO study found that major language models associate women with “home” and “family” four times more often than men, while linking male-sounding names to “business,” “career,” and “executive.” The internet text these models were trained on reflects decades of real-world gender patterns.
Misconception: “AI bias is a bug that will be fixed in the next update.” Reality: Bias is structural. It reflects the data, and the data reflects the world, with all its historical imbalances. Models can be improved, made more representative, and tested for specific biases. But perfect neutrality isn’t achievable. Which means the human reviewing AI output has a permanent job.
Why does this matter practically? When you use an AI tool and the output feels skewed (toward a particular culture, assumption, or perspective), you’re seeing the statistical weight of the training data. Recognising that gives you something valuable: the ability to notice it, name it, and adjust for it. I reckon the practitioners who get the best results from AI are the ones who’ve calibrated their instinct for when the training data is doing the talking.
Key Term: Training Data — The dataset used to teach a machine learning model its patterns. Training data quality — its accuracy, representativeness, and balance — is the single biggest factor in whether the resulting model is useful or harmful.
Neural Networks — The Concept (Not the Maths)
Your phone recognises your face. Not by measuring the distance between your eyes and comparing it to a database. Something a bit more interesting is happening.
A neural network is layers of simple pattern detectors stacked on top of each other. Each layer handles one level of complexity, passing its findings up to the next.
First layer: edges. The network spots basic lines, curves, and contrasts. Just geometry. Nothing meaningful yet.
Next layer: shapes. Those edges get combined into recognisable features. The curve of a nostril, the outline of an eyebrow, the shadow under a cheekbone.
Next: objects. The features combine into faces, expressions, identities. Your phone doesn’t follow a rulebook of facial measurements. It learned to recognise you through layers of increasingly complex pattern detection.
The “deep” in deep learning just means more layers. A system with 3 layers is shallow. A system with 100 layers is deep. Modern AI models stack many layers, each one building on the patterns discovered by the layer below.
To get a sense of scale: GPT-4, the model behind ChatGPT, has an estimated 1.7 trillion parameters. Think of parameters as knobs. During training, the system adjusted 1.7 trillion knobs until it got good at predicting language patterns. That’s a fair bit of tuning. And it all happened before you ever typed a prompt.
Something worth remembering: the “learning” already happened. By the time you use a model, the knobs are set. You’re interacting with a frozen snapshot of patterns discovered in data. The model isn’t learning from your conversation (unless the provider explicitly retrains on user data). It’s applying what it already learned during training.
Tip: You don’t need to understand backpropagation, gradient descent, or any of the maths behind neural networks. What matters is the mental model: these systems build understanding in layers, from simple to complex. That’s why they can be surprisingly good at pattern recognition (billions of tuned detectors working together) and pretty much blind to things outside what they were trained on.
This layered pattern detection is the engine underneath every AI tool you use. The next module explains what happens when you point that engine specifically at language.
Apply This Monday
Pick an AI tool you use at work. Ask it something specific to your industry or region: a question about NZ-specific regulations, a niche professional standard, or local market conditions. Something that wouldn't dominate the internet text it was trained on. Notice if the response defaults to US-centric or generic assumptions. If it does, that's training data bias at work. Write down what you notice. You're building the calibration you'll rely on every time you evaluate AI output.
