This is the fourth installment of AI Fluency Corner, a 16-part weekly series that builds one connected mental model of artificial intelligence (AI) in plain language.
Open YouTube. Before you type a word, a video is already waiting for you. Open Gmail, and some messages are already in spam. Your banking app queries a transaction before you even notice the notification. None of this is guesswork. It’s machine learning, and all three are happening at once, each using a different type of machine learning.
Last week, we looked at algorithms, the step-by-step instructions that turn inputs into outputs. This week brings a twist. In many AI systems, these instructions are not written in advance. They emerge from the data.
What is machine learning, actually?
Machine learning is a branch of AI in which systems improve at tasks by finding patterns in data, rather than following rules written for every scenario. Instead of a developer writing, “If you see this phrase, mark it as spam,” the system examines thousands of emails that have already been labeled as spam or genuine and learns the fingerprints that distinguish one from the other.
This is a reversal: from following instructions to discovering patterns. Traditional programming requires humans to anticipate every case. Machine learning lets the system find what matters in the evidence and improve as new evidence arrives.
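To make the reversal concrete, here is a minimal Python sketch under stated assumptions. The six labeled emails are invented for illustration, the names `train` and `classify` are hypothetical, and simple word counting is a deliberately crude stand-in for what a real spam filter does. The point is only that no line of the code names a spam phrase.

```python
from collections import Counter

# Hypothetical labeled examples, invented for illustration.
LABELED_EMAILS = [
    ("win a free prize now", "spam"),
    ("claim your free voucher today", "spam"),
    ("free prize waiting claim now", "spam"),
    ("meeting moved to monday morning", "genuine"),
    ("please review the attached report", "genuine"),
    ("lunch on monday to review the budget", "genuine"),
]

def train(examples):
    """Count how often each word appears under each label."""
    counts = {"spam": Counter(), "genuine": Counter()}
    for text, label in examples:
        counts[label].update(text.split())
    return counts

def classify(counts, text):
    """Pick the label whose training vocabulary the message resembles more."""
    words = text.split()
    spam_score = sum(counts["spam"][w] for w in words)
    genuine_score = sum(counts["genuine"][w] for w in words)
    return "spam" if spam_score > genuine_score else "genuine"

model = train(LABELED_EMAILS)
print(classify(model, "free voucher waiting"))      # spam
print(classify(model, "review the monday report"))  # genuine
```

Nobody wrote a rule about the word “free.” The association came from the labels, which is also why different labels would produce different behavior.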
Three types, three different questions
Machine learning is not a single technology. There are three main types, and knowing which one is in play changes the questions worth asking.
- Supervised learning: the most common type in business. The system is trained on labeled examples, data for which the correct answer is already known. A credit model is shown past applications, each marked as repaid or in default, and learns the relationship between inputs and outcomes. Each time you click Report Spam, you provide a new labeled example that adjusts the filter. The model is only as reliable as its historical labels. If there was bias in past decisions, the model will inherit it and scale it. Labels are instructions disguised as history.
- Unsupervised learning: works without labels. The system is given raw data and asked to find structure within it. A retailer’s model can group customers by purchasing rhythm and response to promotions. The clusters arise from behavior rather than from prior ideas about what segments should exist. This is powerful for discovery, but there is no labeled ground truth to test against, which makes verification difficult.
- Reinforcement learning: training through interaction. The system takes actions, receives signals (rewards or penalties), and adjusts. A chatbot that improves each time an escalation is avoided and a pricing engine that updates based on market reactions both run on this principle. The reward function is the strategy. If the objective is poorly defined, the model will optimize for the wrong thing.
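The last point, that the reward function is the strategy, can be sketched in a few lines of Python. This is a simplified illustration, not a full reinforcement learning system: it keeps a running estimate of the reward for each of two actions and tries them in turn. The two chatbot behaviors, their numbers, and both reward functions are assumptions invented for the example.

```python
# Hypothetical outcomes for two chatbot behaviors (invented numbers).
OUTCOMES = {
    "quick_close": {"minutes": 2, "resolved": 0.4},
    "full_answer": {"minutes": 6, "resolved": 0.9},
}

def reward_speed(outcome):
    """Reward only fast ticket closure."""
    return -outcome["minutes"]

def reward_resolution(outcome):
    """Reward solved problems, with a small cost for time taken."""
    return 10 * outcome["resolved"] - 0.5 * outcome["minutes"]

def learn_policy(reward_fn, rounds=20, step=0.5):
    """Try each action in turn and keep a running estimate of its reward."""
    value = {action: 0.0 for action in OUTCOMES}
    actions = list(OUTCOMES)
    for i in range(rounds):
        action = actions[i % len(actions)]  # simple round-robin exploration
        reward = reward_fn(OUTCOMES[action])
        value[action] += step * (reward - value[action])  # move estimate toward reward
    return max(value, key=value.get)  # the action the system ends up preferring

print(learn_policy(reward_speed))       # quick_close
print(learn_policy(reward_resolution))  # full_answer
```

The learning code is identical in both runs. Only the definition of the reward changed, and the preferred behavior changed with it.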
Where this is already running
FNB’s banking app uses a supervised model trained on millions of account histories to score your financial health. A comparison that places you below your peers reflects patterns from labeled outcomes, not a human analyst reviewing your account.
Standard Bank’s anti-money laundering system uses unsupervised models to detect anomalous clusters across transaction networks. The system was not given a definition of suspicious. It found one by first establishing what normal looks like.
Takealot’s recommendation engine groups buyers by behavioral similarity, driven by conversion signals. It didn’t know you when you arrived. It learned from everyone who arrived before you.
Credit providers are using a combination of supervised and unsupervised models to manage informal and gig economy risks that could not be adequately assessed using older rules-based systems.
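The idea of establishing what normal looks like before flagging anything can be sketched in Python. This is an illustrative toy, not a description of any bank’s actual system: it looks at a single variable (transaction amount) rather than a network, the amounts are invented, and the names `fit_normal`, `unusual`, and the cut-off of 5 are assumptions.

```python
from statistics import median

def fit_normal(amounts):
    """Describe 'normal' as a typical value and a typical deviation from it."""
    centre = median(amounts)
    spread = median([abs(a - centre) for a in amounts])
    return centre, spread

def unusual(amount, centre, spread, cutoff=5):
    """Flag an amount that sits many typical deviations away from the centre."""
    return abs(amount - centre) / spread > cutoff

# Hypothetical unlabeled history: nobody has marked any of these as suspicious.
history = [95, 100, 105, 110, 120, 125, 130]
centre, spread = fit_normal(history)

new_transactions = [115, 98, 5000, 140]
flagged = [a for a in new_transactions if unusual(a, centre, spread)]
print(flagged)  # [5000]
```

No example was ever labeled suspicious. The flag comes purely from distance to what the data itself suggests is ordinary, which is also why a flag of this kind is a prompt for review rather than a verdict.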
Three questions before trusting the output
Machine learning produces output that looks reliable: scores, shortlists, recommendations and risk flags. Understanding the type behind each changes how much trust it deserves.
For supervised models, ask who labeled the examples and how. A label such as “successful adoption” can hide assumptions about what success meant at a particular point in time. For unsupervised models, ask what the clusters actually represent. Customers who sit close together in your data should not necessarily be treated alike. For reinforcement systems, ask what the reward is. A chatbot rewarded only for closing tickets quickly can frustrate customers with shallow answers.
When was it trained, and on what? A model built before a major economic change may have learned patterns that no longer apply. Training data has an expiration date. It is rarely printed on the label.
How often is it updated? A model that is accurate at deployment degrades as the world drifts away from its training assumptions. A vendor who cannot answer this is selling past performance as current capability.
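The decay described above can be shown with a small Python sketch. Everything in it is an assumption made for illustration: the debt-to-income figures are invented, and the “model” is just a single cut-off placed halfway between the two class averages. It is meant to show the shape of the problem, not how any real credit model works.

```python
def fit_threshold(examples):
    """Learn a cut-off halfway between the average defaulter and average repayer."""
    defaults = [x for x, defaulted in examples if defaulted]
    repaid = [x for x, defaulted in examples if not defaulted]
    return (sum(defaults) / len(defaults) + sum(repaid) / len(repaid)) / 2

def accuracy(threshold, examples):
    """Share of cases where 'above the cut-off' matches 'actually defaulted'."""
    correct = sum((x > threshold) == defaulted for x, defaulted in examples)
    return correct / len(examples)

# Hypothetical (debt-to-income ratio, defaulted?) pairs.
TRAINING_ERA = [(0.2, False), (0.3, False), (0.35, False),
                (0.6, True), (0.7, True), (0.8, True)]
# Later, households carry more debt across the board and still repay.
AFTER_SHIFT = [(0.5, False), (0.55, False), (0.6, False),
               (0.85, True), (0.9, True), (0.95, True)]

threshold = fit_threshold(TRAINING_ERA)
print(accuracy(threshold, TRAINING_ERA))  # 1.0
print(accuracy(threshold, AFTER_SHIFT))   # 0.5
```

The model did not change between the two measurements. The world did, and only a check against recent data reveals it.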
What this means for your job
Machine learning is no wiser than the business that built it. It scales whatever it finds, useful or not, and reflects the data, labels, objectives, and incentives it is given. A supervised model trained on a biased hiring history will reproduce that history in bulk. A reinforcement system that optimizes for the wrong reward becomes very good at producing the wrong outcome.
Fluency doesn’t require you to build a model. You need to know which type is used, what it was taught and how often the lessons are updated. A model with no one monitoring it is not a system. It is an assumption running unattended. A pattern is only useful while the world still resembles the one it was trained in.
Your task this week
Next time someone tells you that their product “uses machine learning,” ask these three questions: Which type of machine learning is it? What data was it trained on? How often is it updated? The answers are more valuable than any product brochure.
• Mafinyani is a senior partner in financial engineering and artificial intelligence at Intellica Analytics, a firm specializing in finance, risk and applied technology. Next week: Neural networks and deep learning — why modern AI can see, hear, and transcribe, and what that means for the tools on sale.
