The Edge Case Is Where Your AI Project Dies

This entry is part 5 of 20 in the series The Augmented Human

TL;DR: An edge case is any situation outside the pattern an AI was trained on. Machine learning systems are structurally incapable of handling them, because the system has never seen them before and has no judgment about what to do when it doesn’t know. Edge cases cost a wrong answer at small scale, a lawsuit at medium scale, and a death at large scale. The only fix is a human in the loop wherever the cost of a single failure exceeds the convenience of automation. Map your edge cases before you deploy, not after a customer or a regulator finds them for you.

An edge case is the part of the job your AI doesn’t know how to do.

Not the part it does badly. The part it has never seen before, has no example of in its training data, and has no way to recognize that it’s in trouble. The part where the system gives an answer with full confidence and no idea that the answer is wrong.

Edge cases are not a bug in your AI rollout. They are the structural feature of every machine learning system that has ever shipped, and they are the reason most AI deployments either fail outright or quietly hurt the business that paid for them.

What an edge case actually is

Machine learning works by finding patterns in examples. You show the system a million pictures of cats labeled “cat” and a million pictures of dogs labeled “dog,” and it learns to tell them apart with high accuracy. The system isn’t reasoning about what makes a cat a cat. It’s matching patterns. When a new picture comes in, the system asks: does this look more like the cat pattern or the dog pattern?

That’s the entire mechanism. It works beautifully when the new picture is, in fact, a cat or a dog.

The edge case is the picture that’s a fox, a ferret, a stuffed animal, a cat-shaped cloud, or a dog in a cat costume. The system has never been shown those examples. It still has to give an answer. So it picks one, with full confidence, and gets it wrong.
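To make the forced choice concrete, here is a minimal Python sketch of the last step of a two-label classifier. It assumes the model ends in a softmax, which is common but not universal, and the raw scores for the fox photo are invented for illustration. The point is only that the probabilities are spread across the labels the system knows, so there is no slot for “neither.”

```python
import math

LABELS = ["cat", "dog"]  # the only answers this system can give

def softmax(scores):
    """Turn raw scores into probabilities that always sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(scores):
    """Return the best label and its probability. There is no 'neither' option."""
    probs = softmax(scores)
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best], probs[best]

# Hypothetical raw scores a trained model might emit for a photo of a fox.
fox_scores = [3.1, 0.4]
label, confidence = classify(fox_scores)
print(label, round(confidence, 2))  # prints: cat 0.94
```

Whatever the picture actually shows, the output is a known label with a number attached that looks like certainty.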

Now scale that mechanism up to the actual jobs companies are deploying AI for. A customer service bot trained on thousands of routine billing questions hits an angry customer whose card was charged twice for a flight that got canceled, and the bot has no example of what to do. A legal AI trained on standard contracts hits a clause it has never seen. A medical AI trained on common presentations hits a rare condition that mimics a common one. Each system gives an answer, with full confidence, and gets it wrong.

Why no amount of training fixes this

The obvious response is: just train the system on more examples. If the AI is failing on angry customers, train it on more angry customers. If it’s failing on rare medical conditions, train it on more rare conditions.

This works, partially, on the cases you can predict. The problem is that the edge cases you can predict aren’t really the edge. The actual edge is the case nobody on the team thought to put in the training set, because nobody could imagine it.

And the world produces those constantly. Every new product, every new customer demographic, every new regulation, every new fraud pattern, every new emotional state a real human walks in with on a Tuesday morning creates new edges the system has never been shown. You cannot train your way out of this problem. You can only build the system to recognize when it’s outside what it knows and hand the job to a human.

That’s the only fix. Everything else is the executive equivalent of hoping.
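One way to sketch “recognize when it’s outside what it knows” is a distance check: if a new input is far from every training example, the system declines to answer and escalates. The Python below is an illustration under stated assumptions, not a production design. The two-number feature vectors, the `MAX_DISTANCE` cutoff, and the `escalate_to_human` label are all hypothetical, and a real system would need a cutoff tuned on held-out data.

```python
import math

# Hypothetical 2-D feature vectors standing in for the training examples.
TRAINING = {
    "cat": [(1.0, 1.2), (0.9, 1.0), (1.1, 0.8)],
    "dog": [(4.0, 4.1), (4.2, 3.9), (3.8, 4.3)],
}
MAX_DISTANCE = 1.0  # assumed cutoff for "looks like something we have seen"

def nearest(features):
    """Return (label, distance) for the closest training example."""
    return min(
        ((label, math.dist(features, example))
         for label, examples in TRAINING.items()
         for example in examples),
        key=lambda pair: pair[1],
    )

def decide(features):
    """Answer only when the input resembles the training data; otherwise escalate."""
    label, distance = nearest(features)
    if distance > MAX_DISTANCE:
        return "escalate_to_human"
    return label

print(decide((1.0, 1.0)))  # close to the cat examples
print(decide((2.5, 9.0)))  # far from everything seen in training
```

A check like this only catches inputs that look unfamiliar in the features the system measures. An edge case that resembles a routine one still gets through, which is why the human stays in the loop.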

The cost at three scales

The reason this matters is that the cost of an unhandled edge case doesn’t stay constant. It scales with what you put the AI in charge of.

Small scale: a wrong answer

At the bottom of the stack, an edge case is a wrong answer. A chatbot tells a customer the wrong shipping date. A search assistant invents a citation that doesn’t exist. A spreadsheet AI mislabels a column. The customer gets frustrated, the analyst catches the mistake, life goes on. Annoying, recoverable, and the cost is measured in time.

This is the level most people think AI failures live at. They don’t.

Medium scale: a lawsuit

The next level up, an edge case is a legal exposure. Air Canada deployed a customer service chatbot that invented a refund policy that didn’t exist. A customer relied on the chatbot’s promise, the company refused to honor it, the customer took the airline to British Columbia’s Civil Resolution Tribunal, and the tribunal ruled that the company was responsible for what its chatbot said. The “the chatbot was wrong” defense did not work.

That’s the medium-scale edge case. A hallucinated answer your AI gave to a customer, with full confidence, that becomes a binding promise you have to either honor or fight in court. Multiply that across the volume of interactions a customer service AI handles, and the math gets ugly fast.
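A back-of-the-envelope version of that math, in Python. Every number here is a made-up assumption chosen for illustration, not a measured rate for any real chatbot.

```python
# Hypothetical figures, for illustration only.
conversations_per_month = 100_000
one_invented_answer_in = 1_000  # assume 1 in 1,000 answers is confidently invented

# Expected number of invented answers in a month.
expected_invented = conversations_per_month / one_invented_answer_in

# Chance that at least one invented answer goes out that month,
# treating conversations as independent.
p_single = 1 / one_invented_answer_in
p_at_least_one = 1 - (1 - p_single) ** conversations_per_month

print(expected_invented)         # prints: 100.0
print(p_at_least_one > 0.999)    # prints: True
```

Under those assumptions, a bot that is right 99.9 percent of the time still hands out about a hundred invented answers a month, and any one of them can be the one a customer relies on.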

I get into the specific failure mode of confident invention in AI Hallucination: A Survival Guide for People Who Publish Under Their Own Name. The mechanism is the same whether the output is a customer service answer or a chapter of your book.

Large scale: a death

At the top of the stack, an edge case is a person’s life.

Self-driving cars are the most public version of this. The system handles every routine driving scenario it was trained on. The edge case is the pedestrian whose silhouette the model has never seen, the bicycle wobbling in a way the model has never been shown, the traffic backup caused by something the model can’t categorize. The cost of the unhandled edge case at this scale isn’t a wrong answer. It’s a human being.

Medical AI sits in the same category. So does any system that decides who gets a loan, who gets a job interview, who gets parole, who gets removed from a no-fly list. The cost of the system being confidently wrong, at scale, on the cases it was never trained for, can be a life destroyed or ended.

What this means for your rollout

Before you deploy any AI system into a real workflow, you have to answer one question honestly. What does the cost of a single failure look like, and who pays it?

If the answer is “the user has to retry the search,” fine. Ship the system, let it fail occasionally, and move on. The cost is small, the cost lands on the user, and the user has another search engine they can use.

If the answer is “the company has to honor an invented refund policy,” you have a problem the law has already decided. Build a human in the loop for any answer that creates an obligation. The cost of that human’s time is less than the cost of the lawsuit you will eventually lose.

If the answer is “a person gets hurt,” you cannot deploy this system without a human in the loop, full stop. The cost of getting this wrong is something money cannot get back, and no amount of training data ever solves the next edge case the world will produce.
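The three answers above can be written down as a simple lookup, which is one way to start the mapping exercise. This Python sketch is only an illustration: the `FailureCost` tiers mirror the three cases in the text, while the workflow names and the wording of each deployment mode are hypothetical.

```python
from enum import Enum

class FailureCost(Enum):
    RETRY = 1       # the user just tries again
    OBLIGATION = 2  # the answer could bind the company (refund, price, contract)
    HARM = 3        # a person could be hurt

def deployment_mode(cost):
    """Map the cost of a single failure to how much human oversight is required."""
    if cost is FailureCost.RETRY:
        return "automate"
    if cost is FailureCost.OBLIGATION:
        return "human reviews answers that create an obligation"
    return "human in the loop on every decision"

# Hypothetical workflows, each tagged with the cost of one failure.
WORKFLOWS = {
    "site search": FailureCost.RETRY,
    "refund chatbot": FailureCost.OBLIGATION,
    "triage assistant": FailureCost.HARM,
}

for name, cost in WORKFLOWS.items():
    print(f"{name}: {deployment_mode(cost)}")
```

The code is trivial on purpose. The hard part is the tagging: someone has to decide, workflow by workflow, which tier a single failure lands in.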

The mistake every Klarna-style rollout makes is answering the cost question with what the AI handles 95 percent of the time. The question is what happens the other 5 percent.

The rule

Keep a human in the loop wherever the cost of one unhandled edge case exceeds the convenience of automating the routine ones.

That’s the whole rule. The work of adopting AI well is the work of mapping where that line falls for every system you deploy. Most rollouts fail because nobody did the mapping. They benchmarked the AI against the easy cases, took the savings, and waited for the hard ones to arrive.

The hard ones always arrive.

Frequently Asked Questions

What is an edge case in AI?
An edge case is any situation outside the pattern an AI was trained on. It’s not a case the system does badly. It’s a case the system has never seen before. Because machine learning works by matching new inputs to known patterns, the system still gives an answer when it encounters an edge case. It just has no way to know that the answer is wrong.

Why can’t you just train the AI on more examples?
You can train on more of the edge cases you can predict, but the actual edge is the case nobody thought to include because nobody could imagine it. The world produces those constantly, through new products, new customers, new regulations, new fraud patterns, and new emotional states people show up with. You cannot train your way out of the problem. You can only build the system to recognize when it’s outside what it knows and hand the case to a human.

What’s the worst that can happen with an unhandled edge case?
It depends on what you put the AI in charge of. At small scale you get a wrong answer the user can recover from. At medium scale you get a legal exposure, like Air Canada’s chatbot inventing a refund policy that a tribunal held the company responsible for. At large scale, in systems like self-driving cars, medical diagnosis, or any system that decides about someone’s life, the cost of one unhandled edge case can be a person hurt or killed.

How do I know if my AI system has an edge case problem?
Every AI system has an edge case problem. The question is whether you’ve mapped where the failures will land and what they will cost. Run through your workflow and ask: what does the system do when it doesn’t know? If the answer is “it gives an answer anyway with full confidence,” you have an unmapped edge case waiting to find you. The fix is to build the system to recognize when it’s outside its training and route the case to a human, before the case finds the customer or the regulator instead.

When is it safe to deploy AI without a human in the loop?
When the cost of a single failure is genuinely small and lands on someone who can recover from it. A search engine giving an occasionally wrong result is fine. A chatbot answering FAQs about store hours is fine. The user retries, the cost is measured in seconds, and nobody gets hurt. The minute the cost rises to a binding promise, a regulated decision, or a person’s safety, you need a human in the loop. The convenience of automation is never worth the cost of one edge case you didn’t see coming.

What’s the one rule for AI deployment?
Keep a human in the loop wherever the cost of one unhandled edge case exceeds the convenience of automating the routine ones. That’s the whole rule. The work of adopting AI well is the work of mapping where that line falls for every system you deploy. Most rollouts fail because nobody did the mapping. They benchmarked the AI against the easy cases, took the savings, and waited for the hard ones to arrive. The hard ones always arrive.


📝 Disclaimer

The views and opinions expressed in this blog post are solely those of Richard Lowe and are based on personal experience and research. This content is for informational purposes only and should not be construed as professional legal, financial, accounting, or business advice. Always consult with qualified professionals before making important business or legal decisions. Richard Lowe is not a lawyer, accountant, or licensed professional advisor, and this content does not establish any professional relationship.
