AI in Customer Service: What Actually Works

This entry is part 9 of 20 in the series The Augmented Human

TL;DR: Customer service is the single most public AI failure category in business right now, and the failure mode is always the same. Executives confuse “AI handles 70 percent of tickets” with “AI replaces the customer service team.” The remaining 30 percent are exactly the customers whose relationships matter most: the angry ones, the high-value ones, the ones with legal exposure, the ones who need a human and have learned the system won’t give them one. Here’s where AI actually wins in customer service, where it ruins everything, and the checklist that separates a deployment from a disaster.

Of every category where companies are deploying AI right now, customer service is producing the most public failures. The doctor’s office chatbot that traps the patient. The Klarna playbook that handed the work of roughly 700 agents to a chatbot and is now bringing humans back. The airline chatbot that invented a refund policy a tribunal forced the airline to honor.

The failures aren’t because customer service is uniquely hard for AI. They’re because customer service is uniquely visible. Every customer interaction is a moment where the company’s promise meets the customer’s actual problem, and every bad interaction propagates fast: to a review, to social media, to the customer’s network, to the customer’s lifetime value, to the brand.

So this is the category to get right. Here’s how it actually breaks down.

The work customer service actually does

Customer service is not one job. It’s at least four:

Routine resolution. Order status, return processing, password resets, billing questions about straightforward charges, store hours, product information that exists in the documentation. The bulk of inbound volume. Pattern-based, bounded, low-stakes, and answerable from existing information.

Complex resolution. Billing disputes that require judgment, returns outside the policy that might still be the right call, product issues that need engineering input, configuration help that depends on the customer’s specific setup. Slower, more variable, requires judgment, sometimes requires escalation.

Relationship repair. The angry customer. The customer with the legitimate complaint. The customer who has been wronged and needs the company to acknowledge it. The customer whose relationship with the brand is on fire and needs a human to walk into the room and put it out.

Edge cases and exceptions. The fraud pattern nobody has seen before. The unusual customer who needs something the system doesn’t account for. The legal exposure that needs human judgment about what to commit to. The press inquiry that came in through the customer service queue by mistake.

The mistake every failed AI deployment makes is treating all four as the same job. They aren’t. AI handles one of them well, helps with another, and breaks badly on the other two.
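One way to make that split explicit is to write it down as a routing table before any model is involved. The sketch below is illustrative only: the names `Category`, `Handling`, `ROUTING`, and `handling_for` are hypothetical, and it assumes some upstream step has already classified the ticket.

```python
from enum import Enum
from typing import Optional


class Category(Enum):
    ROUTINE = "routine resolution"
    COMPLEX = "complex resolution"
    REPAIR = "relationship repair"
    EDGE_CASE = "edge cases and exceptions"


class Handling(Enum):
    AI_RESOLVES = "AI resolves, human reachable on request"
    AI_DRAFTS = "AI drafts, human reviews and sends"
    HUMAN_ONLY = "human handles from the start"


# Hypothetical routing table mirroring the four jobs described above.
ROUTING = {
    Category.ROUTINE: Handling.AI_RESOLVES,
    Category.COMPLEX: Handling.AI_DRAFTS,
    Category.REPAIR: Handling.HUMAN_ONLY,
    Category.EDGE_CASE: Handling.HUMAN_ONLY,
}


def handling_for(category: Optional[Category]) -> Handling:
    """Return the handling mode; an unclassified ticket defaults to a human."""
    if category is None:
        return Handling.HUMAN_ONLY
    return ROUTING[category]
```

The design choice worth noticing is the default: a ticket nobody could classify goes to a person, not to the chatbot.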

Where AI wins in customer service

Routine resolution. The bulk-volume, pattern-based, low-stakes inquiries that make up the majority of inbound tickets. Order status. Where’s my package. Reset my password. What time do you open. Return this item. Refund this charge that’s clearly a duplicate.

For this category, AI is genuinely better than a human in most ways. Faster. Available 24/7. Patient with repetitive questions. Consistent on policy. It doesn’t get tired, doesn’t have a bad day, doesn’t accidentally tell the customer something different from what the last agent told them.

A well-designed AI deflection system can handle 60 to 70 percent of total ticket volume on routine cases alone. That’s a real productivity gain that doesn’t require firing anyone, because the volume of routine tickets is what was burning out your human team in the first place.

Where AI helps but doesn’t replace

Complex resolution. AI works as an assistant here, not as the agent of record. The system reads the incoming ticket, pulls relevant context from the customer’s history, drafts a likely response, and surfaces the relevant policy or knowledge base articles. The human agent reviews, refines, makes the call, and sends.

This is the same pattern that works in legal AI, medical AI, and code-completion AI. The machine does the research and the boilerplate. The human does the judgment. Ticket throughput per agent goes up. Quality stays where it was, or goes up because the agents aren’t context-switching across 200 tickets a day.

The mistake here is removing the human from the loop because the AI’s first draft was correct 80 percent of the time. The other 20 percent contains the cases that destroy customer relationships, and the agent is the only line of defense against them.
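In system terms, keeping the human in the loop means the send step checks for an approval instead of trusting the draft. Here is a minimal sketch of that gate, assuming a hypothetical `Draft` record and a `routine` flag set by classification before the AI answered; none of these names come from a real helpdesk product.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Draft:
    ticket_id: str
    text: str
    routine: bool                      # classified as routine before the AI answered
    approved_by: Optional[str] = None  # agent who reviewed the draft, if any


def approve(draft: Draft, agent: str, edited_text: Optional[str] = None) -> None:
    """Record the reviewing agent and any edits they made to the AI's draft."""
    if edited_text is not None:
        draft.text = edited_text
    draft.approved_by = agent


def can_send(draft: Draft) -> bool:
    """Non-routine drafts go out only after a human has approved them."""
    return draft.routine or draft.approved_by is not None
```

A draft that was right 80 percent of the time still passes through `approve` every time on non-routine tickets, which is the whole point.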

Where AI ruins customer service

Relationship repair. This is where every failed deployment burns. The angry customer who needs to be heard, validated, and given a real response from a real human cannot get that from a chatbot, and they will not pretend otherwise. The customer with a legitimate complaint will not be satisfied by a faster wrong answer. The customer whose relationship is on fire will not be put back together by efficient deflection.

The Klarna reversal was about this. The customer satisfaction collapse wasn’t because the AI got the easy 70 percent wrong. It was because the hard 30 percent had nowhere to go. The customers whose relationships mattered most got the worst service the company had ever provided, because the humans who used to handle those customers had been replaced by a chatbot designed for the easy work.

If your AI customer service strategy doesn’t have a fast, friction-free, no-questions-asked path for the customer to reach a human, you are not running an AI deployment. You are running a customer-departure machine. The doctor’s office chatbot that traps the patient for five minutes is a small version of this. The Klarna version is the same machine at scale. Why Most AI Rollouts Fail walks through that case in detail.

Where AI is actively dangerous in customer service

Edge cases with legal or financial exposure. AI invents policies that don’t exist. Air Canada’s chatbot promised a refund policy the company hadn’t authorized, and a British Columbia tribunal ruled the company was bound by it. The “the chatbot was wrong” defense did not work.

Anywhere your AI is making promises to customers that can become legally binding, you need a human in the loop. Anywhere it’s making promises that can become financial commitments, you need a human in the loop. Anywhere it’s making statements about your products, services, or policies that could later become evidence in a dispute, you need a human in the loop.
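One way to enforce that boundary is to never let generated text state a policy at all: the system may only quote wording that a human has already approved, and anything else becomes a handoff. The sketch below assumes a hypothetical `APPROVED_POLICIES` table with placeholder wording and hypothetical policy IDs; a real deployment would load reviewed text from wherever the company keeps it.

```python
from typing import Tuple

# Hypothetical table of pre-approved wording, keyed by policy ID.
# The strings are placeholders, not real policy text.
APPROVED_POLICIES = {
    "return_window": "Placeholder: approved return-window wording goes here.",
    "duplicate_charge": "Placeholder: approved duplicate-charge wording goes here.",
}

ESCALATION_MESSAGE = (
    "I want to get this right, so I'm connecting you with a member of our team."
)


def policy_reply(policy_id: str) -> Tuple[str, bool]:
    """Return (reply_text, needs_human).

    Only pre-approved wording is ever quoted. An unknown policy ID is
    treated as an edge case and handed to a human instead of answered.
    """
    text = APPROVED_POLICIES.get(policy_id)
    if text is None:
        return ESCALATION_MESSAGE, True
    return text, False
```

The model can still decide which policy a question is about, but it cannot author the commitment itself.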

The full mechanism of why this happens is in The Edge Case Is Where Your AI Project Dies. The short version is that AI invents plausible-sounding answers to questions it doesn’t have answers to, with full confidence and no signal to the customer that the system is making it up.

The deployment checklist

If you’re responsible for an AI customer service rollout, run through this before signing anything.

  • The AI deflects routine tickets only. Define “routine” before deployment, not by what the AI confidently answers, but by what the company is willing to bind itself to.
  • The AI never makes a commitment the company has not pre-approved. Refund policies, exception handling, escalation paths, all hard-coded, not generated.
  • Every customer can reach a human in two clicks or fewer. No phone trees that loop. No required interaction with the AI before a human is offered. No five-minute uninterruptible speech.
  • The AI hands off to a human for any ticket flagged as angry, escalated, or complex. Keep the human team sized for the 30 percent, not laid off based on the 70 percent.
  • The AI assists complex resolution by drafting responses and surfacing context, but does not send anything to a customer without a human review on any non-routine ticket.
  • Customer satisfaction is measured separately for AI-handled and human-handled tickets, and the company tracks both numbers monthly. If AI-handled CSAT drops below your target, the routing logic gets reviewed immediately.
  • Every AI response is logged with full context. The same logs are reviewed by humans on a sampling basis to catch hallucinated policies, invented commitments, or pattern errors before they become a class.
  • The human team gets sharper, not smaller. The AI handles the burnout-inducing volume so the humans can handle the high-value, high-judgment work, all the time, with appropriate compensation for the harder mix.
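
The measurement and log-review items on this list are the easiest to automate. The sketch below is one possible shape, not a prescribed implementation: the function names, the `"ai"` and `"human"` labels, the target value, and the sampling rate are all assumptions to be replaced with your own.

```python
import random
from statistics import mean


def csat_by_handler(tickets):
    """Average CSAT per handler from (handler, score) pairs.

    handler is "ai" or "human"; a handler with no scores maps to None.
    """
    scores = {"ai": [], "human": []}
    for handler, score in tickets:
        scores[handler].append(score)
    return {h: (mean(s) if s else None) for h, s in scores.items()}


def needs_routing_review(tickets, ai_target):
    """True when AI-handled CSAT has dropped below the chosen target."""
    ai_csat = csat_by_handler(tickets)["ai"]
    return ai_csat is not None and ai_csat < ai_target


def sample_for_review(logs, rate, seed=None):
    """Pick a random fraction of logged AI responses for human review."""
    if not logs:
        return []
    k = max(1, round(len(logs) * rate))
    return random.Random(seed).sample(logs, k)
```

Keeping the two CSAT numbers separate matters because a blended average can look healthy while the AI-handled share is quietly getting worse.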

The pattern

AI in customer service wins when it handles the routine and humans handle the relationship. AI loses when executives confuse “AI handles 70 percent of tickets” with “AI can replace the team.”

The companies running the second playbook are providing material for the Bloomberg headlines of roughly 18 months from now. The companies running the first playbook are quietly winning while their competitors learn the hard way.

Customer service is the sector where this lesson is going to be the most public, the most expensive, and the most easily avoided. Every executive making a deployment decision right now is choosing which side of that headline they’ll be on.

Frequently Asked Questions

What kinds of customer service tickets is AI actually good at?
Routine resolution. Order status, return processing, password resets, billing questions about straightforward charges, store hours, product information already in the documentation. Pattern-based, bounded, low-stakes inquiries that make up the bulk of inbound volume. AI is faster, more available, and more consistent than a human team on this work, and a well-designed deflection system can handle 60 to 70 percent of total ticket volume without firing anyone.

Where does AI break in customer service?
Relationship repair. The angry customer, the customer with a legitimate complaint, the customer whose relationship with the brand is on fire. None of these get fixed by a chatbot, and the customer knows it. Every failed AI customer service deployment has been a failure on this 30 percent of tickets, not the 70 percent the AI handled fine. The customers whose relationships matter most get the worst service the company has ever provided.

Is it legal for an AI to make promises to customers?
The promises can become legally binding regardless of whether a human authorized them. Air Canada’s chatbot invented a refund policy and a tribunal ruled the company was bound by it. The “the chatbot was wrong” defense did not work. Anywhere your AI is making statements that could become commitments, you need a human in the loop, or you need to hard-code the boundaries the AI cannot cross, or both.

How fast should customers be able to reach a human?
Two clicks or fewer. No phone trees that loop. No required interaction with the AI before a human is offered. No uninterruptible speech that runs for five minutes before the customer can do anything else. If a customer wants a human, the system gives them one, fast. Every minute of friction is a customer who learns that the company doesn’t want to talk to them.

Why do AI customer service rollouts keep failing publicly?
Because customer service is the most visible category in any company. Every interaction is a moment where the company’s promise meets the customer’s actual problem. Bad interactions propagate fast to reviews, social media, and the customer’s network. The 30 percent of tickets the AI can’t handle properly is exactly the 30 percent whose customers are most likely to be vocal about the experience. The failures show up in public faster than any other category.

What’s the right way to size the human team after deploying AI?
Size it for the hard 30 percent of tickets, not the 70 percent the AI deflects. The human team gets sharper, not smaller. The AI handles the burnout-inducing volume so the humans can spend their entire workday on the high-value, high-judgment work. Compensate the team appropriately for the harder mix, and ticket throughput per agent goes up while customer satisfaction stays where it was or improves. That’s the augmented model. The replace model is the one producing Bloomberg headlines about reversals.


📝 Disclaimer

The views and opinions expressed in this blog post are solely those of Richard Lowe and are based on personal experience and research. This content is for informational purposes only and should not be construed as professional legal, financial, accounting, or business advice. Always consult with qualified professionals before making important business or legal decisions. Richard Lowe is not a lawyer, accountant, or licensed professional advisor, and this content does not establish any professional relationship.
