Why Digital Transformations Actually Fail

This entry is part 13 of 14 in the series Technology

TL;DR

Most digital transformations fail for an unglamorous reason: poor testing. Everyone tests whether the system works normally. Almost nobody tests what happens under real load, or what happens when things break. I watched systems go live with untuned database queries that only showed their problems once thousands of people were on them. Load testing and failure-mode testing are where transformations die. People resistance matters too, and it tracks exactly with whether someone thinks the change threatens their job.

When a digital transformation fails, the post-mortem usually blames strategy or vision or leadership. In my experience, the real cause is almost always something far more boring. The testing was not good enough.

I led transformations at a national retailer for two decades, and now I ghostwrite books for the leaders who run them. The failure point was rarely the grand plan. It was the unsexy work of testing that nobody wanted to fund.

Load testing is where systems die

Here is the pattern I saw over and over. We would build a new system. The programmers tested their pieces and did a pretty good job. We tested our pieces and did a pretty good job. Everything worked in testing. Then we went live, thousands of real people logged on, and the system fell over.

The culprit was usually load. We had Oracle queries that were not tuned, and they were fine with a handful of test users. Under real load, with everyone using the system at once, those same queries crawled. Here is why this happens and why it is so hard to catch: a database query that scans a little data runs instantly when ten people use it, but the same query against the full production dataset, hit by thousands of users at the same moment, can grind to a halt. The behavior is not linear. Response time stays roughly flat while the database has spare capacity, then climbs steeply once requests start queuing behind one another. A system that is perfectly snappy with test traffic can collapse completely under real traffic, and there is often no warning until you cross the threshold. We ended up tuning those queries after go-live instead of before, because at the time we had no good way to replicate the real load of a big system with a lot of people on it.
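The non-linear behavior can be illustrated with a textbook single-server queueing model (M/M/1). This is a simplified sketch, not a model of Oracle or of any system I worked on: it assumes requests arrive at random, are served one at a time, and take a fixed average time each. The function name and every number in it are illustrative assumptions.

```python
def mean_response_time(arrival_rate: float, service_time: float) -> float:
    """Mean time a request spends in an M/M/1 queue (waiting plus service).

    arrival_rate: requests per second.
    service_time: mean seconds needed to serve one request.
    """
    utilization = arrival_rate * service_time
    if utilization >= 1.0:
        # Work arrives at least as fast as it can be served,
        # so the queue grows without bound.
        return float("inf")
    return service_time / (1.0 - utilization)


# Assumed: a query that takes 10 ms when nothing else is running.
SERVICE_TIME = 0.010

for rate in (1, 50, 90, 99, 100):
    seconds = mean_response_time(rate, SERVICE_TIME)
    print(f"{rate:>3} req/s -> {seconds * 1000:.1f} ms")
```

Under these assumed numbers, going from 50 to 99 requests per second takes the modelled response time from about 20 ms to about a full second. Traffic roughly doubled; the wait grew about fifty-fold. That is the cliff a handful of test users will never find.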

Your new system works perfectly in testing. Then a thousand real users log on at once and it falls over. That’s what load testing is for, and almost nobody does enough of it.

That is the lesson I would press on anyone running a transformation. Do far more load testing than you think you need. Simulate the real volume, the real number of concurrent users, the real size of the production data, before you go live, not after. The system that passes every functional test can still die the moment real volume hits it, and finding that out in production, with the whole company watching the new system fail on day one, is the worst possible time and the worst possible way. This connects directly to the component-by-component approach I describe in “digital transformation is plumbing”: when you move one piece at a time, each piece meets real load on its own, so a load problem shows up small and contained instead of taking down everything at once.
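Here is a minimal sketch of what simulating concurrent users looks like in code. It assumes nothing about any real system: `run_query` is a hypothetical stand-in that sleeps behind a lock to mimic a database serving one query at a time, and the user counts are arbitrary. In a real test you would point this kind of harness, or more likely a dedicated load-testing tool, at a staging copy loaded with production-sized data.

```python
import statistics
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a shared resource that serves one query at a time.
_db_lock = threading.Lock()


def run_query() -> float:
    """Hypothetical query: returns seconds from request to response."""
    start = time.perf_counter()
    with _db_lock:            # requests queue up here under contention
        time.sleep(0.002)     # pretend the query itself takes about 2 ms
    return time.perf_counter() - start


def load_test(concurrent_users: int, requests_per_user: int = 5) -> dict:
    """Fire requests from many simulated users at once and summarize latency."""
    total = concurrent_users * requests_per_user
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        latencies = list(pool.map(lambda _: run_query(), range(total)))
    latencies.sort()
    return {
        "users": concurrent_users,
        "median_ms": statistics.median(latencies) * 1000,
        "p95_ms": latencies[int(0.95 * (len(latencies) - 1))] * 1000,
    }


if __name__ == "__main__":
    for users in (1, 10, 50):
        print(load_test(users))
```

The point of the sketch is the shape of the result, not the numbers: the same "query" that answers in a couple of milliseconds for one user takes far longer once dozens of users are waiting on the same resource. Reporting a high percentile alongside the median matters because the slowest requests are the ones real users complain about.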

Nobody tests for failure

The other weak spot was failure-mode testing. We tested how the system worked when everything went right. We rarely tested what happened when something went wrong. What breaks when a server drops? When a connection times out? When a piece of data is malformed, or a disk fills up, or a dependency the system relies on is suddenly unavailable? Those paths went largely untested, because testing the normal path felt like enough, and the normal path is what everyone naturally thinks to check.
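A failure-mode test for an unavailable dependency can be as small as the sketch below. Everything in it is hypothetical: `price_for`, the two fake services, and the choice to fall back to a cached value are illustrative assumptions, not a description of any system I worked on. The technique is fault injection: swap the real dependency for one that fails on purpose, then check that the caller degrades instead of crashing.

```python
from typing import Callable, Optional


def price_for(sku: str, fetch: Callable[[str], float],
              cache: dict) -> Optional[float]:
    """Hypothetical lookup: use the live service, else the last known price."""
    try:
        price = fetch(sku)
    except (TimeoutError, ConnectionError):
        # Dependency is down or slow: degrade to cached data, do not crash.
        return cache.get(sku)
    cache[sku] = price
    return price


def healthy_service(sku: str) -> float:
    """Stand-in for the dependency when everything goes right."""
    return 9.99


def timed_out_service(sku: str) -> float:
    """Fault injection: simulate the dependency timing out."""
    raise TimeoutError("price service did not respond")


cache = {}
assert price_for("A1", healthy_service, cache) == 9.99     # normal path
assert price_for("A1", timed_out_service, cache) == 9.99   # falls back to cache
assert price_for("B2", timed_out_service, cache) is None   # nothing cached, no crash
```

Only the first assertion covers the normal path. The other two are the ones that tend to be missing.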

It is not enough. Systems spend plenty of time outside the normal state. Things fail constantly in real operation: hardware dies, networks hiccup, data arrives malformed. The moments your system hits those conditions are exactly the moments you most need it to fail gracefully instead of catastrophically. A system that handles the happy path beautifully and then corrupts data or crashes hard the first time it meets a malformed record is a system that was never really finished. Testing only the happy path is testing for a world that does not exist. The same lesson shows up in security, where, as I describe in “the boring security work that keeps you safe,” the backup you never tested is the one that fails when you finally need it. Untested failure handling is the same trap in a different costume.
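For malformed data, failing gracefully usually means rejecting the bad record, recording why, and carrying on with the good ones. The sketch below shows one way to test that. The importer, its field names, and the sample rows are all made up for illustration.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class ImportResult:
    loaded: list = field(default_factory=list)     # rows that were accepted
    rejected: list = field(default_factory=list)   # (line number, reason) pairs


def import_orders(text: str) -> ImportResult:
    """Hypothetical importer: load good rows, quarantine bad ones, never crash."""
    result = ImportResult()
    reader = csv.DictReader(io.StringIO(text))
    for line_no, row in enumerate(reader, start=2):  # line 1 is the header
        try:
            order = {"id": int(row["id"]), "amount": float(row["amount"])}
            if order["amount"] < 0:
                raise ValueError("negative amount")
        except (KeyError, TypeError, ValueError) as exc:
            result.rejected.append((line_no, str(exc)))
            continue
        result.loaded.append(order)
    return result


def test_malformed_rows_are_quarantined():
    # Three deliberately bad rows: a non-numeric id, a missing amount,
    # and a negative amount, mixed in with two good rows.
    data = "id,amount\n1,19.99\nabc,5.00\n3,\n4,-2\n5,7.50\n"
    result = import_orders(data)
    assert [o["id"] for o in result.loaded] == [1, 5]
    assert [line for line, _ in result.rejected] == [3, 4, 5]


test_malformed_rows_are_quarantined()
```

The test passes only if the bad rows are set aside with a reason and the good rows on either side of them still load, which is the behavior a happy-path test never checks.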

The people who resist, and why

Testing is the technical failure point. People are the human one, and their resistance was more predictable than you might think. It tracked almost perfectly with one thing: whether the person believed the change threatened their job.

Someone who thought the new system put their position at risk resisted hard. They dragged their feet, found problems, declined to learn the new way, because every step toward the new system felt like a step toward their own replacement. Someone who saw that we were training them, investing in them, keeping them employed, welcomed the change and often became its strongest advocate. Same transformation, opposite reactions, and the difference was entirely about what the person thought it meant for them, not about the technology at all.

People don’t resist change. They resist losing their job. Show them you’re training them, not replacing them, and the resistance mostly disappears.

That taught me that managing the human side of transformation is mostly about addressing the fear honestly. People are not irrational about change. They are protecting themselves, which is completely rational, and if you can show them they are safe, that the change comes with training and a place for them in the new world, most of the resistance evaporates. This is exactly why the order of operations matters so much, and why you start with people before you touch the technology, as I lay out fully in “people, process, technology.”

The failures are preventable

None of these failure modes are mysterious. Weak load testing, no failure-mode testing, and unaddressed job fear are all preventable with effort that nobody wants to spend, because it is boring and it delays the exciting launch. That is the pattern across every failed transformation I have seen: the work that would have prevented the failure was known, available, and skipped because it was tedious and slowed things down.

When I ghostwrite a transformation book, the failures are often the most valuable material, because the executive learned more from what broke than from what worked. The honest account of why a system fell over, and what they would test differently next time, is worth more to a reader than any success story. You can see how I work with technology leaders on the technology ghostwriting page.

Frequently Asked Questions

Why do digital transformations fail?

Usually poor testing, not bad strategy. Teams test whether a system works normally and skip two critical things: load testing, which reveals how it behaves under real volume, and failure-mode testing, which reveals what happens when things break. Systems that pass every functional test still die when thousands of real users hit them at once.

What is load testing and why does it matter?

It is testing how a system behaves under realistic volume, not just a handful of test users. Database performance is not linear: a query that is instant for ten users can collapse under thousands hitting full production data. If you do not load test thoroughly with real volume, you discover the problem in production on day one, the worst possible time.

What is failure-mode testing?

Testing what happens when something goes wrong, not just when everything works. What breaks when a server drops, a connection times out, a disk fills, or data arrives malformed? These paths often go untested because the normal path feels like enough. It is not, because systems fail constantly in real operation and need to fail gracefully.

Why do people resist digital transformation?

Almost always because they think it threatens their job. Resistance tracks that fear precisely. Someone who believes the change puts their position at risk fights it; someone who sees they are being trained and kept on welcomes it and often champions it. Address the fear honestly, with training and a place in the new world, and most resistance disappears.

How do you prevent a transformation from failing?

Spend the boring effort nobody wants to: load test with real volume before going live, test the failure paths and not just the happy path, and address people’s job fear directly with training and reassurance. Moving one component at a time also contains failures, so a problem shows up small instead of taking down everything at once.

Do you ghostwrite books on digital transformation?

Yes. I led transformations for two decades and ghostwrote three books on the subject. The failures are often the most valuable material, because leaders learn more from what broke than from what worked. You can see how I work on the technology ghostwriting page.


📝 Disclaimer

The views and opinions expressed in this blog post are solely those of Richard Lowe and are based on personal experience and research. This content is for informational purposes only and should not be construed as professional legal, financial, accounting, or business advice. Always consult with qualified professionals before making important business or legal decisions. Richard Lowe is not a lawyer, accountant, or licensed professional advisor, and this content does not establish any professional relationship.
