There is a story from IBM’s earliest days that I have told from more stages than I can count. An employee made a mistake. The version I know puts the cost of that mistake at $600,000. Naturally, when it was discovered, the employee was fully prepared to be sacked. But longtime IBM chief Thomas Watson saw it differently. “I just spent $600,000 educating the man,” he said. “Why should another company benefit from that investment?”

Most leaders hear that story and nod. Some quote it on stage. Almost none of them actually run their companies that way. The reason is that the “fail fast, fail often” mantra has been so badly cheapened over the last decade that it now means almost nothing. We celebrate failure without distinguishing the kinds. We tell teams to take risks without giving them a usable filter for which risks are worth taking. We post motivational quotes about failure on Slack and then quietly punish the next person who actually fails in front of us.

The $600,000 question is not whether you fired the employee. It is whether you can tell the difference between a mistake worth $600,000 of education and one worth $600,000 of preventable embarrassment. Amy Edmondson, the Harvard Business School professor who pioneered the field of psychological safety, has spent decades answering that question. In her 2023 book Right Kind of Wrong: The Science of Failing Well, she gives leaders the framework most “fail forward” content is missing. I want to combine her framework with the one I developed at WD-40 Company, because together they give engineering and product leaders something usable on a Monday morning.

TLDR

  • “Fail fast” without a filter is just expensive theatre. Most leaders cannot tell the difference between a failure that builds the company and one that drains it.
  • Amy Edmondson’s research identifies three types of failure: basic, complex, and intelligent. Only one of them deserves celebration.
  • I run a parallel filter I call Learning Moments. A Learning Moment in an area of experiment is gold. A Learning Moment in an area of demonstrated competence is wasted tuition.
  • Together, the two frameworks give leaders a practical filter: was this a failure worth paying for, or was it a preventable repeat of something we already knew how to avoid?
  • The $600,000 story works only if you know which kind of $600,000 you are spending. Otherwise you are not building a learning culture. You are building an expensive one.

What Did Thomas Watson Actually Mean by the $600,000 Story?

Thomas Watson was not saying every mistake is sacred. He was saying that experience purchased through a high-stakes attempt is too valuable to walk out the door. The story is told most often as a parable about psychological safety, but it is really a story about return on tuition. Watson was protecting the investment, not the employee.

That distinction matters because most retellings of the story strip the nuance out. They turn it into a slogan, which is precisely how “fail fast” became the cliché it is today. The wording usually attributed to Watson is more careful than its modern descendants: “Why would I want somebody to hire his experience?” The employee had been educated by the mistake. The expensive part had already been paid. Firing him would have transferred that learning to a competitor.

But here is the part that is rarely quoted. Watson did not say he wanted more $600,000 mistakes. He said this particular mistake had already produced its tuition value. The next $600,000 mistake of the same kind would not be tuition. It would be neglect.

That is where Edmondson’s framework arrives to do the work the slogan cannot.

What Are the Three Types of Failure Engineering Leaders Need to Distinguish?

Edmondson’s Right Kind of Wrong breaks failure into three categories, and only one of them is the kind any healthy team should be producing more of. Basic failures happen in known territory where the knowledge already exists. Complex failures happen in familiar systems where multiple factors interact in unexpected ways. Intelligent failures happen in new territory, where the only way to learn what works is to try.

Basic failures are the ones engineering teams should be working to eliminate. They are the production outages caused by skipping the runbook. The missed deadline because someone forgot to renew the certificate. The customer escalation because no one read the change log. These are failures in areas of demonstrated competence. They cost time and trust. They do not produce learning that justifies their cost, because the knowledge to prevent them already existed somewhere in the organization. It just was not used.

Complex failures sit in the middle. The 2024 CrowdStrike outage, which took down millions of Windows machines globally, was a complex failure. Multiple factors aligned in a way no single person predicted. Those failures require systems thinking and process redesign rather than blame.

Intelligent failures are the kind Watson was protecting. They are the result of a thoughtful experiment in new territory where the answer could not be known without running the test. The thirty-nine failed attempts at the WD-40 formula before the fortieth one worked. The early product launches in markets where the customer behavior was genuinely unknown. The architectural bet on a new approach that failed in production but taught the team what would work in the next attempt. These failures pay tuition. The other two waste it.

“We used to think of failure as the opposite of success. Now, we’re often torn between two failure cultures: one that says to avoid failure at all costs, the other that says fail fast, fail often. Both approaches lack the crucial distinctions to help us separate good failure from bad.”

That is the trap. And it is the trap most engineering and product organizations are sitting in right now.
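
One way out of the trap is to write the distinctions down. For teams that tag incidents in a tracker, the three categories can be sketched as a small decision rule. This is only one reading of the definitions above, not Edmondson’s own procedure: the names `FailureType` and `classify_failure` and the two yes/no inputs are assumptions made for illustration, and real incidents need judgment that two booleans cannot capture.

```python
from enum import Enum


class FailureType(Enum):
    BASIC = "basic"              # known territory, the knowledge already existed
    COMPLEX = "complex"          # familiar system, factors interacted unexpectedly
    INTELLIGENT = "intelligent"  # new territory, the only way to learn was to try


def classify_failure(new_territory: bool, multiple_interacting_factors: bool) -> FailureType:
    """Map two judgment calls onto the three categories (hypothetical helper)."""
    # Assumption: territory is checked first, because the text defines
    # intelligent failure purely by happening in new territory.
    if new_territory:
        return FailureType.INTELLIGENT
    if multiple_interacting_factors:
        return FailureType.COMPLEX
    return FailureType.BASIC


# Example: an expired certificate nobody renewed, in a system the team knows well.
label = classify_failure(new_territory=False, multiple_interacting_factors=False)
print(label.value)  # prints "basic"
```

The point of the sketch is the order of the questions, not the code: ask about territory first, then about interacting causes, and only then decide how to respond.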

How Does the Learning Moment Framework Filter the Right Kind of Wrong?

At WD-40 Company, we replaced the word failure with the phrase Learning Moment. The reframe was not cosmetic. It was a forcing function. Every time something went wrong, we asked three questions. What happened? What did we learn? What will we do differently next time? Notice what is missing. Blame.

But the framework also includes a filter most “Learning Moment” descriptions leave out. I distinguish between two kinds of Learning Moments based on the territory they happen in.

The more acceptable kind is a Learning Moment in an area of experiment. We did not know what would happen. We had a thoughtful hypothesis. We ran the smallest test we could to learn something we could not have learned otherwise. The result was unexpected. We share what we learned, the whole tribe benefits, and we move on smarter. That is tuition well spent.

The less acceptable kind is a Learning Moment in an area of demonstrated competence. Someone repeated a mistake the organization had already solved. They did not consult the runbook. They did not ask the colleague who had handled this last quarter. They did not check the post-mortem from the same incident six months ago. In that case, the response is not blame, but the response is also not celebration. It is a quiet conversation about why our existing knowledge did not travel to the place it was needed.

That second category is where most engineering teams quietly hemorrhage tuition. The same incident in a different shape. The same security oversight in a different service. The same coordination failure in a different launch. None of it is new. All of it gets framed as a Learning Moment in the post-mortem because that is what the culture deck says. And so the organization pays for the same education over and over while telling itself it has a learning culture.

You do not. You have an expensive culture.

Why Does Psychological Safety Only Work When Paired with This Filter?

Psychological safety without a filter for failure types produces the worst of both worlds: teams that feel safe doing the same wrong things repeatedly, with no learning velocity to show for it. Edmondson’s own work on psychological safety makes this point clearly. Safety is the precondition for honest discussion of mistakes, not a substitute for the discussion itself.

When I built the Learning Moment culture at WD-40 Company, the first thing people noticed was that engagement went up. We hit 93 percent employee engagement in a global organization, with 98 percent of employees proud to say where they worked. Those numbers do not happen without psychological safety. But they also do not happen without standards. Gallup’s State of the Global Workplace report puts global employee engagement at 21 percent for 2024. The gap between 21 percent and 93 percent is not motivation. It is method.

The shortcut version of “fail fast” misses this. It tells teams to take risks without teaching them which risks are worth taking. It tells leaders to celebrate failure without teaching them which failures to celebrate. The result is a team that feels permission to be sloppy, paired with a leader who quietly resents paying for the same mistakes twice.

Edmondson’s framework solves this by giving leaders explicit language for the categories. Basic failures are caught and corrected. Complex failures trigger systems redesign. Intelligent failures are celebrated, documented, and shared. The categories are not soft. They are operationally useful. Harvard Business School research confirms that leader vulnerability is the strongest predictor of whether team members will surface their own mistakes in the first place.

“Make your own mistakes public, no matter how embarrassing they might be. Your people need to see you out on that proverbial limb alongside them as everyone takes the initial risky steps together.”

That is what Watson did with the $600,000 story. He went first. He named what happened. He took the cost. And he made it permissible for every IBM employee thereafter to risk something the company genuinely needed risked, without the fear of being made an example of.

How Should Engineering and Product Leaders Apply This in Practice?

Engineering and product leaders can apply the combined framework with three practical moves. Each one tightens the filter without crushing the appetite for genuine experimentation.

First, when something goes wrong, classify before you respond. Was this basic, complex, or intelligent? A basic failure gets a runbook fix and a knowledge transfer review. It does not get a blame ritual, and it does not get a celebration either. A complex failure gets a blameless post-mortem and a systems-level redesign. An intelligent failure gets shared across the organization as quickly and broadly as you can manage.

Second, ask the territory question explicitly. Before greenlighting an experiment, ask whether you are in known territory or new territory. If known, you are not running an experiment. You are running an execution task, and the cost of failure is on you, not on the tester. If new, define the smallest test that produces the most learning per dollar at risk. Edmondson is firm on this point. Intelligent failure has to be as small as it can be while still producing the insight.

Third, build the public-failure muscle in your own behavior first. The fastest way to kill a learning culture is to have the leader hide their own mistakes while telling the team to share theirs. The IBM employee who walked into Watson’s office expecting to be fired was the one taking the risk. The leader who walked out of that office having absorbed a $600,000 lesson was the one teaching the rest of the company what risk-taking actually cost, and what it actually produced. Both sides matter.
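
The first two moves can also be sketched in code for teams that like to see a policy written down. Everything named here is hypothetical: `RESPONSES`, `respond_to`, `ProposedTest`, and `greenlight` are illustrative names, the dollar figures are invented, and the sketch assumes every candidate test would actually produce the insight you are after, which is the part only a person can judge.

```python
from dataclasses import dataclass
from typing import List, Optional

# Responses paraphrased from the first move; the keys are illustrative labels.
RESPONSES = {
    "basic": "runbook fix and knowledge transfer review",
    "complex": "blameless post-mortem and systems-level redesign",
    "intelligent": "share across the organization quickly and broadly",
}


def respond_to(failure_type: str) -> str:
    """Look up the response for an already-classified failure."""
    return RESPONSES[failure_type]


@dataclass
class ProposedTest:
    name: str
    new_territory: bool     # the territory question, answered by the proposer
    dollars_at_risk: float  # rough estimate, not an accounting figure


def greenlight(candidates: List[ProposedTest]) -> Optional[ProposedTest]:
    """Return the cheapest test in new territory, or None if none qualifies."""
    experiments = [c for c in candidates if c.new_territory]
    if not experiments:
        return None  # known territory: this is an execution task, not an experiment
    return min(experiments, key=lambda c: c.dollars_at_risk)


options = [
    ProposedTest("full regional launch", new_territory=True, dollars_at_risk=250_000),
    ProposedTest("ten-customer pilot", new_territory=True, dollars_at_risk=8_000),
    ProposedTest("certificate renewal", new_territory=False, dollars_at_risk=0),
]
chosen = greenlight(options)
print(chosen.name if chosen else "execution task, not an experiment")  # prints "ten-customer pilot"
```

Note that the certificate renewal is never a candidate, however cheap it looks. Known-territory work is filtered out before cost is even considered, which is the territory question doing its job.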

Frequently Asked Questions

What is the $600,000 story actually about?

The story is about Thomas Watson, IBM’s longtime chief, who refused to fire an employee whose mistake had cost the company $600,000. Watson’s reasoning was that the company had just spent $600,000 educating that employee, and firing him would transfer the learning to a competitor. The story is most useful when read as a parable about return on tuition, not as a blanket endorsement of failure.

What are Amy Edmondson’s three types of failure?

Amy Edmondson, in Right Kind of Wrong, identifies basic failures, which occur in known territory with existing knowledge; complex failures, which involve multiple interacting causes in familiar settings; and intelligent failures, which are thoughtful experiments in genuinely new territory. Only intelligent failures deserve celebration. The other two require prevention and systems redesign.

How is a “Learning Moment” different from a generic failure?

A Learning Moment is the open and honest sharing of a positive or negative outcome so the whole organization benefits. The framework I built at WD-40 Company distinguishes two kinds: Learning Moments in areas of experiment, which produce real learning, and Learning Moments in areas of demonstrated competence, which usually indicate the organization failed to transfer existing knowledge to where it was needed.

Why is “fail fast” considered a hollow slogan?

Fail fast became hollow because it skipped the filter. It told teams to celebrate failure without teaching them which failures were worth producing. The result was a generation of organizations that paid for the same basic mistakes repeatedly while telling themselves they had a learning culture. Edmondson’s research and the Learning Moment framework both restore the missing distinction.

Can psychological safety exist without high standards?

No. Psychological safety is the precondition for honest discussion of mistakes, not a substitute for accountability. The leaders I admire describe this balance as a heart of gold and a backbone of steel. Without the heart, standards become cruelty. Without the backbone, safety becomes permission to underperform.

How do engineering teams typically misuse the IBM story?

Engineering teams misuse the story when they quote it to justify the next preventable incident. The Watson story works only when the mistake produced learning that could not have been purchased any other way. Repeating the same outage three quarters in a row is not a $600,000 education. It is a $600,000 receipt for ignoring the lesson the first two times.

Stop Celebrating Failure. Start Filtering It.

The leaders I admire most do not have lower failure rates than their peers. They have a sharper filter. They know which failures to invest in, which to prevent, and which to share across the organization so the whole tribe benefits. They treat mistakes the way Thomas Watson treated the $600,000 mistake: as a transaction with a cost and a return, where the leader’s job is to make sure the return actually arrives.

The next time someone on your team comes to you with a failure, the question is not whether to celebrate it. The question is which kind it was. If it was a thoughtful experiment in new territory, you owe them the IBM story and a public thank you. If it was a repeat of something the organization already knew how to prevent, you owe them a Learning Moment conversation about why our knowledge did not travel.

That is the $600,000 question. And the leaders who can answer it are the ones building cultures that actually compound.

If you want to go deeper on building a culture where Learning Moments produce real learning rather than expensive theatre, visit The Learning Moment to explore the frameworks, tools, and coaching that help leaders make this shift in their own organizations.
