Thursday, September 28, 2023

"Difference between night and day": Adventures in root-cause analysis

Last week we talked about a specific problem-solving discipline. Here's another case of problem-solving, involving an elegant example of root-cause analysis. 

A while ago I stumbled across a video that gives a perfect example of good, pragmatic root-cause analysis. The National Park Service found that the stone of the Jefferson Memorial in Washington, D.C., was deteriorating. The analysis of what was causing the problem turned out to be a classic example of 5-Why methodology. The video is less than two minutes long:


Note that this is a perfect example of how 5-Why is supposed to work: you keep asking "Why?" until you get to a point where you can find a pragmatic solution ... and where you can't go any farther without giving up on pragmatism altogether.

  • Trying to stop the birds without stopping the spiders would have been hopeless.
  • Trying to stop the spiders without stopping the midges would have been hopeless.
  • On the other hand, nobody asked "Why do midges come out at dusk?" That's just a fact, and there's nothing we can do about it. (I've said before that if you start asking "Why?" about fundamental facts of nature, you've gone too far. Real root causes must be actionable.)

In that whole chain of causality, there was one sweet spot, flanked by impossibilities on both sides. That's where the solution lay.  

Not all root-cause analyses are this elegant, but this is certainly the goal to aspire to.

           

Thursday, September 21, 2023

Problem-solving with the Toyota Kata

A couple of evenings ago, I attended a talk by Leigh Ann Schildmeier on the Toyota Kata process. Schildmeier is an engaging speaker; her material was interesting, and her presentation was disarmingly relaxed. All the same, she covered far more territory than I can address here. But I'd like to gesture towards the basics. If you want to know more, by all means contact her or check out one of her websites: Park Avenue Solutions or Starter Kata.

In a sense, Kata is a problem-solving methodology; like a number of others, it can be characterized as "the scientific method applied to business." What is distinctive, though, is that Kata (when properly applied) is a kind of discipline that permeates a company's decision-making. It's not just one more tool that sits on the shelf until you pull it down for a crisis. This distinction becomes clear when you look at the root meaning of the word: the term kata comes from martial arts, and it means "a detailed choreographed pattern ... made to be practised alone."* In the same way, the Toyota Kata is a pattern of thought and action that lets you pick your way through unknown territory; and ideally it should be so routine that you use it reflexively—on production problems, on management problems, or on finding your way in a new city when your GPS is broken. (Schildmeier says she has used Kata to improve her golf game.)

As an aside: Before I go too far, I should clarify that Schildmeier did not invent this system, though she teaches it, coaches it, and consults on it. The system was invented by Toyota (hence the name), and it was first publicized in the United States by Mike Rother in a 2009 book called Toyota Kata.

There are four basic steps to the Kata process, and in the abstract they sound very simple. 

  1. Understand the overall challenge you face.
  2. Grasp where you are right now.
  3. Set a target for your next step towards the challenge.
  4. Run experiments on how to get there.

To illustrate what these mean, Schildmeier asked us to think of a football** game.

  1. The overall challenge is to win the game.
  2. "Where you are right now" might be, let's say, the 50-yard line. But it also includes understanding the weather, the status of the game so far, your own current condition and that of the other team. There might be other relevant factors, as well.
  3. Your next "target condition" is where you want to get to next, in the service of the final goal. But it has to be concrete and achievable. In a football game, that might be to advance the ball ten yards. (In business, Schildmeier says your next "target condition" should be something you can achieve in no more than two weeks.)
  4. And then you run experiments on how to get there. Wait, what? Yes, says Schildmeier, that's exactly what a team does. They have already been running plays hour after hour in practice, for weeks before the game ever started. But they don't know until they actually get on the field which of those plays is going to work. They have to observe the concrete conditions on the field and the behavior of the players on the other team in order to get an idea which plays are likely to succeed and which ones are doomed to fail. And then, based on that empirical input, the quarterback calls the play and the team carries out what is—in effect—an experiment. Whether it works or fails, either result constitutes additional information which the team (especially the quarterback, of course) uses to decide which plays to call next time.  

Simple, right? 😀 Of course, conceptually it is simple. As for practice, ... well, anything can look simple if you've spent ten thousand hours perfecting it. But that's exactly why Kata is supposed to be a discipline and not just a tool. Besides, it's not like problems are so rare that they only show up once a year. Maybe the big ones are rare (though not rare enough, I might add!). But smaller problems show up all the time. If you can perfect an approach that handles problems of any size, you prepare for addressing the big ones by solving small ones routinely.

Naturally there is more. There is a whole body of practice and coaching to help your organization get from here to there—to help you get to the point where you can use Kata reflexively and routinely. Check out Rother's book, ... or give Schildmeier a call. She might be willing to talk to you about your golf game, as well.  

__________

* Wikipedia, "Kata," retrieved 2023-09-20.

** I mean American football in this context.     

              

Thursday, September 14, 2023

Oops, my bad—tools CAN'T always calibrate each other!

I goofed last week. I said you can have two tools calibrate each other, and I didn't put any restrictions around that. I was wrong.

You remember the whole topic was whether two tools can calibrate each other. The question I asked was this: "Suppose you calibrate some of your own tools in-house, instead of sending them out. And suppose that when you calibrate Tool-1, you use Tool-2. Normally there's nothing wrong with that, so long as Tool-2 itself is also correctly calibrated. But back when you calibrated Tool-2, you did it using Tool-1. Is that a problem?"

I argued that you can do this, within the parameters of quality system standards like ISO 9001 and ISO 17025. And right away I got helpful feedback from commenters on LinkedIn telling me, "Not so fast!"

Christopher Paris pointed out a technical issue I had forgotten. When I described the calibration history of Tool-1 and Tool-2, I traced them both back to the day you bought them from the manufacturer. I assumed you got a certification from the manufacturer at that point. But Chris observed "that the original calibration certificate from the manufacturer is rarely traceable to national/international standards. It's typically some basic certificate that doesn't really provide much information. So tracking back to that doesn't get you full compliance to ISO 9001. If the OEM's cert doesn't list traceable standards used to calibrate the device, then the device still has to be subject to a third-party lab or some other traceable calibration."

So yes, I accept that correction. Tool-1 and Tool-2 both have to be calibrated at the beginning in a way that is traceable to the correct international standards.

But what about the part where you then use the tools to support each other? What about the way that they leapfrog one another on and on into the future forever?

Scott Kruger and John Schultz each flagged this as an improper use case, and we had some helpful discussions about pragmatic topics to clarify why. But I wanted chapter-and-verse. If this is a bad practice, it should be forbidden by the relevant standards—either that, or there's a hole in the standards and someone is going to exploit it.

I finally found it, but I had to dig. ISO 17025, clause 6.5.1, states: "The laboratory shall establish and maintain metrological traceability of its measurement results by means of a documented unbroken chain of calibrations, each contributing to the measurement uncertainty, linking them to an appropriate reference." [Emphasis mine.]

Let's apply this to my thought experiment in last week's post. Here's what I wrote then.

Remember that when you calibrate any tool, that measurement typically has a validity period. Maybe it's valid for one year. So let's say you calibrate Tool-1 every January, and that calibration is good from January to December. Then you calibrate Tool-2 every July, and that calibration is good from July to June.

Last month was July 2023. Time to calibrate Tool-2.

So you pull out Tool-1. Is it a valid tool to use? Check the sticker. It was calibrated in January 2023, and is good through December 2023. So it must be good to use.

But wait. Let's check the paperwork to make sure. According to the paperwork, when we calibrated it in January 2023 we used Tool-2. Hold on! Isn't that the same tool we're trying to check right now?

No. It's not.

The tool we are trying to check "right now" (meaning last month, when I've set this story) is "Tool-2-as-of-July-2023." The tool we used last winter back when we were calibrating Tool-1 was "Tool-2-as-of-January-2023." If you look at it right those should count as different tools,....

Stop right there.

What I should have seen is that as soon as I treat "Tool-2-as-of-July-2023" and "Tool-2-as-of-January-2023" as different tools, I've got a problem. What does Tool-2's unbroken chain of calibrations look like?

July 2023: Tool-2-as-of-July-2023 was calibrated by Tool-1-as-of-January-2023.

January 2023: Tool-1-as-of-January-2023 was calibrated by Tool-2-as-of-July-2022.

July 2022: Tool-2-as-of-July-2022 was calibrated by Tool-1-as-of-January-2022.

January 2022: Tool-1-as-of-January-2022 was calibrated by Tool-2-as-of-July-2021.

And so on.
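To see why this chain never gets anywhere, here is a minimal Python sketch that walks the paperwork backwards. The function name `trace_chain` and its parameters are my own invention for illustration; the only thing it assumes is the schedule from the story (Tool-1 calibrated each January against Tool-2, Tool-2 calibrated each July against Tool-1).

```python
def trace_chain(tool, year, links):
    """Walk backwards through the leapfrog calibration records.

    Hypothetical helper: `tool` is 1 or 2, `year` is the year of the
    calibration we start from, `links` is how many records to follow.
    Assumes Tool-1 is always calibrated in January and Tool-2 in July.
    """
    month_of = {1: "January", 2: "July"}
    chain = []
    for _ in range(links):
        if tool == 2:
            # A July calibration of Tool-2 used Tool-1 from that same January.
            ref_tool, ref_year = 1, year
        else:
            # A January calibration of Tool-1 used Tool-2 from the previous July.
            ref_tool, ref_year = 2, year - 1
        chain.append(
            f"Tool-{tool}-as-of-{month_of[tool]}-{year} was calibrated by "
            f"Tool-{ref_tool}-as-of-{month_of[ref_tool]}-{ref_year}"
        )
        tool, year = ref_tool, ref_year
    return chain


for record in trace_chain(tool=2, year=2023, links=4):
    print(record)
```

However many links you ask for, every record points at the other in-house tool. Nothing in the loop ever reaches an outside reference, which is exactly the problem.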

Every single one of those measurements introduced an uncertainty. Maybe it was small, but it was there.

Just to make things simple, let's pretend the additional uncertainty is the same each time. (In real life, it might not be.) Call that uncertainty ε. [That's a Greek epsilon.] Then every year that you play this leapfrog game, the uncertainty for each tool increases by 2ε (adding one in January and one in July). If you've been using Tool-1 and Tool-2 to calibrate each other for ten years, you have added 20ε to the uncertainty of each one. 
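For anyone who wants to play with the numbers, here is a rough sketch of that arithmetic in Python. The function name and parameters are hypothetical. The "linear" mode is the simplification I used above, where each calibration just adds another ε. The "rss" mode is an alternative I am assuming for comparison: if the contributions are independent, common practice is to combine them as a root sum of squares, which grows more slowly but still grows without limit.

```python
import math


def accumulated_uncertainty(years, eps, calibrations_per_year=2, combine="linear"):
    """Uncertainty added to a tool's chain after `years` of leapfrogging.

    Hypothetical helper. Every calibration in the chain is assumed to
    contribute the same uncertainty `eps` (in real life it might not).
    """
    links = years * calibrations_per_year
    if combine == "linear":
        # The simplification in the text: each link adds one more eps.
        return links * eps
    if combine == "rss":
        # Assumption: independent contributions combined in quadrature.
        return math.sqrt(links) * eps
    raise ValueError("combine must be 'linear' or 'rss'")


# Ten years of leapfrogging, measured in units of eps:
print(accumulated_uncertainty(10, eps=1.0))                 # 20.0
print(accumulated_uncertainty(10, eps=1.0, combine="rss"))  # about 4.47
```

Either way the total keeps climbing every year, because nothing in the chain ever resets it against an external reference.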

When does that accumulated uncertainty become too much? At what point does it make the tool worthless?

It all depends on what you normally use your tools for. How small an uncertainty do you need? Maybe if the tools start off a lot better than you actually need, you can get away with it for a while. But you must need some level of precision and accuracy, or you wouldn't bother calibrating your tools at all. And you probably didn't spend the extra money to get tools that were 100x more precise and accurate than you really needed. So you can't really play this game too long. Certainly not forever.
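As a back-of-the-envelope illustration, the sketch below estimates how many years of leapfrogging it takes before a tool is no longer good enough. Everything here is hypothetical: the function name, the made-up numbers, and the simple model in which each year adds 2ε on top of the tool's starting uncertainty.

```python
def years_until_too_uncertain(initial, eps, tolerance, max_years=1000):
    """First year in which the accumulated uncertainty exceeds `tolerance`.

    Hypothetical helper using the simplified model from the text:
    uncertainty after n years = initial + 2 * eps * n.
    Returns None if the tolerance is not exceeded within `max_years`.
    """
    for years in range(max_years + 1):
        if initial + 2 * eps * years > tolerance:
            return years
    return None


# Made-up numbers, all in the same (arbitrary) unit:
# the tool starts at 1.0, each calibration adds 0.5, and the job needs 10.0 or better.
print(years_until_too_uncertain(initial=1.0, eps=0.5, tolerance=10.0))  # 10
```

With these invented figures the tool is out of tolerance after ten years. Change the inputs and the answer moves, but for any ε greater than zero there is always some year in which the game ends.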

As I say, I goofed. I was wrong, and I'm grateful for the corrections. Thank you, all. 

           

Thursday, September 7, 2023

Tools that calibrate each other

Here's a question that puzzled me the first time I saw it: Can you have two tools, each of which is used to calibrate the other? 

More precisely: Suppose you calibrate some of your own tools in-house, instead of sending them out. And suppose that when you calibrate Tool-1, you use Tool-2. Normally there's nothing wrong with that, so long as Tool-2 itself is also correctly calibrated. But back when you calibrated Tool-2, you did it using Tool-1. Is that a problem?

The first time I saw this, it bothered me. It looked like, "I'll tell them you're an expert; and then if they want to know what gives me the standing to say so, you tell them I'm an expert. Since they already know that you're an expert (because I just said so) they should trust your judgement. Right?"



And in a static world, I might have had an argument. But I forgot about time.

Remember that when you calibrate any tool, that measurement typically has a validity period. Maybe it's valid for one year. So let's say you calibrate Tool-1 every January, and that calibration is good from January to December. Then you calibrate Tool-2 every July, and that calibration is good from July to June.

Last month was July 2023. Time to calibrate Tool-2.

So you pull out Tool-1. Is it a valid tool to use? Check the sticker. It was calibrated in January 2023, and is good through December 2023. So it must be good to use.

But wait. Let's check the paperwork to make sure. According to the paperwork, when we calibrated it in January 2023 we used Tool-2. Hold on! Isn't that the same tool we're trying to check right now?

No. It's not.

The tool we are trying to check "right now" (meaning last month, when I've set this story) is "Tool-2-as-of-July-2023." The tool we used last winter back when we were calibrating Tool-1 was "Tool-2-as-of-January-2023." If you look at it right those should count as different tools, because tools drift over time. We all know that tools drift. That's why calibration has to be repeated. That's why there is a validity period in the first place!

Sure enough, if you keep following the paperwork back in time, you'll find the two tools leapfrogging each other. Each time you use this one to calibrate that one, the one you are using turns out to be a legitimately calibrated tool because it's only six months into its 12-month validity period. And when you trace them far enough back in time you find the day that each tool was purchased from the manufacturer, at which point it came with some kind of certification and guarantee.

It took me a while to figure it out, but I'm sure this is the answer.

          

Five laws of administration

It's the last week of the year, so let's end on a light note. Here are five general principles that I've picked up from working ...