Thursday, December 19, 2024

"What gets measured gets managed"—like it or not!

For the past couple of weeks we've been talking about metrics, and it is clear that they are central to most modern Quality systems. ISO 9000:2015 identifies "Evidence-based decision making" as a fundamental Quality management principle, stating (in clause 2.3.6.1): "Decisions based on the analysis and evaluation of data and information are more likely to produce desired results." ISO 9001:2015 (in clause 6.2.1) requires organizations to establish measurable quality objectives—that is, metrics—in order to monitor how well they are doing. We've all heard the slogan, "What gets measured, gets managed."

If you think about it, the centrality of quantitative metrics relies on a number of fundamental assumptions:

  • We assume that quantitative metrics are objective—in the sense that they are unbiased. This lack of bias makes them better than mere opinions.
  • We also assume that quantitative metrics are real, external, independent features of the thing we want to understand. This external independence makes them reliable as a basis for decisions. 
  • And finally, we assume that quantitative metrics are meaningful: if the numbers are trending up (or down), that tells us something about what action we need to take next.

But each of these assumptions is weak.

  • Metrics are not necessarily unbiased. In fact, as we discussed last week, there is a sense in which every quantitative metric conceals some hidden bias. Since this is true for all metrics, the answer is not to replace your old metric with a better one. What is important is to understand the hidden bias, to correct for it when you interpret your results. 
  • Metrics are not necessarily external or independent of the thing being measured. Think about measuring people. If they come to understand that you are using a metric as a target—maybe they get a bonus if the operational KPIs are all green next quarter—people will use their creativity to make certain that the KPIs are all green regardless of the real state of things. (See also this post here.)
  • And metrics can only be meaningful in a defined context. Without the context, they are just free-floating numbers, no more helpful than a will o' the wisp. 

We discussed the first risk last week. I'll discuss the second risk in this post. And I'll discuss the third one next week. 

Unhelpful optimization

I quoted above the slogan, "What gets measured, gets managed." But just a week ago, Nuno Reis of the University of Uncertainty pointed out in a LinkedIn post that this slogan is misleading, and that it was originally coined as a warning rather than an exhortation. Specifically, Reis writes:

It started with V. F. Ridgway’s 1956 quote: "What gets measured gets managed."

Yet, Ridgway was WARNING how metrics distort and damage organizations.

The FULL quote is:

"What gets measured gets managed—even when it's pointless to measure and manage it, and even if it harms the purpose of the organization to do so."*

The original source was a 1956 article by V. F. Ridgway called "Dysfunctional consequences of performance measurements."** Ridgway's point is that a metric provides just a single view onto the thing you want to understand, but some people will always treat it uncritically, as the whole truth. This misunderstanding creates an opportunity for other people to exploit the metric by acting so that the numbers get better, even if the overall organization suffers for it. Examples include the following:***

"1. A case where public employment interviewers were evaluated based on the number of interviews. This caused the interviewers to conduct fast interviews, but very few job applicants were placed.

"2. A situation where investigators in a law enforcement agency were given a quota of eight cases per month. At the end of the month investigators picked easy fast cases to meet their quota. Some more urgent, but more difficult cases were delayed or ignored.

"3. A manufacturing example similar to the above situation where a production quota caused managers to work on all the easy orders towards the end of the month, ignoring the sequence in which the orders were received.

"4. Another case involved emphasis on setting monthly production records. This caused production managers to neglect repairs and maintenance.

"5. Standard costing is mentioned as a frequent source of problems where managers are motivated to spend a considerable amount of time and energy debating about how indirect cost should be allocated and attempting to explain the differences between the actual and standard costs."

You see the general point. In each case, a metric is defined in the hopes that it will drive organizational behavior in a good direction. But the people working inside the organization naturally want to score as well as possible, preferably without too much effort. So they use their creativity to find ways to boost the numbers.

Also, in case this discussion sounds familiar, we have seen these themes before. The first time was in 2021, in this post here, where I argue that "There is no metric in the world that cannot be gamed." The exact same point shows up in this post here from 2023, about systems thinking—where the fundamental insight is that if you design your operations and metrics in a lazy way, without thinking through what you are doing, you will incentivize your people to deliver bad service.

Pro tip: Don't do that. 

Goodhart's law

Let me wrap up by referencing the webcomic xkcd. This one is about Goodhart's Law, that "When a measure becomes a target, it ceases to be a good measure." Of course the reasons behind Goodhart's Law are everything I've already said in this post. Here's what xkcd does with it:****


Meanwhile, I hope everyone has a great holiday season! I'll be back in a week to talk about the third assumption we make regarding metrics.

__________

* It seems that this formulation is from a summary of Ridgway's work by the journalist Simon Caulkin. See this article for references.   

** Ridgway, V. F. 1956. Dysfunctional consequences of performance measurements. Administrative Science Quarterly 1(2): 240-247. See reprint available here, or summary available here. 

*** These five examples are quoted from this summary here, by James R. Martin, Ph.D., CMA.   

**** The xkcd website makes the following statement about permissions for re-use: "This work is licensed under a Creative Commons Attribution-NonCommercial 2.5 License. This means you're free to copy and share these comics (but not to sell them). More details."


Thursday, December 12, 2024

The hidden bias inside metrics

Last week we talked about metrics, and about how—if you find you need to measure something where no metric has ever been established before—you can just make one up. Of course this is true, but you still have to be careful. Make sure you understand what you want the metric to tell you. The reason is that sometimes you can measure the same thing in two different ways, and each way conveys a hidden message or bias.

For example, suppose you are comparing different ways to travel from one place to another: walking, skateboarding, bicycling, driving, flying. And suppose you want to know which is the safest. How do you measure that?

It all depends on which one you want to win. If you work for the airline industry, then you probably want to convince people that commercial air travel is the safest form of travel. That way, more people will choose to fly, and your business will grow. So in that case, you measure safety in terms of "Number of fatal accidents per mile traveled."

It's a simple fact that commercial air travel has very few fatal accidents, so the numerator of that fraction will be very small. At the same time, flying is most practical when you want to cover long distances, so on the whole the denominator is very large. That means that the overall fraction will be very small indeed, and—sure enough!—the airline industry regularly advertises that flying is the safest way to travel.

But you could equally well approach the question from another direction. Suppose you ask: If something goes wrong, how much danger am I in? Using this metric, flying no longer leads the pack. If something goes wrong while you are walking—even if you are walking long distances—you likely need no more than a day's rest and a better pair of shoes. But if the airplane that you are on develops catastrophic engine failure at 35,000 feet, the odds are strongly against anyone walking away from the experience.

This is what I mean by the "hidden bias" in a metric. Because metrics are (by definition) objective and (generally) quantitative, we tend to assume that they are unbiased. But when you try to measure "Which form of travel is the safest?" flying comes out as either the best or the worst, depending on which metric you choose.

Nor can you ask, "Well, which one is the right metric to settle the question?" There is no "right" metric. Both of these metrics answer part of the question about safety. The real problem is that the question about the "safest form of travel" is badly posed. What are you really asking? Do you want to know about the frequency or likelihood of serious problems? In that case, flying is the safest. Do you want to know about the lethality of serious problems? In that case, flying is the most dangerous. Before you choose a metric, you have to understand very exactly what you want it to tell you. In the same way, before you blindly accept a metric quoted by somebody else, think hard about what that metric is really measuring, and about why they chose to use it and not a different one.
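The two readings of "safest" can be put side by side in a few lines of code. This is a sketch only: the numbers below are invented to make the point, not real accident statistics, and the two metric functions are my own naming, not anyone's official definitions.

```python
# Invented, illustrative figures only -- NOT real safety data.
# mode: (fatal_accidents, miles_traveled, survival_rate_given_an_accident)
modes = {
    "walking": (40, 1_000_000, 0.999),
    "driving": (300, 50_000_000, 0.990),
    "flying": (2, 500_000_000, 0.050),
}

def fatalities_per_mile(stats):
    """Metric 1: how often something goes fatally wrong, per mile."""
    accidents, miles, _ = stats
    return accidents / miles

def severity_if_it_goes_wrong(stats):
    """Metric 2: if something does go wrong, how bad is it?"""
    _, _, survival = stats
    return 1 - survival  # chance of not walking away

safest_by_rate = min(modes, key=lambda m: fatalities_per_mile(modes[m]))
safest_by_severity = min(modes, key=lambda m: severity_if_it_goes_wrong(modes[m]))

print(safest_by_rate)      # flying wins on accidents per mile
print(safest_by_severity)  # walking wins on severity
```

Same data, two defensible metrics, two opposite answers to "which is safest?"—which is the whole point.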

Years ago, I saw a consumer advocate on television exploding a metric in the most delightful way. Some brand of potato chips had come out with a new line that advertised "Less Salt and Less Oil!" But a close analysis of the production process showed that a bag of the new chips actually contained—overall—more salt and more oil than a bag of the regular line. How could they get away with advertising "Less Salt and Less Oil"? When he challenged them, they explained that they had made the potato chips smaller! Therefore—so they said—if you sit down with a plan to eat exactly ten potato chips (or some other definite number), you end up consuming less salt and less oil than if you had eaten ten of the regular chips. And of course the consumer advocate riposted with the obvious point: nobody ever sits down to eat a specific number of potato chips. In fact, he said, the only time he had ever seen anyone count out a specific number of potato chips was when two eight-year-old boys were dividing a bag between them. Otherwise, that's not what people do. So the metric was true as far as it went, but it was misleading.
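The arithmetic behind the chip makers' claim is worth making explicit. The numbers below are hypothetical (the real figures from the broadcast are unknown); they simply show that "less salt per fixed count of chips" and "more salt per bag" can both be true at once.

```python
# Hypothetical figures, invented for illustration only.
regular = {"salt_per_chip_mg": 10.0, "chips_per_bag": 50}
new_line = {"salt_per_chip_mg": 8.0, "chips_per_bag": 80}  # smaller chips, more of them

def salt_in_ten_chips(product):
    """The advertised metric: salt in a fixed count of chips."""
    return 10 * product["salt_per_chip_mg"]

def salt_in_whole_bag(product):
    """The metric most eaters actually experience: salt in the bag you finish."""
    return product["chips_per_bag"] * product["salt_per_chip_mg"]

print(salt_in_ten_chips(new_line) < salt_in_ten_chips(regular))   # True: "Less Salt!"
print(salt_in_whole_bag(new_line) > salt_in_whole_bag(regular))   # True: more salt overall
```

Both statements are arithmetically correct; only one matches how people actually eat chips.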

The same thing is true of any other metric. Be it never so objective, it will silently push the conversation in one direction rather than another. When you choose a metric—or when you make one up, if you have to do that—make sure that it is pointing in a direction you want to go. 


Thursday, December 5, 2024

"How hot is that pepper?": Adventures in measurement

We all know that measurement is important. But what if you want to measure something that has no defined metric?

The answer may be that you have to make something up. Look at the feature, or process, or event that you have in mind; determine its salient characteristics; and then decide how those can be best isolated and communicated. Often, the clearest communication is quantitative, in terms of numbers. In a few cases, you might find it simpler to communicate in binary terms (on/off), or qualitatively. But in all events, make sure that your distinctions are objective and repeatable.

The basic elements you have to define are:

  • system of measurement
  • unit of measure
  • sensor

And that's it! Once you know how you are going to check the thing (sensor) and what you are going to count (unit of measure), you can measure whatever you need.
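The three elements above can be sketched as a tiny data structure. Everything here is invented for illustration—the field names and the made-up "desk clutter" metric are my own, not an established convention—but it shows how little you need to define before you can start measuring.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Metric:
    """A minimal sketch of the three elements of a home-made metric."""
    system: str                       # system of measurement (the convention you define)
    unit: str                         # unit of measure (what you count)
    sensor: Callable[[dict], float]   # how you check the thing

# Example: an ad-hoc metric for how cluttered a desk is (entirely made up).
clutter = Metric(
    system="desk-clutter scale (invented)",
    unit="loose items per square meter",
    sensor=lambda desk: desk["loose_items"] / desk["area_m2"],
)

my_desk = {"loose_items": 12, "area_m2": 1.5}
print(clutter.sensor(my_desk), clutter.unit)  # 8.0 loose items per square meter
```

Once the system, unit, and sensor are pinned down, anyone can repeat the measurement and get the same number—which is exactly the "objective and repeatable" requirement above.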

This video explains the process by walking through the steps to establish the Scoville scale, which measures the hotness of chili peppers. It's quick and fun and less than two minutes long.
