Pragmatic Quality Blog: August 2023

Thursday, August 31, 2023

Accuracy and precision

Calibration is one of the basic methods in any Quality Management System, but for years my understanding of it was not deep. Of course I knew it was important. If you are making a product that requires precision measurements, but your measuring tools aren't calibrated, you have no guarantee that your measurements are right. Concretely, if you need a part to be 0.250" ± 0.001" but your tool is off and the part is actually (let's say) 0.234" instead, it's not going to fit. So yes, it matters.

This is why clause 7.1.5 of ISO 9001:2015 requires that you figure out whether you need calibrated equipment; and then, if you do, that you calibrate it. And in all the years that I did internal audits, I made sure to turn over any measuring equipment that was actually in use to check the calibration stickers.

But then I got a chance to work for a calibration laboratory, and I began to appreciate the huge amount of mathematical theory that underlies the whole job of calibration. I didn't work there long enough to be able to regenerate all the calculations on my own from first principles. But I did learn some of the more important concepts.

One of these is the distinction between accuracy and precision. Those both sound like good things, and of course they are. But they are different things, and the exact nature of the difference matters.

Whenever you measure something, you get a reading of some kind. But you can never be sure that the reading is exactly right. In order to make sure that the reading is as good as possible, you want to ensure two different things:

On the one hand, you don't want your measuring tool to bounce around. You want it to give you the same answer every time you measure the same thing. I've got a scale in my kitchen that isn't very good at this. If I set a weight on it all at once (like a bag of onions) the needle bounces up to a certain reading. If I add the same weight gradually (for example, by pouring in rice until I get the right amount) the needle tends to stick on the way up; then I'll add a little more and it jumps up to a higher number. This scale is good enough to make dinner with, but I would never dream of using it in a production facility to build product.

In calibration terminology, my kitchen scale is not precise: I can put the same weight on it and get two different readings, depending on whether I add the weight all at once or gradually. A precise scale would give the same reading for the same weight no matter what.

But precision is only half the battle. I remember a professor of mine once told us about visiting the NIST lab that stored the nation's first official atomic clock, which the tour guide called "the most perfect clock in the world." And my prof saw that someone had set the hands to the wrong time. As an atomic clock, it was more precise than any other clock in the country just then. But on that afternoon it wasn't accurate, because the hands were set wrong. It was telling the wrong time, but it told that wrong time with unequaled precision.

In other words, precision means that you get the same measurement each time. Accuracy means that the measurement you get is correct. You need them both.

You can have accuracy without precision. That's like my kitchen scale: there's a lot of fluctuation in the readings, but when nothing is on the scale it reliably shows zero. It's not set wrong. There's just room for error when you weigh something on it.

And you can have precision without accuracy. That's like the atomic clock on the day my prof joined the tour.
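If you like to see the distinction in numbers, here is a minimal Python sketch. It is only an illustration under my own assumptions: the readings are invented, and I am treating "accuracy" as the gap between the average reading and the true value, and "precision" as the scatter (sample standard deviation) of repeated readings. A real calibration lab does a great deal more than this.

```python
from statistics import mean, stdev

def bias(readings, true_value):
    """Accuracy check: how far the average reading sits from the true value."""
    return mean(readings) - true_value

def spread(readings):
    """Precision check: how much repeated readings of the same item scatter."""
    return stdev(readings)

# Hypothetical readings (grams) of the same 500 g reference weight.
TRUE_WEIGHT = 500.0
jumpy_scale = [492.0, 507.0, 498.0, 503.0, 500.0]   # centered on 500, but scattered
offset_scale = [512.1, 511.9, 512.0, 512.0, 512.0]  # tightly grouped, but set wrong

for name, readings in [("jumpy", jumpy_scale), ("offset", offset_scale)]:
    print(f"{name}: bias = {bias(readings, TRUE_WEIGHT):+.2f} g, "
          f"spread = {spread(readings):.2f} g")
```

The "jumpy" scale behaves like my kitchen scale (small bias, large spread), and the "offset" scale behaves like the atomic clock with its hands set wrong (large bias, tiny spread).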

A quick search on the Internet turns up dozens of pictures that all show the difference in basically the same way. Here's one, for example.

Or, if you prefer, Randall Munroe explained the difference this way in his webcomic xkcd.

Thursday, August 24, 2023

The process approach in daily life, or, "Use this one weird trick!"

We all know that the "process approach" is central to the modern Quality business. But is it good for anything else? Does it help you address situations that come up outside the office?

It's helped me. Let me tell you about it.

First, I have a question. How many of you have ever been frustrated by officious bureaucrats? I'm talking about when you are asking some agent in a large organization for something that is perfectly reasonable, and he or she just keeps repeating back, "I'm sorry, I can't do that. I'm sorry, that's not our policy." How many of you have experienced exactly this situation, where you are brought to a dead stop by someone who refuses to do the reasonable thing to help you out? Can I see a show of hands?

Thanks. That's what I thought.

And I've found a way to avoid being stopped, to move forward. All I have to do is to ask my question using different words. Instead of asking, "What's wrong with you? Why can't you just do X, which is the obvious way to handle my situation?" I rephrase the exact same request and ask:

"What's your procedure for doing X?"

The key was when I realized that all the people in this position—all the unhelpful agents, all the officious bureaucrats—are saying "No" because they are following a procedure. They are, in fact, probably following a script that they are required to stick to. So instead of yelling at them, or insulting their organizations, I go on like this:

"I know that your organization must have a procedure for this kind of situation, because obviously it's going to happen from time to time. I know I can't possibly be the only person who has ever needed this. So you have a procedure but I just don't know what it is, and I don't know whom to ask. I'm sure it's some other department, and I'm sorry that I'm taking your time when it's not your area, but I don't know where to go instead. Can you please connect me with the person who normally handles my kind of situation? Thank you very much."

This approach does two things. In the first place, since I'm not attacking, the agent doesn't feel defensive and is more likely to help. In the second place, I have basically promised to go away and become someone else's problem. This is a powerful motivator. Before you know it, the clerk is telling me, "You need to talk to Mrs. Ipswich about that. Here, I'll connect you."

And I'm on my way.

Thursday, August 17, 2023

Business continuity and cascading risks

Over the last two weeks, I've been talking about how to set up your business continuity planning. But there's an important step I haven't discussed yet.

Two weeks ago, I described a basic risk-handling protocol. Then last week, I described how you turn regular risk-handling into business continuity planning. But of course it’s complicated. Some risks can wipe you out; others are nuisance level; many are somewhere in between. Some risks require specialized expertise to address them, but their consequences are grave enough that you can’t just delegate them to the specialists and then forget about them. How do you focus?

The answer is that you have to distribute risk-handling throughout your organization so that risks are addressed by the right people, but in a way that always traces back to top management (who have, after all, final responsibility for the organization as a whole).

Last week, I said that each member of the Executive Team has to go back to his (or her) people to work out the details of that area’s approach. That step is the key to setting up a system of cascading risk management.

After all, even though business continuity affects everyone, there will always be some actions or some risks that are specific to a particular department or to a particular kind of disaster.

In case of fire or flood, your operations functions have to think about shutting down their work in a controlled way and getting to safety. Your warehouse function (if you have one) has to think about how to protect your inventory. But your office functions may be able to resume work remotely (once they have all gotten to safety), provided they still have access to your network.
In case of a pandemic, there should be no significant risk of physical destruction, but you may have more concerns around isolation or availability of personnel.

So yes, you have to start at the top. And the risks you track at the top level are the ones that can wipe you out. But then the members of your Executive Team go back to (let’s say) the middle managers who work for them, to do two things:

Figure out in detail how to implement the overall strategic approach. (All these steps should be traceable back up through the elements of the high-level strategy.)
Do an independent risk analysis at that level to see if there is anything special to their areas that was missed at the higher level. Use the same method I described two weeks ago—the very same method that the Executive Team already used.

And then the middle managers do the exact same thing again, engaging with their employees at the working level, to achieve the exact same two goals.

Naturally if (during one of these lower-level reviews) anyone discovers a risk that affects a wider group (or even the whole organization) but was accidentally missed, escalate it on up the management chain to where it belongs and then ask everyone to update their work to account for it.

In the end, every unit in your organization—every division, every department, every plant, every team—ends up doing some level of business continuity analysis, and tracking the measures that apply at their level. And every year, the whole organization repeats the analysis: to identify what’s changed and to check if all the defined measures are still correct and current.

Thursday, August 10, 2023

Business continuity starts at the top

Last week I started talking about business continuity and how to plan for it. As you remember, business continuity is the ability of a business to get back to work after something has disrupted it: hurricane, fire, flood, pandemic, or whatever. Business continuity planning is all the planning you do to prepare for disasters before they happen, so you can get back to work smoothly afterwards.

I said last week that business continuity planning is a part of risk management in general. Concretely, business continuity planning means identifying all the risks that could interrupt your business, or some part of it, and then taking action to mitigate those risks or planning contingency actions in case they take place. The basic approach is exactly what I described last week, but a few aspects are unique.

First, you have to start with your Executive Team. This is not a job you can delegate to the Safety Committee. The reason is that every unit inside the organization—every division, every department, every plant, every team—has to be engaged. They have to contribute to defining how to secure their work (because they know it better than anyone else); and they have to know what to do in case disaster strikes. So you start at the top.

Remember that you are looking for anything that can interrupt any aspect of your operations. This means you have to look not only at your direct operations, but at any support functions: billing, payroll, purchasing, and the rest. It also means you have to think about anything that can interrupt your customers or your supply chain. If you were unscathed by a disaster, but your main customers are out of commission and won’t be ordering for another year, you could have a problem. Likewise with your supply chain.

When you identify a risk (for example, "earthquake"), you cannot assign it to just one Owner. This is one of the big differences between business continuity planning and other kinds of risk management. If disaster strikes—hurricane, fire, flood, earthquake, or whatever it is—it’s going to strike everybody. So you can’t assign the whole problem to Fred or Max and ask him to figure out a solution for the entire company. Instead of that, each member of the Executive Team goes back to his people (or hers, of course) to determine how they have to secure their parts of the business. Depending on the size of your organization, some of them might have to go back to their people as well, to work out the details. Do whatever you have to do, but come back to the rest of the Executive Team with a plan for your area.

Then the Executive Team as a whole reviews the plans to make sure they are consistent. You can’t respond to a disaster by pulling in different directions: so while every team has to figure out what they specifically need, the plans still have to mesh together. As just a single example, if you are going to tell the office folks to work from home they should all be using the same communication platform to keep connected. If your team is the odd one out, you might have to change your plan a little to align with the rest of the company. Make sure you engage all the people you engaged before, so that everyone understands what’s changed.

Finally, when all the details have been worked out, document your plans in the simplest format possible, and store them somewhere that’s easy to find in an emergency. Remind people periodically where to look. (If you do regular fire or emergency drills, one of your steps has to be pulling a copy of the emergency plan.) And mark a day on the Executive Team’s calendar—six months out, or maybe twelve—to do the exercise again.

Thursday, August 3, 2023

Business continuity and risk planning

We've been talking about risk for a month now, and I'd like to turn to a different kind of risk management: business continuity. Business continuity is the ability of a business to get back to work after something has interrupted it: hurricane, fire, flood, pandemic, or whatever. In fact that's almost exactly the formal definition. ISO 22301:2019, Business continuity management systems — Requirements, defines business continuity as the "capability of an organization to continue the delivery of products and services within acceptable time frames at predefined capacity during a disruption."

Business continuity planning, therefore, is all the planning you do to prepare for disasters before they happen, so you can get back to work smoothly afterwards. The first time you hear about it, you might roll your eyes: One more overhead task we've got to do before we can get back to work! But it's like any other kind of insurance. You never need it until you do.

As an example, think back to March 2020. The world was beginning to react to COVID-19, and there was a lot of excitement as organizations frantically tried to improvise what to do. But I worked for Bosch at that point, and some years earlier Bosch had required every plant and every office to define (and regularly review) a Business Continuity Plan for how we would respond to various kinds of disruption. One of the entries on the form was "global pandemic." I remember the meetings where we reviewed this plan, and back in the 2010s nobody rated that as a likely risk. But we worked out a plan for it, just to be complete. Then when March 2020 came and other companies were caught flat-footed, our General Manager pulled our plan off the shelf and it already spelled out exactly what to do.

How do you plan for business continuity?

I want to proceed in three steps. In this post I'll review the basics of risk handling, keeping in mind some of the salient points we've discussed in recent weeks.* In my next post I'll talk about why your business continuity planning has to be driven from the very top. And then in a third post I'll describe how to embed it into the organization as a living practice instead of a show for the auditors.

When I describe basic risk handling, I usually start with a concrete example that everyone can imagine. Think of the Safety Committee in a grocery store. They think of all the ways somebody could get hurt, and then define measures to keep it from happening. If someone breaks a jar of spaghetti sauce in Aisle 3, put up a “Wet Floor” marker and mop it up. Don’t put heavy things on high shelves. And so on.

Sometimes they identify a risk that’s not very likely: What if a customer brings his dog and the dog bites somebody? Yes, you want to know what risks you face; but you can’t prevent everything. So you rank your list in order of importance. Plan for the ones that really matter, and let the rest go. In general your ranking should consider at least two things:

How likely is the risk?
And how bad will the impact be if it happens?

In the simplest case (there are ways to make this a lot more complex!) you ask each question about each risk and answer with Low, Medium, or High. Then you use these two scores to assign a priority to each risk as follows:

Priority = Likelihood x Impact

Likelihood \ Impact	High	Medium	Low
High	High	High	Medium
Medium	High	Medium	Low
Low	Medium	Low	Low

On this scale, for example, “getting bitten by a customer’s dog” probably ranks Low for likelihood but High for impact, giving a composite priority of Medium.
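For anyone who keeps a risk list in a script or a spreadsheet export, the table above can be sketched as a simple lookup. This is a minimal Python illustration, not a prescribed tool; the names `LEVELS`, `PRIORITY`, and `priority` are my own inventions for the example.

```python
LEVELS = ("Low", "Medium", "High")

# The priority table from the text, keyed by (likelihood, impact).
PRIORITY = {
    ("High", "High"): "High",
    ("High", "Medium"): "High",
    ("High", "Low"): "Medium",
    ("Medium", "High"): "High",
    ("Medium", "Medium"): "Medium",
    ("Medium", "Low"): "Low",
    ("Low", "High"): "Medium",
    ("Low", "Medium"): "Low",
    ("Low", "Low"): "Low",
}

def priority(likelihood, impact):
    """Look up the composite priority for one risk."""
    if likelihood not in LEVELS or impact not in LEVELS:
        raise ValueError("likelihood and impact must each be Low, Medium, or High")
    return PRIORITY[(likelihood, impact)]

# The dog-bite example: Low likelihood, High impact.
print(priority("Low", "High"))  # prints "Medium"
```

Because the table is symmetric, it doesn't matter which of the two scores you treat as the row and which as the column.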

(You can see that this rating method is similar but not identical to the one we saw for evaluating risk in audits: there we said priority = {number of nonconformities} x {severity}. The overall approach is very flexible.)

Then address all the important ones: this means at the very least all the ones where priority = High, but consider the others too to see if there is something you can do where the balance between effort and outcome is reasonable.

"Addressing" a risk means:

If possible, prevent it.
If you can’t prevent it, take steps now to mitigate the impact when it happens.
Also, consider how you will respond when it does happen: those are your contingency actions.
To make sure this gets done, assign the risk to an Owner, and assign a deadline by when the actions have to be in place. Then be sure to follow up that they really are.
And remember that root cause analysis of a risk can help you find the most effective approach.
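The "assign an Owner and a deadline, then follow up" step is the one that most often slips, so here is a rough Python sketch of what a follow-up check could look like. Everything in it is an assumption for illustration: the `Risk` fields, the `needs_follow_up` rule, and the sample entries are hypothetical, and your own register may track quite different things.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class Risk:
    name: str
    priority: str                  # "High", "Medium", or "Low"
    owner: str = ""                # empty until someone is assigned
    due: Optional[date] = None     # deadline for having the actions in place
    actions_in_place: bool = False

def needs_follow_up(risks, today):
    """Return the risks that need attention at a review.

    Assumed rule: a High-priority risk with no Owner, or any risk whose
    deadline has passed without its actions being in place.
    """
    flagged = []
    for risk in risks:
        unowned_high = risk.priority == "High" and not risk.owner
        overdue = (risk.due is not None and risk.due < today
                   and not risk.actions_in_place)
        if unowned_high or overdue:
            flagged.append(risk)
    return flagged

# Hypothetical register entries for the grocery-store example.
register = [
    Risk("Spill in an aisle", "High", "Fred", date(2023, 7, 1)),
    Risk("Heavy items on high shelves", "High", "Max", date(2023, 9, 1)),
    Risk("Customer's dog bites someone", "Medium"),
    Risk("Freezer failure", "High"),
]

for risk in needs_follow_up(register, today=date(2023, 8, 3)):
    print(risk.name)
```

With these sample entries, the check flags the spill (deadline passed, actions not in place) and the freezer failure (High priority, no Owner). The Medium-priority dog bite is not flagged, though it still sits in the register.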

What about the risks you choose not to address? They stay on the list anyway. And your priority ratings aren’t static. From time to time—at least once a year, if not more often—review your list to see if things have changed.

As you take mitigation steps, for example, the impact of some risks will drop and so their priorities will change.
The priority of others might rise, depending on changes in the outside world. Think how low most companies rated the likelihood of "global pandemic" in 2019.
Check whether your contingency plans are still correct and current. Is that still the best way to handle this risk, if it comes about?
Are the responsibilities all assigned to the right people? Are your supplies all in stock and up to date?
Assign actions as needed, and follow up to ensure the actions are closed on time.

In other words, your risk handling becomes a living system.

So even if a risk falls below your threshold and you don't address it right now, keep it on the list. Then the next time you review the list—next quarter, next year, or whenever—you can think about it again. And as long as it stays on the list, you won’t forget.

__________

* What follows in the rest of this post borrows heavily from a post I wrote more than a year and a half ago about basic risk management. But I have updated my remarks by taking into consideration the last month's sequence of articles. You will find links throughout.