Thursday, October 23, 2025

Hierarchy of hazard controls

When I find that I've interrupted myself—twice in a row!—to make a disclaimer that's no part of the main post, maybe I need to pay attention. Maybe it's time for me to discuss the topic on its own, to get it settled, rather than pushing it off into footnotes.  

My last two posts—last week and three weeks ago—were about how to use written procedures. In both articles, I explained that written procedures should be regularly enhanced with the lessons learned from mistakes or disasters, so that the organization learns from those mistakes and doesn't repeat them. And both times I had to include a little caveat, to the effect that updating procedures is often not the best way to prevent safety problems.

Why did I bother to say this—especially twice? Also, what is the best way to prevent safety problems?

For the first question: I bothered to say it because updating procedures is probably the easiest way to address safety problems. Typically it costs less than any other approach, and it usually takes less time. But it is also one of the least effective ways to address safety problems, because people forget what they read, or decide to ignore it, or never get around to reading it in the first place.

For the second question, ... well, it depends. Classically there are five options, but not all of them are available in every case. So you have to see what you can do in each specific situation.

Hierarchy of controls diagram. Original version: NIOSH; vector version: Michael Pittman.
https://commons.wikimedia.org/w/index.php?curid=90190143

Elimination 

The most effective way to control a hazard is to eliminate it completely, but this isn't always possible. If your workplace has extension cords stretched across walking areas, those constitute a trip hazard. Get rid of the extension cords, perhaps by installing power outlets where you need them or by rearranging your workstations, and you have eliminated the trip hazard. If some work is being done high above the ground, there is a falling hazard. If you can relocate the work to ground level, you have eliminated the falling hazard. Again, this is the most effective approach—the hazard is gone, after all!—but sometimes it is not practical.

Substitution

The next-most-effective approach is to substitute something less dangerous for the original hazard. A common use case for substitution involves hazardous chemicals, because sometimes a less hazardous chemical will do the same job. Some operations have replaced the solvent benzene, a carcinogen, with toluene; others have replaced lead-based solder with lead-free solder. These substitutions generally cannot be done overnight: lead-free solder melts at a different temperature than the lead-based original, so converting a printed circuit board to lead-free solder requires sourcing new components and re-laying out the board. Still, it can be done.

Engineering controls

Engineering controls do not remove the hazard; they isolate it. The easiest examples are a guard rail or shielded enclosure that keeps fingers out of machinery, or a ventilation hood that keeps people from breathing noxious gases. Lockout-tagout mechanisms serve a similar purpose by ensuring that a machine cannot be serviced until it has been powered off and disconnected. In all these cases the hazard still exists, so if someone went out of his way to override the engineering controls there is a theoretical chance he could be injured. But he would have to go out of his way. In normal operation, engineering controls should keep people from getting hurt.

Administrative controls

This is where we talk about updating your procedures! Administrative controls are all the measures that rely on telling people not to do things that can hurt them: they include written procedures, but also training, signs, and warning labels. Other administrative controls could include job rotation or work schedules, to reduce the exposure of each individual worker to a certain hazard; preventive maintenance programs, so that the equipment functions properly; scheduling certain tasks during off-peak hours, when fewer workers are present; or restricting access to hazardous areas. All of these measures are important, and they certainly have a place alongside more effective measures. It may also happen, because of special circumstances at your workplace, that sometimes these are the best you can do. But they all rely on human compliance. And as we have seen, human compliance is not always reliable. That's why administrative controls rank so low on the effectiveness scale.

Personal protective equipment (PPE)

Finally, sometimes you just have to walk in and grab the hazard with both hands. After analyzing it every possible way, you find that you can't eliminate the hazard and can't substitute anything safer for it; and because the work requires direct human action, engineering and administrative controls are beside the point (because both of those are designed to keep you away from the hazard). Fair enough. Do what you have to do. But at least wear gloves. Or a breathing filter. Or a hazmat suit. Or whatever the right PPE is for this particular hazard. PPE is rated as the least effective form of hazard abatement, because the only time you use it is when you are getting up close and personal with the hazard itself. But sometimes that's what you've got to do, and PPE is just what you need.

Once upon a time, years ago, I was talking to the management team for a mine. (They were mining diatomaceous earth, not coal or gold, but I bet the principles are the same.) I asked them if their employees tended to suffer from emphysema or other lung ailments. They said that back before the 1950s, yes, that was a big problem. But in the 1950s someone invented a breathing filter that screened out the tiny particles of diatomaceous earth and other rock products, and after that they'd never had any trouble. I asked about enforcement, and they said:

"Oh, that's easy. We painted a white stripe across the road into the mine. Then we announced that anybody who was found on the other side of the stripe without his breathing filter in place and working would be fired. On the spot. No questions asked. No excuses. No matter who.

"And you know? We haven't had a single problem since then."* 

PPE may be ranked as "least effective," but sometimes it's exactly what you need.



Anyway, that's the hierarchy of hazard controls. That's what's behind the little disclaimers in my last two articles. I hope it helps.

__________

* Technically this means they used PPE, reinforced by administrative controls (the white stripe).

       

Thursday, October 16, 2025

Chesterton's fence

For the last couple of weeks (well, with one brief exception) we've been talking about written procedures: how they help avoid failure, and how to use them to capture the right lessons in case failure comes anyway. Specifically, I argued two weeks ago that when something goes badly wrong with one of your processes, it's good to analyze the failure to find a root cause; then, if the root cause was that someone acted a certain way, update your procedure so that he won't do the same thing next time.*   

But wait—what if you inherit a procedure, instead of writing it yourself? I spent a lot of my career working for small companies acquired by large ones, so that's the case I have in mind. The Home Office says to follow a procedure, but that procedure calls out forms you've never seen, and involves roles you've never heard of. 

Let's make this concrete. The Whizzbang Project is running late, but finally they think they can start testing. The team has met for a review. You have the official Test Readiness questionnaire from headquarters. The first few questions are easy. Then suddenly you read:

Question 17. Has the Whitzinframmer Report been duly refrangulated by the Junior Executive Pooh-Bah?

What are you supposed to do with that? Your office doesn't use that report. In fact you've never seen one. And the nearest person executing that role is across the ocean. Everyone in the meeting is staring at you. Now what?

The temptation is enormous just to skip it. But after all the discussion two weeks and three weeks ago about "procedures written in blood," you know that's not the best answer. On the other hand, you can't answer it as written. What you need to find out is: what risk was this question written to avoid?

The key is that there aren't that many different ways to manage a project, or to fly a plane. Project managers around the world face exactly the same risks, and mostly use the same pool of solutions. Pilots around the world face the same laws of physics to keep their airplanes aloft. I guarantee that if modern project managers and civil engineers could sit down with the people who built the Pyramids, they'd be fast friends before they ran out of beer.**

So when you call somebody at the Home Office to ask about the Whitzinframmer Report,*** you don't need to reproduce every single field. But make sure you understand its purpose. Once you get past the window-dressing, it's sure to be a tool they use in the Home Office to handle some very normal project management risk. Getting that report "duly refrangulated" is how they check that you have enough budget for the next phase of the project ... or maybe it verifies that the test equipment is all working correctly, or something like that. In all events it will be something very normal. Then instead of asking the question literally, as written, ask whether the risk has been addressed. 

This means you say, "Question 17. Do we know if all our test equipment works?"

As a quick aside, I am not a pilot. If you are flying an unfamiliar plane, and if you find that you don't understand some of the instructions in the flight manual, I do not advise you to substitute free interpretations instead. The laws of physics are unforgiving. Also, it is a consistent theme in this blog that your level of effort should be proportional to the risk you face, and flying an unfamiliar plane involves a lot of risk. So it is worth the effort to know what you are doing.

But in more forgiving environments, there is more latitude to apply procedures in ways that make them useful. And the key is always to understand that the procedure itself is a tool for minimizing risk. So if you find that the procedure cannot be implemented as written, make sure you understand the risk that has to be managed. If you can neutralize the risk, that's ultimately the goal you are trying to achieve anyway.   

By the way, the approach that I recommend here is a special case of a principle called Chesterton's fence. Briefly, the idea is that if you find someone has put up a fence in an unlikely place, and you can't for the life of you think why, don't tear it down! They must have had a reason. It might have been a bad reason, or the reason might no longer apply. But until you know what the reason was, you had better leave the fence in place. "Written in blood" is a more dramatic way to say it, but the idea is the same.****



__________

* The current article is mostly about procedures and not safety, but note that procedural controls are not always the best way to address safety problems. I'll talk about this more next week. 

** The ancient Egyptians did brew beer, and each worker on the Pyramids got a daily ration of four to five liters, for both nutrition and refreshment. See Wikipedia, "History of beer" for more information. 

*** You should do this before the meeting!  

**** The full description of this principle comes from the author G. K. Chesterton, and is much more colorful: "In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, 'I don't see the use of this; let us clear it away.' To which the more intelligent type of reformer will do well to answer: 'If you don't see the use of it, I certainly won't let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.'" From G. K. Chesterton, The Thing (London: Sheed & Ward, 1946), p. 29.

      

Thursday, October 9, 2025

Podcast with Quality Magazine!

We've been talking lately about how formal processes can avoid catastrophic mistakes, and I've got more to say on the subject. But this is a timely interruption.

A while ago, I sat down with Michelle Bangert of Quality Magazine, after they published my article about the Seven Quality Management Principles. Originally, we were just going to talk about the article itself, and maybe recap it for people who prefer podcasts to blog posts. But the conversation unwound itself according to its own internal rules, the way any good conversation does. After forty minutes we had discussed at least a dozen topics; in some ways it felt like we had been talking all day, and in other ways it felt like we were just beginning to scratch the surface. Among other things, our conversation touched on topics like the following:

  • How I came to write the article on the Seven Quality Management Principles.
  • When to expect the upcoming changes to ISO 9000 and ISO 9001.
  • How blogging is different from kvetching.
  • How to use blogging as a branding tool.
  • Why I am delighted when people argue with things I've written. 
    • (As a bonus, I describe two different times I've had to retract something I'd written because feedback from readers showed me I was wrong: I mean here and here.)
  • How lessons from parenting also apply to Quality.
  • How my career in Quality started, and why you shouldn't imitate me.
  • Career highlights and stories for audit nerds.
  • Comparing Stuttgart with Santa Barbara as wine countries.
  • The hidden message of German architecture.
  • Why do I anonymize my stories, and when do I not?
  • Where is the other column that I write, and what is it about?
  • What is the difference between rules in young/small organizations, and rules in old/large ones?

Anyway, two days ago the podcast was published. You can find it here. (Here is an alternate link.)

So take a listen, and let me know what you think. 

  • If you think I'm wrong about anything (or everything!), please let me know: like I say above, I'm always thrilled when someone argues with me.
  • And if you like it, contact Michelle Bangert at Quality Magazine to ask her to have me on again! 😀



    

Thursday, October 2, 2025

Procedures written in blood

Last week I wrote about the Challenger disaster, and about how to avoid the "normalization of deviance" that made it possible. One of the critical topics was to stick to the defined procedures, and I quoted the Air Force maxim that "The flight manual is written in blood." In other words, many of the flight regulations were created only after someone did something else one day, ... and then crashed.

Stories like these are a gruesome way to make the point, but wrapped inside this advice is an important principle on how to write and manage formal procedures:

  • If something goes wrong—and especially if somebody gets hurt—analyze the accident to find the root cause.
  • Then, if the root cause is something that could have been avoided if only the agent or operator had acted differently, update the written procedure to require future operators to do the safe thing.

Way back in the first year of this blog, I wrote a post about how to write procedure documents which alluded to this issue but didn't go into details. What I said at the time was just, "If something is a safety guideline, spell it out." What I neglected to say was that often you learn the relevant safety guidelines by studying accidents and figuring out how to avoid them next time.

What is more, this advice isn't limited to safety risks. Any time you see a predictable failure mode that can be avoided by taking preventive action ahead of time, you should consider writing it into your procedures. Do you remember back when I wrote that all of Quality is built on the practice of Lessons Learned analysis? This is what I meant.

Don't go crazy, of course. Sometimes the risk is negligible, and it would take a lot of work to prevent it; in a case like that, maybe it's better to accept the risk and get on with things. But when the risk is substantial or even lethal, updating your procedures is a small price to pay for prevention.

I once worked in an office where we developed a checklist like this very organically. We were a small office that had recently been acquired by a much larger company, and the larger company had insisted we implement stage gate questionnaires to monitor and control our product development process. (I explain project stage gates in this post and this one.) But our administrative and IT landscapes were different from those in the home office, so we used some forms they didn't have, and vice versa. To account for our local forms, I created a local questionnaire with three or four questions on it.

To my surprise, the local questionnaire caught on. One of our projects did something ill-advised that set them months behind and wasted a bunch of money; we called a Lessons Learned meeting to figure out what went wrong. One of the outputs was that the Project Manager had failed to check for this-or-that condition at an early stage of the project. The PM's answer was, "How was I supposed to know we needed that?" And right away another team member said, "It's crazy that we forgot to check for that! Michael, can you put that on your checklist—that the Project Manager has to check for this point at that stage-gate review?"

Sure, I could do that. And over the years, the checklist grew.        

To be clear, updating procedures isn't the only way to prevent accidents. Depending on the risk, sometimes it's not the most effective. If you need to keep people from sticking their fingers into a dangerous machine while it's running, you'll have more success by installing a guard rail or a plastic shield than by writing a procedure that says "Don't stick your fingers in the machine."

But for other operations—flying an airplane, say, or managing a project—we depend on human action. And in those cases, regularly updated procedures are invaluable as a way to learn from the mistakes of the past. As one humorist wrote, "It's a wise man who profits by his own experience, but it's a good deal wiser one who lets the rattlesnake bite the other fellow."


      

Five laws of administration

It's the last week of the year, so let's end on a light note. Here are five general principles that I've picked up from working ...