Thursday, May 14, 2026

"Hardening in the field"

Years ago, I worked for a small tech startup. We were scrappy and energetic, and we hadn't quite decided how finished a new product had to be before we could ship it. 

  • Did it have to be bug-free? There were always more bugs. 
  • How about if it had no serious bugs? That sounds nice, but what counts as "serious"?

In all these discussions, our head of engineering usually wanted to ship now and not later. (Of course he also saw the financial statements, and knew that we needed the revenue!) His argument was that if the product was basically good enough, then what we needed was for it to operate in a real-world environment so that we could identify which remaining defects really mattered. Then when we fixed those, the product would be ready. Other nominal defects might exist, but they would be merely cosmetic. He called this process "hardening in the field."

"Let's see: eggs, cheese, filling. I guess it's ready to serve!"
Or maybe not.
One of our project managers remarked that shipping a product with the hope that it will get better at the customer site is like a restaurant serving up raw ingredients and then hoping that the meal will get fully cooked once it's at the table. But the idea isn't quite as bad as that. In fact, this line of reasoning is exactly why the tech industry introduced the concept of beta testing. Admittedly it is a dirty trick to ship beta-quality product to a paying customer who expects something finished. But companies frequently do need to see their products operate in a real-world environment, and some customers are so eager for new technology that they will accept the risk that the beta product might fail unpredictably. Once my startup matured enough to establish regular beta programs for our new releases, we stopped talking about "hardening in the field."  

"But wait—this is OK?"
So companies developing new products face competing demands. The need for real-world data pushes them to release sooner; customer expectations about those products may push them to wait until the basics are solid. I assume that nobody will release a beta-version automobile whose brakes don't work yet. (Though I might be wrong about that. See for example the discussion in this post, and the linked news articles.) Likewise most restaurants won't serve uncooked food, unless the customer ordered sashimi or carpaccio. But the high tech market is more confusing, because the expectations conflict.

Rapid innovation is a more or less constant feature of the high tech market landscape. Everybody knows that brand-new implementations of new technology are usually full of bugs; stable, reliable implementations take longer. So what do you do? Partly it depends on the inherent risks of the exact product you are designing. Is it a car or a rocket that can hurt people if it fails? Or is it a toy, where failure will just disappoint them? Is it easy to recover? And what does the regulatory environment look like? Obviously you have to take account of all these factors.

Beyond those factors, though, you may just have to decide where you want your organization to fit in the ecosystem of high-tech products: do you want to be first to market with innovative technology, or are you willing to trade speed of innovation for product reliability?

And then, if possible, you would like to design your Quality Management System so that it supports your decision—so that it nudges you into being the kind of company you want to be.

If it is important for you to be first to market, you should measure your development process with KPIs that track (among other things) how fast new releases reach the field. Since your initial releases are likely to be buggy, your customer support process should monitor KPIs that track the speed with which customer issues are resolved. You may wish to implement an Agile development model, or offer customers the opportunity to work with you as partners in exchange for providing their feedback as active members of the development process.

Conversely, if it is more important to you that your products be fully reliable before they reach a customer, then you should not measure speed of delivery as one of your development KPIs. What you measure is what you optimize; if you are willing to sacrifice speed for reliability, don’t measure speed. In this case, you are more likely to set metrics around the extent and comprehensiveness of testing, and the number of known bugs at time of release. You might also choose to use a waterfall development model (instead of an Agile one) so that testing is done on one version at a time, thus reducing the number of variables in the development process and presumably some quantum of risk. 
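To make the contrast concrete, here is a minimal sketch in Python of what the two KPI sets might look like. All the field names and numbers are invented for illustration; your own definitions of "release," "issue," and "coverage" will certainly differ.

    from datetime import date
    from statistics import mean

    # Toy data standing in for your release history and support queue.
    releases = [date(2026, 1, 10), date(2026, 2, 2), date(2026, 3, 1)]
    issues = [
        {"opened": date(2026, 2, 3), "closed": date(2026, 2, 7)},
        {"opened": date(2026, 3, 2), "closed": date(2026, 3, 12)},
    ]

    # "First to market" KPIs: how fast releases ship, how fast issues close.
    days_between_releases = mean((b - a).days for a, b in zip(releases, releases[1:]))
    mean_days_to_resolve = mean((i["closed"] - i["opened"]).days for i in issues)

    # "Reliability first" KPIs: test comprehensiveness, known bugs at release.
    requirements_total, requirements_tested = 120, 114   # invented numbers
    coverage_pct = 100 * requirements_tested / requirements_total
    known_bugs_at_release = 3                            # invented number

    print(f"{days_between_releases:.0f} days between releases")
    print(f"{mean_days_to_resolve:.0f} days to resolve a customer issue")
    print(f"{coverage_pct:.1f}% of requirements covered by tests")
    print(f"{known_bugs_at_release} known bugs at release")

The point is not the arithmetic, which is trivial; it is that each strategy makes the top half or the bottom half of this sketch the numbers your organization watches every week.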

It's interesting to realize that "Quality" doesn't always mean the same thing—or rather, that it can mean two different things (in this case both speed of innovation and reliability of performance) which are incompatible, and which you have to choose between. And that single choice can have ripple effects across your metrics, your processes, and your strategy.


           

Thursday, May 7, 2026

The fight to define Quality

Five years ago, in one of my first posts for this blog, I argued that all the definitions of Quality that have been proposed in the literature mean effectively the same thing. Whether you talk about "conformance to requirements," or "fitness for use," or "excellence in goods and services," I argued that fundamentally you are talking about getting what you want out of those goods or services. So I proposed the umbrella-definition, "Quality means getting what you want."1 And at a workaday level, I still think that's fair. 

But it turns out there were reasons behind the fight over a definition, reasons besides academic posturing. Robert Cole explains in his Managing Quality Fads (a book that I discussed two weeks ago) that American businesses saw a profound shift during the late 1970's and early 1980's in the understanding of Quality, and the different definitions served as banners or rallying cries for the contending models. So while they may all mean (more or less) the same thing today, they meant very different things in the past. Thus it can be useful to understand what these definitions meant at the time, and what the parties were fighting over.

Old model

Because we know how things turned out in the end, some commentators have blamed American managers for failing to adopt Japanese Quality methods faster than they did.2 But Cole explains clearly that such a view is profoundly unhistorical. It is simply not true to allege that American managers were stupid (most of them clearly were not) or that they were ignorant about Quality. What is true is that they understood a great deal about Quality according to a framework that Cole calls "the old model"—but this framework proved unsuccessful in head-to-head competition with the "new model" practiced by many Japanese companies. So before they could learn the new model, they had to carry out two preliminary tasks: first they had to understand that there is more than one way to think about Quality; and then they had to unlearn much of what they already knew.3 

What was this "old model"? It was a way of thinking about manufacturing processes that had been responsible for a string of remarkable successes, starting with the industrialization of the American economy, and culminating—within the living memory of many managers still working in the 1970's and 1980's—with unconditional victory in World War Two as the "arsenal of democracy." And the basic principles behind the model are plausible:

[Inset: Cole's table of "quality-hostile assumptions."4]

  • Economic success comes from the division of labor, where everyone can focus on doing his own work as well as possible. This means that engineers should design products, managers should make decisions, and workers should do what they are told. Anything else means inefficiency and poor performance.
  • For most jobs, there's pretty much just one way to do them. Once you've figured out what that procedure is, and have implemented it, there's nothing more to do. Once a procedure is defined and in place, the only way to make it less expensive is to pay your workers less. (If your workers are unionized, that means moving overseas to a cheaper country.)5
  • These two facts—the division of labor, and the one-and-only way to do most jobs—mean that once an engineer has finished creating a design, most of the important decisions have already been made. At that point, the only degree of variability remaining is in how strictly the design is implemented, and in how carefully the procedure is followed. Stick to the design, and you'll have a good product; deviate, and you won't.
  • Products have to be "good enough," which means there is a "quality floor" beneath which they must not be allowed to sink. But no customer is going to pay for a gold-plated shovel, so it is foolish to throw money after a chimerical quest for perfection.6
  • We know what customers want. We've seen time and again that they want products they don't already have. So if we make the products—and history shows that we are fantastically productive at making things!—customers will buy them.  

There's more, of course. And I have included a summary of what Cole calls "quality-hostile assumptions" in the inset picture above. But it is important to understand that these assumptions were not stupid! In the event, they were indeed overturned by the new model of Quality. But each of the principles I listed has a superficial plausibility to it. And the overall model had been so successful in recent history that it looked like madness to question it.7 

New model

The new model of Quality reversed many of the assumptions of the old model. In some ways this is unsurprising, because it grew out of a historical situation that was nearly the opposite of the confident, successful American experience. Japan, after all, lost the Second World War, and her industrial plant had been largely reduced to rubble by American bombers. Japanese management responded to this catastrophe by mobilizing all Japanese—and in particular all employees—to support a collective effort to rebuild and regrow. In the face of a national calamity, everyone was asked to contribute, in any way they could.8 

[Inset: Cole's table comparing the old and new quality models.13]
But if everyone is asked to contribute, there is no room for a narrow-minded adherence to the division of labor. Naturally the engineer knows a very great deal about how to design a product; but he may not know everything about the particular set of machines and tools that we have to do the job here and now. Teams on the factory floor therefore help improve both the design implementation and the manufacturing processes.9

One early consequence is that industrial processes are never permanently fixed: they are always subject to improvement.10 Another is that individual employees are not locked into a single role forever: all employees are assumed to be learners and potential problem-solvers.11 

And then feedback effects begin to appear. Improved methods mean less rework. But less rework means less cost, so high-quality manufacturing processes can be cheaper than low-quality ones—in direct contradiction to the expectations of the old model.12

The old model argued that continual improvement would run aground on the law of diminishing returns: it would take ever more cost and effort to make ever smaller improvements in anything. But the new model understood that there are always many targets for improvement. You can make this aspect of the job a little better, and then turn around to improve a different one. Briefly, you can make improvement continual by regularly picking new things to improve.14 

What's more, it turns out that customers are willing to pay more for products that work better and last longer! So an emphasis on superior product quality entails greater market share as well as lower costs. In other words, increased product quality drives improved corporate profitability. And an overall focus on customer satisfaction provides an effective framework to organize all these other efforts.15   

What about the definitions?

With this background, it is easier to see the competing definitions of Quality as slogans or ideological commitments to one model or the other.

The most obvious of these is Philip Crosby's definition: Quality is conformance to requirements.16 This definition is foundational to the old model. Remember, in that model the engineer designs the product and makes all the important decisions; once the design is complete, the best outcome is produced by conforming exactly to the design and the process ... in other words, to the requirements. Nothing more is needed in order to achieve Quality, and nothing more is wanted.

ISO 9000:2015 agrees with Crosby, in definition 3.6.2: "degree to which a set of inherent characteristics (3.10.1) of an object (3.6.1) fulfils requirements (3.6.4)."

The other pole is represented by Joseph Juran's definition: Quality is fitness for use. "Use" is what the customer does with the product or service, and "fitness" just means that it works. So Juran's definition means, in effect, that Quality is whatever the customer says it is. This is an extreme statement of the new model that Quality depends on customer satisfaction.

W. Edwards Deming and the American Society for Quality try to bridge the gap between the models. Deming calls out both sides by saying, "Good quality means a predictable degree of uniformity and dependability [old model] with a quality standard suited to the customer [new model]." The ASQ, for their part, say that "Quality denotes an excellence in goods and services, especially to the degree they conform to requirements [old model] and satisfy customers [new model]."

In any event, the battles are over. For the time being, the new model of Quality has won the day. So we can afford the magnanimous conviction that everyone who talks about Quality probably means more or less the same thing. But I think it can be valuable to understand how we got where we are.


Gosh, in all this discussion I never explained my own position. In case it's not obvious from everything else I have written here for the last five years, I throw my support firmly behind the new model. There have been too many cases where a company or an organization worked out a detailed plan, executed it flawlessly, and the results were just no good for anyone. (*cough* Edsel! *cough*) So I think that Quality ultimately has to mean goodness—and that for something to be good, it must (among other things) be good for someone in particular.17 Therefore I take the side of the customer in this debate between models.

At a theoretical level, I'm fond of the simplicity of the "innocuous truism" formulated by Robert Pirsig: "Quality is what you like."18 You can probably see the influence of Pirsig in my own definition at the top of this essay.



__________

1 More exactly, I went on to explain, "Whether you are talking with your manager or your auditor or a technician on the line, if there's a question about the relevance of this or that element of the system just ask, 'Are we getting what we want? Do we need this element in order to make sure we continue to get what we want?' If yes, the element belongs in your QMS .... If no, not." 

2 See, for example, this famous satirical pseudo-advertisement. 

3 In fact, Cole argues statistically that workers with deep backgrounds in traditional quality control functions were mostly passed over when companies began to create Vice President of Quality positions in the 1980's and 1990's. It was easier for their companies to "learn" new methods by promoting new men than it was for the employees themselves. "[The] overwhelming majority of the 174 vice presidents for quality did not have a career in quality prior to assuming their vice presidential role.... [Of] the Fortune 500 quality leaders surveyed, only 9% had quality responsibilities in their third previous job." Robert E. Cole, Managing Quality Fads: How American Business Learned to Play the Quality Game (New York, Oxford: Oxford University Press, 1999), p. 44. 

4 Ibid., p. 76.

5 Ibid., p. 50.

6 Ibid., p. 47.

7 Ibid., p. 48.

8 "Various authors have noted that Japanese companies tend to magnify even small challenges as a strategy to mobilize all employees on behalf of aggressive new corporatewide goals .... [One] can suggest that large growth-oriented Japanese manufacturing firms throughout most of the post-World War II period have been characterized by a weakness orientation, which is designed to reveal problems in current performance to a broad range of employees. By contrast, comparable American firms throughout much of the postwar period were characterized more by a strength orientation." Ibid., p. 55. 

9 Ibid., p. 30. 

10 Ibid., p. 31.

11 Ibid., p. 30.

12 Ibid.

13 Ibid., p. 26.

14 Ibid., pp. 77-78.

15 Ibid., pp. 29-30.

16 References for this and the following definitions are given in this blog post. 

17 Cf., for example, William Blake, Jerusalem, plate 55, lines 60-63: "He who would do good to another, must do it in Minute Particulars, | General Good is the plea of the scoundrel, hypocrite, & flatterer: | For Art & Science cannot exist but in minutely organised Particulars, | And not in generalising Demonstrations of the Rational Power."

18 Robert Pirsig, Zen and the Art of Motorcycle Maintenance: An Inquiry into Values (New York: William Morrow & Co., 1974, 1999), p. 232.       

      

Thursday, April 30, 2026

Managing critical issues

We've talked before about problem-solving (see for example this post here), but what do you do when everything comes unglued at once—when a problem hits the fan, and suddenly you are fielding calls from reporters and attorneys when you haven't even gotten all the facts yet? Yes, you need a robust problem-solving protocol, but at the same time you need so much more than that! You need to manage the news cycle, because a careless or unguarded comment can turn public opinion implacably against you or ruin your chances in court—even though you haven't finished your investigation yet, so you really don't know what happened!  


A few weeks ago, I attended a very thorough presentation on exactly this topic. The speakers were Shubhada Sahasrabudhe and Shalabh Tandon of QuRIuS Consulting LLC, the talk was sponsored by the ASQ Phoenix Section (704), and you can find a link to the YouTube video at the bottom of this post. The overall message is based on their recent book, Don't Panic, Pivot: Managing Critical Issues to Prevent Crises, and it is well worth your time and attention.

The authors start by explaining that a critical issue is not the same as a crisis, though it may cause a crisis if it is not addressed in a timely way. Major disasters can be caused by a succession of seemingly trivial errors that were ignored, rather than set right; and the authors look at specific examples in some detail, including the Deepwater Horizon explosion, and the collapse of the Francis Scott Key Bridge. In each case, there were small signs in advance that something wasn't quite right. Based on this insight, they define a critical issue as follows: 

  • An issue is any unresolved event that disrupts or hampers normal operation, and that fails to meet the published or agreed-upon specifications.
  • An issue is critical if it has time-sensitive impact (financial, safety, functional, etc.).
  • And if not addressed urgently, a critical issue can become a crisis.

Why does it matter? Well of course nobody wants to cause a catastrophic oil spill, or to destroy a major bridge. But the authors are careful to point out that the consequences of such a crisis affect multiple dimensions. Even if you momentarily set aside humanitarian concerns to take a cold-blooded look at the organization's own interests, a disaster hurts the organization in at least three areas: it damages operational efficiency, it can be ruinously expensive, and it soils the brand. (Regarding this last point, I have friends who check their travel plans to make sure they are not flying on Boeing planes, ever since the door blew out of Alaska Airlines 1282.*) So it is clearly in the organization's interest—even its narrowest self-interest—to watch for critical issues and correct them.

How do you do it? The authors sketch a method that looks, in broad outline, a lot like the standard 8D problem-solving protocol. But there are a couple of critical additions.** 

  1. Right at the beginning, in D2 (State the problem), evaluate whether this issue is critical. Make sure you understand the severity of the consequences, in case the issue blows up.
  2. Then, during the impact assessment (typically also part of D2), make sure you address all relevant impacts: to operations, to finances, to stakeholders, and to your brand.
  3. During D3 (Contain the problem), it is not enough to prevent the problem from spreading. You also have to contain the brand damage by making whatever public statements are necessary and appropriate, consistent with what you know at this point. Never speculate in public! Stick to the known facts, and promise you will come back with the full story when you have it. 
  4. As you proceed through D4 (Find the root cause) and D5 (Define corrective action), validate your corrections before implementing them. Make sure that they address all affected stakeholders, and that they do not inadvertently cause further harm!

Perhaps the biggest change is that the authors break out Communication as a separate step between D6 (Implement corrective actions) and D7 (Assess risks and learn lessons). This addition confirms the point that harm to the brand can be as damaging as harm to customers or other stakeholders. And the authors give clear advice for all communication: make it timely, make it clear and unambiguous, and make it consistent. All of these points are important, but it is easy to overlook the last one. Or rather, it's easy, but don't overlook it! If you say one thing today and then backtrack tomorrow, people will think you are hiding something and assume the worst.

Finally, the authors break D7 (Assess risks and learn lessons) into two parts, because there is so much important work that has to be done there. On the one hand, they insist on the critical importance of a detailed After Action Review. But then you also have to implement Preventive Measures so that the problem can never, ever happen again. Both steps are important, so they list them separately to make sure neither gets shortchanged.
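Pulling these modifications together, here is a minimal sketch in Python of the extended sequence as an ordered checklist. The step wording is my own paraphrase of the talk, not the authors' official template:

    # The authors' extended 8D sequence, as summarized above.
    # (My paraphrase and notation -- not an official template.)
    EXTENDED_STEPS = [
        ("D2",  "State the problem; evaluate criticality; assess impact on "
                "operations, finances, stakeholders, and brand"),
        ("D3",  "Contain the problem -- and contain the brand damage: "
                "known facts only, no public speculation"),
        ("D4",  "Find the root cause"),
        ("D5",  "Define corrective action; validate before implementing"),
        ("D6",  "Implement corrective actions"),
        ("C",   "Communicate: timely, clear and unambiguous, consistent"),
        ("D7a", "After Action Review"),
        ("D7b", "Preventive Measures"),
    ]

    for code, action in EXTENDED_STEPS:
        print(f"{code:>4}  {action}")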

After all this, there's still a lot in their presentation that I haven't even touched. A single blog post can do only so much. But check out the YouTube video below; if that inspires you, check out their book. There's a lot of detail here, and a lot of process to follow. But if you ever have to deal with a "critical issue"—even once!—you will be grateful to have a defined process that helps you keep your head amid the tumult. And of course implementing a system will be cheaper than trying to wing it, possibly by orders of magnitude. It's good to be prepared, and these authors can help prepare you.  


__________

* See also this post and the ones following.

** When I worked for Bosch, we had a similar procedure for similar circumstances. But this one is published, and to my knowledge the Bosch procedure has not been published yet. So I am happy to discuss this procedure.  

           

Thursday, April 23, 2026

Managing quality fads

Recently I've been reading a book about the history of our field, Robert E. Cole's study, Managing Quality Fads: How American Business Learned to Play the Quality Game. The book came out in 1999, so whenever Cole talks about "current practice in industry today" I had to remind myself to reset my mental frame by a quarter century. But mostly the book is a history. Cole examines the period when American manufacturers learned to rethink their approach to the Quality discipline, in the face of strong and successful competition from Japan. He starts his story in the early-to-mid-1970's, when American manufacturers first began to face serious competition from Japanese firms offering superior product quality. Then he brings it up to "the present day" (1999), by which time the awareness of Quality had become normalized in American industry and American manufacturers—some, at least—could once again compete with their Japanese counterparts on a level playing field. It's a big story, with a lot of moving parts. And ultimately Cole organizes his material to address one large, overarching question: How do organizations learn?     

The book is not one to read quickly. Cole writes for an academic audience more than for a business one, and there is a lot of detail. Often it seems that Cole is telling us everything he knows about a topic, even when he could make his point more clearly by saying less. And there are many places where he describes a Quality initiative in a certain company, only to start the next paragraph by saying (in effect), "Of course there were other people in the same company doing the exact opposite." It takes him a long time to paint the full picture.

In one sense, though, Cole's presentation is relentlessly true to the facts where a clearer storyline would distort them. When American firms were first successfully challenged by Japanese companies that reliably produced higher quality products, they didn't know what hit them! Cole makes it clear that the American manufacturers didn't understand what made the Japanese products better—and not only did they not know how to improve their own operations, but they didn't even know what "improvement" might mean. Cole's slow, methodical "on the one hand ... but on the other hand" exposition can be frustrating if you want answers: What finally worked? How do organizations learn? But the players at the time felt that same frustration too. There were a lot of false starts, a lot of "quick-fixes" that went nowhere, and a lot of confusion all around. Cole's very deliberate approach to his story helps the reader to feel that, and to appreciate it in retrospect.

The other advantage to Cole's presentation is that it is rich in detail. There are a lot of little nuggets that are worth exploring for their own sakes. Over the next few weeks I plan to pull out some of these nuggets and write about them. Oh, I'll write about other things too. But I expect to come back to this book several times in the blog, before I finally put it back on the shelf. There is a lot here.


For today's post, I'll limit myself to summarizing Cole's final conclusion, in an abbreviated form. (I can write another post later with more detail, if there's interest.) Cole introduces the last section of his last chapter in a way that feels distinctly unpromising:

[The] question is often posed: how do managers learn to identify best practices and diffuse them across an organization? Survey data suggest the answer is, not very well.*   

But then he describes the approach which, in the long run, seemed to be the most productive. He focuses his results in two ways. 

  • First, his attention is on large organizations with many divisions: partly, no doubt, because those are the companies he got to know through his consulting practice; but also, I think, because those present the most difficult case for organizational learning. The size of such companies means that any organizational transformation has to be adopted by many people; since the number of affected individuals is large, progress is necessarily slow and the institutional inertia is enormous.
  • Second, he is writing specifically about how large corporations learn "under conditions of uncertainty and incomplete information"**—exactly the conditions that characterized the Quality challenge from Japan. 

In that context, Cole identifies the most robust approach to be a collaboration between a central function at headquarters and implementation in the "periphery" (i.e., the operating divisions).

  • The central function sets high-level goals, targets and incentives, but does not specify exactly how they should be implemented. This is because the prevailing "uncertainty and incomplete information" mean that headquarters doesn't really know how they should be implemented!
  • The operating divisions figure out methods that work for them. Since they are the ones who actually do the work, they will have a much better understanding than headquarters does of what methods are practical and what are not.
  • But then the central function monitors the innovations and the results, "to identify, synthesize, and diffuse best practices; otherwise the mutation process [i.e., local innovations in the operating divisions] will lead to dilution and degeneration."***

During the twenty or thirty years when American companies were relearning how to think about Quality, there were many different approaches. But this one seems to have been the most robust, involving shared responsibility across the organization: with central guidance and local adaptation. 

In fact, Cole argues that the same center-and-periphery topology describes how Quality knowledge spread across the country as a whole, from early-adopting industries even to those that were more protected from the challenge. But that, as they say, is another story. 

__________

* Emphasis mine. The quote is from Robert E. Cole, Managing Quality Fads: How American Business Learned to Play the Quality Game (New York, Oxford: Oxford University Press, 1999), p. 243. 

** Ibid., p. 246. 

*** Ibid., pp. 245-246.      

           

Thursday, April 16, 2026

Goals that your people understand

You're at work, it's the middle of the morning, and suddenly everyone is called into the largest room you have, for a presentation. The CEO and senior management have just finished their new strategic plan, and they are going to share it with the rest of the company.

Do you get anything out of the next hour? Or do you just spend the time trying to look awake?

I've been in too many presentations where the latter was true. The CEO starts off by saying, "We have two main goals over the next three years: to become the number two supplier of refrangulated widgets, and to reach a market capitalization of twelve gazillion dollars. Here is how we are going to do it ...." Then the rest of the speech might as well be in Babylonian, for all that I understand it. I'm sure the CEO is following advice he read somewhere, that he should make all employees "partners" in the company's "strategic thinking." But because the message is in terms I don't know, the only thing I get from the meeting is that I am now an hour behind on the day's work.

It doesn't have to be like this.

What does a better way look like?

There is another way to roll out corporate goals, one that makes them meaningful to every employee. It takes a little more work up front, but this is the kind of work that the management team is paid for in the first place—so it's fair to ask them to do it. Also, if they run into problems in the preliminary setup, that's a key indicator that there are problems with the strategy itself. So it is worth the effort.

The method is called Hoshin Kanri (Japanese: 方針管理, "policy management"), and it draws a straight line between the company's long-term goals and the work I have to do tomorrow. This helps me understand the company's goals, because I can see the effect they have on my job in particular. But it also helps me see how my job fits into the big picture.

Ironically, I saw the method used long before I learned it had a name. I just thought of it as "The way That Company does goals," and I wondered "Why doesn't everyone do this?" But of course I couldn't ask my next employer, "Why don't we do goals just like that other company I used to work for, that you've never heard of?" It was a relief to learn the name.

How do you do it?

The whole process unfolds in several steps. Some people use a special matrix to organize their work, but I won't do that here. The logic is the important part, and you can organize it however you choose.

Define your strategic goals

First, you have to define your long-term strategic goals. Where do you want your company to be in five years? Be careful not to define too many goals, but focus on the handful that matter.

You may also identify your most important operating imperatives at this point. But again, be careful not to cloud the picture with too much noise. (For the distinction between strategic goals and operating imperatives, see the discussion in this post, under "What is a strategy, anyway?".)

How are you going to get there?

"A goal without a plan
is just a wish."

Next, plan out very concretely what you have to do to reach those goals. A familiar aphorism attributed to Antoine de Saint-Exupéry says that "A goal without a plan is just a wish." So define the actions you will have to take, and the milestones that will prove you are on track.

In the first instance, this means defining annual goals as progress towards your long-term targets. But it also means spelling out what your goals will look like, concretely, when you have achieved them, and then identifying what it will take to cross the gap from Here to There. If you want to be—let's say—the number two supplier of refrangulated widgets, what does that tell you about your warehousing and logistics systems? What level of performance do they have to reach, in order to support the overall corporate goal? But also, what is their performance today, and how far does it have to improve? Can you spell out achievable interim milestones towards which your logistics and warehousing personnel can aspire, that will get them where you need them at the right time?

And of course reaching this goal isn't even primarily about warehousing and logistics, though doubtless those play an important role. Every single department in the company should contribute to these goals somehow. So the CEO has to delegate to the respective department heads the task of working out maps for each of their functions which will support the common strategy.

As an aside, you should check that each department map is consistent with all of the others. If Engineering plans to develop the Next Generation Widget in Year Two using a special technical tool that IT doesn't plan to install until Year Four, somebody has to change his map!

Cascade downwards 

It doesn't stop there, but the next steps are pretty straightforward. You as a department head (or functional VP, or whatever your title is) now have a strategic map for what your department has to achieve in five years, and also in this year. Take it to your section managers or group leads, and go through the exact same exercise. Ask each of them to spell out how their group will contribute to meeting your goals. Notice that they don't have to reach all the way back to the company's goals, because your goals have already been aligned with the higher level. So as long as they support achieving your goals, they are also supporting the company as a whole.


Cascade this exercise down through the company organization, all the way to the shop floor. (Yes, this is exactly the same procedure I recommended three years ago for business continuity planning.)

The result is that I, as an employee, have personal goals to achieve that are directly related to my job. But if I achieve them, that supports my supervisor in achieving his goals, which in turn supports the department manager in achieving her goals, which ultimately rolls all the way back up to supporting the company in achieving its strategic targets for the year.
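As a toy illustration, here is a minimal sketch in Python of the cascade as a tree of goals. All the owners and goal statements are invented (the refrangulated widgets are borrowed from the CEO's speech above); the point is just that every leaf goal traces upward, level by level, to a strategic target.

    from dataclasses import dataclass, field

    @dataclass
    class Goal:
        owner: str
        statement: str
        children: list = field(default_factory=list)

        def add(self, owner, statement):
            """Attach a supporting goal one level down the cascade."""
            child = Goal(owner, statement)
            self.children.append(child)
            return child

    # Invented example of one branch of the cascade.
    corporate = Goal("CEO", "Become the number two supplier of refrangulated widgets")
    dept = corporate.add("VP Logistics", "Cut order-to-ship time to three days")
    group = dept.add("Warehouse lead", "Roll out barcode picking by Q3")
    group.add("Picker (me)", "Complete barcode training; reach 99% pick accuracy")

    def trace(goal, path=()):
        """Print every leaf goal followed by the chain it rolls up into."""
        path = path + (goal,)
        if not goal.children:
            print(" -> ".join(g.owner for g in path))
        for child in goal.children:
            trace(child, path)

    trace(corporate)
    # CEO -> VP Logistics -> Warehouse lead -> Picker (me)

Notice that each goal only has to support its immediate parent; the alignment with the corporate strategy comes for free from the structure of the tree.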

End of the year

Then at the end of the year, you evaluate how you did. This means everyone, at all levels. But the point isn't just to assign a grade, like in so many performance review systems. The idea is rather to carry out a root cause analysis on each missed target, to learn why it was missed, and then to update your plans with this new information. This way you—individually and as an organization—keep in touch with reality, learn lessons from experience, and adapt your strategy pragmatically.

I will admit that it is hard to remember to do this last step. Even when I worked at a company that did all the rest of it, that step was sometimes missed. But of course it is important.

           

Thursday, April 9, 2026

Make your matrix certificate work

Last week I introduced the concept of a matrix certificate, and I thought it could be useful if I explained the approach in a little more detail. Mostly the idea is self-explanatory, but it is worthwhile to keep one or two points in mind so they don't trip you up.

When ISO first introduced its management system standards, each certificate was tied to a specific location where specific work was done. So the certificate was printed with an address and a scope statement, as in the example to the left.* 

But many organizations have more than one location, and often the processes are identical from one site to the next. There is not a lot of variety, for example, between this McDonald's and that one. So organizations began asking their Certifying Bodies** if there were any way to group multiple locations under a single certificate. Since the CB's, for their part, had to manage a lot of redundant information for all of these sites, they were keen to agree. And so CB's began to introduce the "matrix certificate."

A matrix certificate works just like a standalone certificate, except that the address and scope are replaced by a matrix: Site 1 does this kind of work, Site 2 does that kind of work, and so on. There is no requirement for a literal matrix on the certificate itself, so long as the information is there. A matrix is a convenient format when there is a large number of sites to consider. When there are only two or three, the CB might simply use paragraphs, as in the example to the right immediately below.***  

The other place you see a genuine matrix is in the internal paperwork that supports the certification: at the CB, and at the organization itself. These internal matrices look something like the one that I show at the bottom of this post, and they correlate the information needed to manage the audit program. This starts by listing the sites and their scopes or functions, just like on the certificate.

But then the internal matrix also tracks how many people work at each site. This is because there is a formula used by all the CB's to determine how many audit-days**** they have to schedule at each location. The formula uses both number of employees and type of work as inputs to the calculation: a big plant needs more audit-hours than a small plant, and manufacturing requires more attention than HR or sales.

Finally, the matrix should track how long the audit cycle is for each site. Normally with a multi-site certificate, the company headquarters is audited every year; also, manufacturing is likely to be audited every year. The example I give below has one site for each of those functions. But then it also shows three sites that mostly do engineering or product design, that follow exactly the same procedures, and that are therefore more or less interchangeable. In order to reduce the overall audit costs (and to simplify scheduling), the CB has put each of these on a three-year cycle, so that they rotate. In any given year the CB plans audits in three locations; but over the three years before recertification, they visit all five sites.
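To make that concrete, here is a minimal sketch in Python of such an internal matrix, using the five sites just described. All the numbers are invented, and the audit_days() function is a toy stand-in: the real duration rules come from the CB's accreditation requirements, which I am not reproducing here.

    from dataclasses import dataclass

    @dataclass
    class Site:
        name: str
        function: str        # e.g. "HQ", "Manufacturing", "Engineering"
        headcount: int
        cycle_years: int     # 1 = audited every year; 3 = rotates

    # Invented example matrix matching the five-site scenario above.
    sites = [
        Site("Site 1", "HQ",            150, 1),
        Site("Site 2", "Manufacturing", 400, 1),
        Site("Site 3", "Engineering",    60, 3),
        Site("Site 4", "Engineering",    45, 3),
        Site("Site 5", "Engineering",    55, 3),
    ]

    def audit_days(site):
        """Toy stand-in for the CB's duration calculation: bigger sites
        need more audit-days, and manufacturing needs more attention
        than office functions. (Invented formula, for illustration.)"""
        base = 1 + site.headcount // 100
        return base * 2 if site.function == "Manufacturing" else base

    # HQ and Manufacturing are audited every year; the three engineering
    # sites rotate, one per year, so all five are seen over the cycle.
    rotating = [s for s in sites if s.cycle_years == 3]
    for year in (1, 2, 3):
        plan = [s for s in sites if s.cycle_years == 1]
        plan.append(rotating[(year - 1) % len(rotating)])
        total = sum(audit_days(s) for s in plan)
        print(f"Year {year}: {[s.name for s in plan]} -> {total} audit-days")

A matrix like this is also exactly where the risks discussed below show up first: a layoff or a transferred function changes a row, and the audit-day total changes with it.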

On the whole, using a matrix certificate is a good way to control your certification expenses if you are in a situation where it works for you. But of course there are risks. The main risk is that one of the critical entries in your matrix will change: for example, you might have heavy layoffs at one site, or (conversely) hire a whole new department. Either way, you have to make sure to report the change to your CB in a timely way, so that they can recalculate your audit days. One way or another, this might change your plans for the year.

Or you might move functions from one site to another. Since each function makes its own contribution to the calculation of audit days, this move might make a difference to the result even if the number of employees involved is not high. A specialized technical function, for example, might employ only a few people; but it might require a lot of Quality equipment to make sure it operates correctly, and in some cases all that infrastructure might have to be audited.

Finally, there is a risk that two different sites are supposed to be doing the same work according to the same procedures, but one site decides to implement local improvements without telling the other. I have seen this happen, where an organization had two Distribution Centers. One was more or less a glorified shipping dock at the far end of a building that did a lot of other work; the other was a pure Distribution Center with no other functions. Back when the organization first codified their written procedures, someone wrote a single set of logistics documents and told both sites to use them. But in the intervening years, the Site Manager of the standalone Distribution Center brought in a lot of equipment to modernize his operations. It was exciting to visit, because every time I arrived they had improved something else. But somehow the word never got back to the first operation. I had workers from that one tell me with a straight face, "We have exactly the same procedures they have there," when there was no longer anything in common. And I had to explain, "No you don't. And at the very least your documentation has to be updated to reflect that reality."

Note that when changes like this take place, they can undermine the logic behind a matrix certification. The justification for auditing Site 3 in place of Site 4 and Site 5 is that all three sites follow the same procedures, so you'll see the same evidence anyway. Once that is no longer true, it is harder to keep down the number of audits.  

It is also true that breakthrough improvements on such a scale are mostly the exception rather than the rule. And if there is a chance for your organization to make breakthrough improvements, having to replan your audit program is a small price to pay!



__________

* I found this illustration online by searching for "sample iso 9001 certificate." I have no professional, personal, or financial connection to the organization in question. 

** A Certifying Body (CB) is the registrar that sends out your external auditor, manages the audit paperwork, and issues your certificate. There are lots of CB's in the world. If you look at the certificates that I posted above as examples, the first is from TÜV SÜD, and the second is from DEKRA. These are both well-known CB's. Just by coincidence, I have never worked with either one.  

*** Again, I found this illustration by searching online and have no connection with the company. 

**** One audit-day means one full day of auditing by one auditor. So four audit-days might mean one auditor for four days, or two auditors for two days, or one auditor for three days plus another who joins him for only one of those days, or any other combination that adds up to four.     

           

Thursday, April 2, 2026

How do you certify a big corporation?

Last summer, in the middle of a podcast about something else, Kyle Chambers raised a question: how is it possible to certify a really big company to a standard like ISO 9001? His point was that really big companies have so many parts that they can't all play together. (I mention the podcast in this post here.) Kyle might have meant the question rhetorically, but of course he is right. If you have scores of locations spread across multiple continents—engaged in dozens of lines of work—how can you possibly coordinate them all into a single management system? How can you possibly certify anything so complex?

You can't. So you break it into pieces.

More exactly, you divide your really big company into sub-units of a useful size, and then make each sub-unit responsible for its own certification. Then in your sales and marketing literature you speak of the company as a unified whole, announce that "Conglomerated Enterprises has been proudly certified to ISO 9001 since ...," and follow with the date of the first certificate that came through.

How big is "a useful size"? It really depends. I've seen it done several ways.

In general, there are two competing pressures at play in determining the right size for an entity to serve as the scope of certification.

  • On the one hand, you don't want the scope too narrow, or you'll have to pay for too many audits. In other words, if three offices that are all in the same city all do exactly the same kind of work, you can save some money by letting them share a certificate. You pay for less overhead at the Certifying Body, and you pay for fewer audits (because all three offices are doing the exact same thing). This all looks good on a balance sheet.
  • On the other hand, you don't want the scope too broad, because there's always the risk that someone goes crazy one day. If you have a hundred locations all sharing the same certificate, and someone in a tiny office way out in Far Foodle starts violating an important policy, the next auditor might rate it as a Major Nonconformity. Then that Major might put the certificate at risk for all hundred locations!

Where you draw the line while balancing these two imperatives depends a lot on your organizational culture. I've worked for one multi-site operation whose corporate ethos was entrepreneurial, and who had created a number of sites by acquisition. So there was a distinct chance that two sites might not be on the same page. For that company, each site was strictly responsible for its own certifications. The company had a blanket contract with a CB, because they got a volume discount. But inside that blanket contract, we were on our own. I made the arrangements for my location, but not for the others; I worked with the auditor and tracked the findings for my location, but not for the others. We never had a crisis where one location risked losing certification, but the company's management took no chances.

But then later I worked for another company that bundled eight sites across the United States all into one certificate. We supported the same product line, and we were all in the same geographic region. So the company decided that was enough commonality that it made sense for us to share a certificate, and we could sink or swim together. I made the arrangements for eight locations, tracked findings for eight locations, and flew around the country a fair bit to support audits when they happened. 

This meant I carried out eight internal audits a year, but not nearly so many external audits. We had a "matrix certificate," which meant that the CB did a sampling every year: they always audited headquarters, and they always audited our one factory, but then they would pick one or two other sites randomly and leave it at that. Over the years, of course, they ended up visiting every site; but every spring I had a long discussion with their scheduler to agree where they should go. 

"You visited A last year, and you visited B the year before, but you've never audited C. How about seeing them this year?"

"Wait, we've never been to C? What do you even do at that site?"

"It's basically the same work as B, so I don't expect any significant findings. But you haven't been there yet, so it might be good to visit."

"Sure, I guess. How many people work there ...?"       

As long as the processes and systems really are common across locations, this can be a useful way to proceed. Next week I'll say a little more about how matrix certificates work.

       

Five laws of administration

It's the last week of the year, so let's end on a light note. Here are five general principles that I've picked up from working ...