Thursday, December 9, 2021

Process fragility — or — "People Before Process" Part 3 of 3

In last week's post we saw that there are powerful reasons why companies build up their process systems, while the motives to build up their people are often less obvious or less urgent. But the week before, we saw that in the long run it looks more important for an organization to have the right people than to have the right process, because good people will improve a bad process, while bad people will degrade a good one. What does this mean? Is it just one more case where the easy and obvious motives line up in support of short-term benefits at the cost of long-term ones?

Maybe so, but in the full picture we can see other things as well. The first of these is that a reliance on process is fragile, while a reliance on competence is resilient. Let me tell you a story.

Once upon a time, I helped to support the Quality system in a small factory. The factory had run successfully, under one owner or another, for most of the 20th century; some people operating the lines had worked there all their lives and were nearing retirement age. Recently the factory had been acquired by a new owner, and part of the "post-merger integration" was to implement the new owner's QMS across the board. This meant, among other things, generating Control Plans for every factory operation — something that had never been done before. A couple of manufacturing engineers were assigned the task; they made a quick inventory of all the things the factory could do, listed the steps for each in an Excel table, and published the results as Control Plans.

Then one day it was time for our external surveillance audit. In order to audit section 8.5.1 of ISO 9001:2015, the auditor asked for a Control Plan. We offered him several, and he picked one that covered the plating bath. Then he walked out on the line to watch it in action. Right away he discovered that one of the vats was at the wrong temperature. The defined reaction in the Control Plan was, "Stop the line and call the manufacturing engineer," but the line was still running. Our auditor had been watching the process for less than five minutes, and — presto! — he found a Nonconformity.

When the day was over and the auditor had gone back to his hotel for the night, my boss and I walked out on the line to ask the operator what was going on. What was he thinking, to keep running the line when the temperature was significantly outside of range? He wasn't the least bit bothered. He explained that the plating reaction depended on both the temperature of the bath and its chemical composition. When he saw that the heater for one vat was malfunctioning, he changed the chemical composition of that specific stage of the bath to compensate. The final output would be indistinguishable; the customer would get exactly what they ordered, and there would be no need to delay this production order. And a good thing too, because he happened to know that the responsible manufacturing engineer was on vacation for another week yet. But he assured us it was all fine. The product would be correct, and the customer would be happy.

"All fine" is a matter of perspective, of course. My boss and I had to do a lot of talking to persuade the auditor to rate this Nonconformity as a Minor and not a Major. But from the customer's perspective it really was "all fine." The product that shipped to the customer really was going to be indistinguishable from one that had been made at the right temperature and with the defined chemical bath. This means two things:

  • From the perspective of the audit, the finding really should have been a Major Nonconformity, because the system was absolutely not working the way it was defined (on paper). The written Control Plan said that if anything was out of adjustment, the whole process should stop until the responsible manufacturing engineer could review the situation and instruct the operators what to do. (And that would have been another week, at least.)
  • But if the organization had followed the written Control Plan, the order would have been a week late — needlessly! In this particular case, the operator himself already knew exactly what to do because he was so deeply familiar with the process. Because the operator could rely on his own competence, work did not stop ... the order was not late ... and the customer was not disappointed. Because the operator could rely on his own competence, the organization could confront an unexpected problem and then roll with it — resiliently.

It still should have been a Major Nonconformity from the perspective of the audit. But probably the operator never even looked at the Control Plan. That should have been another Nonconformity, come to think of it.*

This is what I mean when I say that relying on process is fragile. No written process can possibly cover all situations that might arise, so every written process runs the risk that one day the organization will face a situation that the process does not address. When this happens, the process breaks down. But relying on competence is resilient, because a well-trained expert with deep knowledge of the process can figure out a response to any unexpected situation, with a high probability of getting it right.

Notice something else. The whole pattern of thought and planning that underlies modern industrial capitalism favors this fragile, process-based approach over the resilient, competence-based one. For consider:

  • On the one hand, someone who is simply trained to follow a process (and no more than that) is unprepared to solve problems or handle novel situations on his own. But he is a lot cheaper than the employee with wide experience and deep knowledge.
  • On the other hand, most of the time your organization shouldn't be facing problems or novel situations.
  • Therefore in principle you shouldn't need your line operators to have wide experience or deep knowledge. If you have one knowledgeable problem-solver for every ten ignorant line operators, that should give you plenty of coverage for the number of problems you are actually likely to face, and it's a lot more cost-effective than training everyone.
  • What's more, this arrangement means that your line operators are interchangeable human resources. You can move them wherever you need them in the organization. As long as they know how to follow procedures, you can use them to carry out any task that has been defined by a written procedure. And this gives you far more flexibility than you would have if they were tied to specific tasks because that's all they knew. This is what you want.

But look where this line of calculation takes us. By following the ordinary patterns of thought and planning that underlie modern industrial capitalism, we end up adopting a policy towards our people which has been proven to be very powerful, and which supports indefinite expansion; but this same policy makes our whole organization more fragile, and risks bringing us to our knees if something truly unusual happens.

How can this be? Is there something wrong with the theory?

Well yes, in a sense. Peter Drucker argued for years that our economy is no longer truly "capitalist" because Capital is no longer the most important factor of production. Capital is almost irrelevant these days, because it can be crowdfunded: either in a traditional manner, by issuing shares of stock; or in a contemporary manner, by launching a campaign on GoFundMe. The critical factor of production today, in Drucker's argument, is Knowledge; and the most critical member of any organization is the knowledge worker. (Drucker argued this point in many places, but see, for example, his Post-Capitalist Society (1993).) A knowledge worker is any employee whose unique value comes from the knowledge he carries in his head. And because that knowledge is always of something specific, knowledge workers are in general not interchangeable. (If you have too many quality auditors, it is typically not easy to repurpose some of them as accountants or design engineers.)

Note also that the story above about the plating bath shows that even line operators can be knowledge workers. As a result, the whole approach of treating line employees as interchangeable units starts to look misguided or (at best) out of date.

None of this is to deny that a process focus really is very powerful in the short run. But if anything happens to interrupt normal operations — if, ... oh I don't know, ... say a global pandemic throws the Designated Problem-Solvers out of the office at the same time that it disrupts all the organization's supply chains — then an organization that has relied on a process-focus will be in deep difficulties, while an organization that has built up the competence of all its employees will be able to roll with the changes and adapt.

This development is something that we in the Quality business need to understand and pay attention to. We've heard the message before: W. Edwards Deming insisted in his fourteen key principles on the need for training on the job (point 6), for breaking down barriers between functions (point 9), for pride of workmanship (point 12), and for a "vigorous program of education and self-improvement" (point 13). But we have yet to integrate these concepts into the "common sense" understanding that all Quality professionals carry around with them. We have yet to rewrite our standards, ISO 9001 among them, so they give as much attention to people as to processes.

We can do this. Once upon a time, we Quality professionals didn't all think in terms of statistical variation, but now we do. Once upon a time we didn't all think in terms of business processes, but now we do. We can absorb this change just like all the others. But we need to start.

__________

* The attentive reader will have noticed that I describe the very same action as resilient (as well as good for both the customers and the company) and a potential Major Nonconformity. How can it be both? Aren't audits supposed to improve the company's behavior? Or am I trying to criticize audits as counterproductive?

I'm not criticizing audits per se, but the usefulness of any audit depends critically on the usefulness of the management system documentation that you are auditing against. In this case, the root cause of the finding was the slapdash way that the company threw together their Control Plans, aiming to get something written so they could check a box rather than thinking through what the controls should really be. Since what the operator actually did to respond to the condition was correct, it should have been permitted as one option under a proper Control Plan. Or else perhaps the operator's deep knowledge of the process could have qualified him to be designated as a responsible "Manufacturing Engineer" for this particular production line.

In real life, the company analyzed the audit finding and realized that all their other Control Plans were probably just as bad. So they started over from the beginning and rewrote the lot of them more carefully. It was the best possible response to that finding, and I was glad that's what they chose.
