Thursday, March 21, 2024

What did the FAA find?

It's all very well to sit snugly behind a keyboard and criticize Boeing's safety culture (as I have done in a number of posts this spring, for example here and here). But how much of this is just talk, and how much is based on hard data? Has anyone done the hard work to sit down with Boeing and study their culture in detail? Maybe an exercise like that could tell us something useful.

In fact, a special Expert Panel completed just such a study last month. These experts were appointed by the Federal Aviation Administration (FAA) and began to meet a year ago, at the beginning of March 2023. They wrapped up their investigation in February 2024 after spending a full year on it. The team reviewed 7 surveys and more than 100 policies and procedures, comprising over 4000 pages. They interviewed more than 250 people across 6 locations. In the end they issued 27 findings and 53 recommendations. You can find the full report online here, and the New York Times has an article about it here.

The report is devastating. 

More exactly, it's written in the bland bureaucratic language that is mandatory for reports like this. There are no bold headlines screaming "J'Accuse!" But I have been auditing since 1996, and I cannot remember ever reading—much less writing!—a report about a fully functioning organization* that painted in such broad strokes a picture of a management system floating so loose from its moorings.

Background and summary

The Expert Panel was formed in accordance with the provisions of the 2020 Aircraft Certification, Safety, and Accountability Act (ACSAA), Pub. L. 116-260, Div. V, § 103, which requires review of organizations that hold an Organization Designation Authorization (ODA) from the FAA. An ODA is the arrangement by which the FAA authorizes certain Boeing employees to inspect Boeing's own work, on behalf of the FAA, so that the FAA does not have to assign its own people. The idea seems to be, at least in part, that a lot of inspections are mandated by airworthiness regulations, and if all of them had to be carried out by FAA personnel then the FAA's staff and budget would have to be significantly increased.

If you think it sounds crazy to ask a company to inspect its own work when there are serious safety risks at stake, … well, you can look up the text of the 2005 rule (70 FR 59932) establishing ODAs in the Federal Register; the "Background" section of that document explains how the idea grew incrementally over time as a way to cut down the long delays caused by airworthiness inspections. But the FAA still retains oversight of the whole process—naturally, right?—which is why the 2020 law referenced above requires all ODA holders explicitly "to adopt safety management systems (SMS) consistent with international standards and practices," and also directs the FAA "to review The Boeing Company’s ODA, safety culture, and capability to perform FAA-delegated functions." (Reference.)    

When the Expert Panel issued their report, they summarized their findings under four general headings:

  • Boeing's safety culture, where they found a "disconnect" between what they heard from senior management and what they heard from the rank and file;
  • Boeing's SMS, which was structured to reflect all the applicable standards perfectly but which appeared to have been glued on top of the organization with library paste;
  • Boeing's ODA management structure, which the Panel conceded had been recently reorganized to make it harder for the company to retaliate against an employee finding violations while acting in the name of the FAA (but "harder" still doesn't mean "impossible");
  • Other topics.

In the remainder of this post I will highlight and discuss some of the specific findings and other observations. (Sometimes I will indent my comments in blue, when I think it helps to distinguish my remarks from those of the Panel.)

Boeing's safety culture

The basic observation here is that Boeing has defined and rolled out a formal, written safety culture, but most employees don't really understand it. (Sec. 3.3) Concretely:

  • Many employees, when interviewed, didn't know about "Boeing's enterprise-wide safety culture efforts, nor its purpose and procedures." (Sec. 4.1, #1)
  • Even employees who knew the terminology of the safety culture couldn't use it in a sentence. (Sec. 4.1, #2)
  • Some Boeing sites have good, "confidential, non-punitive reporting systems" in place—but not all of them. (Sec. 4.1, #3)
  • Managers can investigate reports in their own reporting chain, which means they risk not being impartial. (Sec. 4.1, #4)
  • Employees don't know which reporting system to use for safety problems. Employees don't really trust any of the reporting systems, and prefer to report safety problems to their bosses. Employees especially don't trust the anonymity of the "preferred system." Employees do not (reliably) get informed of the outcome when they do report through these systems. (Sec. 4.1, #5)
    • My comment: When you first hear it, "reporting safety problems to your manager" doesn't sound like a bad idea. (Although naturally people who report problems should still hear back how they were dispositioned, or they'll start to think that reporting is a waste of time.) The reason that "reporting safety problems to your manager" can become a problem is that ….
  • When employees report safety problems to their managers, it's often done verbally. So there is no way to know if any particular problem ever made it into the reporting system. And if a problem didn't get into the system, there's no way to track whether it was ever analyzed or fixed. (Sec. 4.1, #6) 

Boeing's SMS

[Image caption: Grigory Potemkin, architect of the system?]

The Panel makes a number of high-level observations about Boeing's SMS, before diving into the details. Among these observations are the following:

  • All the SMS documents are new, and there is no traceability to the changes from what came before. (Sec. 3.4, para. 4)
  • Most of the SMS documents cover general conduct and do not translate to the concrete working level. (Sec. 3.4, para. 5)
  • Many employees don't really understand the elements of the SMS, or else they think it is a management fad that won't stick around. (Sec. 3.4, para. 10) 
  • Many employees point out that Boeing already had a detailed safety system before the SMS was implemented—so why do we need this new one now? (In fact the old system is still referenced in many procedure documents.) (Sec. 3.4, para. 11)
  • Boeing requires employees to take safety training classes, but doesn't test whether they learned anything. (Sec. 3.4, para. 13)

In other words, the Panel says that Boeing's shiny new SMS, which complies perfectly with all the relevant requirements and standards, is a Potemkin system.

After those general observations, the specific findings might be an anticlimax, but here are a few of them:

  • The complexity of the SMS documentation, and "the constant state of document changes," make it hard for employees to understand it. (Sec. 4.2, #10)
  • Boeing uses an SMS dashboard to track safety goals, but employees (and some managers) don't understand what it is or how to use it. (Sec. 4.2, #12)
  • There are different tracking systems for the SMS and for the legacy safety systems, and many people are confused by them. (Sec. 4.2, #12, cont'd.)
  • Since Boeing has kept all the legacy safety systems in place, employees across the company don't trust that the new SMS will last long. (Sec. 4.2, #13)
  • Boeing has procedures on how to evaluate safety-relevant decisions, but there's nothing to explain how to tell which business decisions count as safety-relevant. (Sec. 4.2, #14)

In other words, employees don't understand the SMS and they have no motivation to learn it.

Boeing's ODA management structure

The Panel's general observation about the ODA program is that it is getting harder to staff, because participating inspectors (called Unit Members, or UMs) are retiring faster than new ones are being brought on board. (Sec. 3.5, paras. 4-6; sec. 4.3, #18)

But the detailed findings have to do mostly with the risk that UMs could fear retaliation for speaking out about problems:

  • Boeing has not eliminated the possibility of retaliation when UMs raise safety concerns, and some UMs have experienced what looks like retaliation. Other UMs are not willing to help or step in, or their help is rejected as interference. (Sec. 4.3, #16)
  • Boeing says they took steps to make sure the ODA program is working correctly, but cannot provide proof. (Sec. 4.3, #17)
    • In which case, did they really do anything?
  • Supposedly Boeing has changed the ODA organizational structure, but nobody knows how. Employees still report to their old managers. Procedures are still written around the old structure. (Sec. 4.3, #19) 

There are some other smaller findings as well.

Other topics

Of the findings classified as "Other matters," the two that concern me the most state (in different ways) that input from pilots is treated inconsistently: if it comes into Executive A, it is treated seriously and addressed; but if it comes into Executive B, it might get lost or forgotten. (Sec. 4.4, #23 and #24) Less alarming are some technical points about how to handle the relationship between Boeing and the FAA in the future.

But a couple of the other general observations are worth noting.

Right at the beginning, Boeing welcomed the Panel and made sure to say that they looked forward to open collaboration. But the Panel says that in fact, Boeing answered questions rather as if the evaluation were an audit or a deposition, and asked for no input of any kind. (Sec. 2.6, paras. 12-13; sec. 3.2, para. 1)

So I have to ask: did Boeing expect to learn anything from this evaluation? Or was the intent simply to get through it as fast as possible, with as few findings as possible? Because clearly, if you approach the whole exercise in a defensive frame of mind, you leave open fewer chances to learn and improve from the experience.

Also interesting: the Board of Directors emphasized that they use safety-related performance metrics "when determining both Annual Incentive Pay and Long-Term Incentives." These metrics include, for example, "the requirement for executives to complete Boeing's Safety Management System training." This statement was intended to demonstrate Boeing's commitment to safety. (Sec. 3.7, paras. 5, 9, and 12)

The problem is, I think it demonstrates the reverse. Safety metrics in the bonus program? No! On the contrary, safety should be more important than any bonus program! Ironically, when you pay people for something, you cheapen it. At that point people start weighing one part of the bonus against another: Let's see, if I'm willing to give up a few dollars on safety, we can sell a lot more planes and by the end of the year the difference will more than make up for what I lost. Dollarizing the safety program is irresponsible if not worse. Safety should be non-negotiable, and paying people for it makes it negotiable. (I discuss this point in more detail in this post here.)

On the other hand, I understand why the Board of Directors would take this approach. To the man with a hammer, every problem looks like a nail. And it does seem like, in the last couple of decades, money is the hammer that Boeing's management has learned how to use. 



It's a long report. But I think it explains why Boeing has gotten into its present straits. From my point of view, the fundamental problems are all around system implementation. Boeing tried to create a new system, but went for the quick-n-easy approach rather than making sure the new system was fully implemented and integrated at all levels in the organization. As a result, people don't know what to do! Even people who want to do the right thing—and I firmly believe that this includes nearly everyone, nearly all the time—don't know how to do the right thing so that errors get caught, followed up, and fixed … and so that they themselves don't get in trouble for finding those errors in the first place.

Too much system can be as much a problem as not enough system. There's a balance and it always has to be pragmatic. I may have said this once or twice before now. 

__________

* I have participated in audits that were meant as gap analyses, for organizations that wanted ISO 9001 certification and knew they weren't ready yet; and the results of those were often far worse than this one. But it was no surprise because the organizations knew in advance they had a lot of work to do.

                
