Causal Analytics for Applied Risk Analysis Louis Anthony Cox, Jr


DECLG Department of the Environment Community and Local Government, March 9, 2012. New Smoky Coal Ban Regulations will bring Cleaner Air, Fewer Deaths and can help efficiency,31034,en.htm. Last Retrieved 1 February 2014.
Delquié P, Cillo A. 2006. Disappointment without prior expectation: a unifying perspective on decision under risk. Journal of Risk and Uncertainty Dec; 33 (3): 197–215. doi:10.1007/s11166-006-0499-4.
Djulbegovic B, Kumar A, Magazin A, Schroen AT, Soares H, Hozo I, Clarke M, Sargent D, Schell MJ. 2011. Optimism bias leads to inconclusive results-an empirical study. J Clin Epidemiol. Jun;64(6):583-93. doi: 10.1016/j.jclinepi.2010.09.007.
EPA, 2011a. The Benefits and Costs of the Clean Air Act from 1990 to 2020: Summary Report. U.S. EPA, Office of Air and Radiation. Washington, D.C.
EPA, 2011b. The Benefits and Costs of the Clean Air Act from 1990 to 2020. Full Report. U.S. EPA, Office of Air and Radiation. Washington, D.C.
Epper T., Fehr-Duda H. 2014. The missing link: Unifying risk taking and time discounting. Last Retrieved 14 February 2014.
Feldman AM.  2004. Kaldor-Hicks compensation.   In P. Newman (Ed), The New Palgrave Dictionary of Economics and the Law. Volume 2, E-O.  417-412.
Gan HK, You B, Pond GR, Chen EX. 2012. Assumptions of expected benefits in randomized phase III trials evaluating systemic treatments for cancer. J Natl Cancer Inst. Apr 18;104(8):590-8. doi: 10.1093/jnci/djs141.
Gardner, D. (2009). The Science of Fear: How the Culture of Fear Manipulates Your Brain. Penguin Group. New York, New York.

Gelfand S. Clinical equipoise: actual or hypothetical disagreement? J Med Philos. 2013 Dec;38(6):590-604. doi: 10.1093/jmp/jht023. Epub 2013 Jul 22.

Gilboa I, Schmeidler D. 1989. Maxmin expected utility with a non-unique prior. Journal of Mathematical Economics, 18:141–153.
Goodman SN. Of P-values and Bayes: a modest proposal. Epidemiology. 2001 May;12(3):295-7. No abstract available.
Graham DA. 1981. Cost-benefit analysis under uncertainty. The American Economic Review 71(4): Sep., 715-725 Last Retrieved 1 February 2014.
Grossman PZ, Cearley RW, Cole DH. 2006. Uncertainty, insurance, and the Learned Hand formula. Law, Probability and Risk 5(1):1-18
Harford T. (2011) Adapt: Why Success Always Starts with Failure. Farrar, Straus and Giroux. New York, New York.

Hart A. 2005. Adaptive heuristics. Econometrica, 73(5):1401-1430.

Harvard School of Public Health, 2002. Press Release: “Ban On Coal Burning in Dublin Cleans the Air and Reduces Death Rates”

Hazan E, Kale S. 2007. Computational Equivalence of Fixed Points and No Regret Algorithms, and Convergence to Equilibria. Advances in Neural Information Processing Systems 20: 625-632

Health Effects Institute (HEI). 2013. Did the Irish Coal Bans Improve Air Quality and Health? HEI Update, Summer, 2013. Last Retrieved 1 February 2014.

Hershey JC, Kunreuther HC, Schoemaker PJH. 1982. Sources of Bias in Assessment Procedures for Utility Functions. Aug Management Science 28(8): 936-954.

Hoy M, Peter R, Richter A. 2014. Take-up for Genetic Tests and Ambiguity. Journal of Risk and Uncertainty, 48 (forthcoming)

Hylland A, Zeckhauser RJ. 1979. The impossibility of Bayesian group decision making with separate aggregation of beliefs and values. Econometrica Nov; 47(6): 1321-1336.
Ioannidis JPA. Why most published research findings are false. 2005. PLoS Med 2(8): e124. doi:10.1371/journal.pmed.0020124

Jaksch T, Ortner R, Auer P. Near-optimal regret bounds for reinforcement learning. Journal of Machine Learning Research. 11(2010): 1563-1600.

Josephs RA, Larrick RP, Steele CM, Nisbett RE. Protecting the self from the negative consequences of risky decisions. J Pers Soc Psychol. 1992 Jan;62(1):26-37.

Kahneman D. Thinking Fast and Slow. 2011. Farrar, Straus, and Giroux. New York, New York.

Kahneman D, Frederick S. 2005. A model of heuristic judgment. In K.J. Holyoak & R.G. Morrison

(Eds.), The Cambridge handbook of thinking and reasoning (pp. 267-293). New York: Cambridge

University Press.
Kahneman D, Knetsch JL, Thaler RH. 1991. Anomalies: The Endowment Effect, Loss Aversion, and Status Quo Bias. The Journal of Economic Perspectives, 5(1), pp. 193-206, Winter 1991
Kahneman D, Tversky A. 1984. Choices, values and frames. American Psychologist, Apr. 39:341-350.
Kahneman D, Tversky A 1979. Intuitive prediction: biases and corrective procedures. TIMS Studies in Management Science 12: 313–327.
Keren G, Gerritsen LEM. 1999. On the robustness and possible accounts of ambiguity aversion. Acta Psychologica 103(1-2): 149-172.
Kralik JD, Xu ER, Knight EJ, Khan SA, Levine WJ. 2012. When less is more: Evolutionary origins of the affect heuristic. PLoS ONE 7(10): e46240. doi:10.1371/journal.pone.0046240
Lehrer J. Trials and errors: Why science is failing us. Wired. January 28, 2012.

Li J, Daw ND. Signals in human striatum are appropriate for policy update rather than value prediction. J Neurosci. 2011 Apr 6;31(14):5504-11. doi: 10.1523/JNEUROSCI.6316-10.2011.

Louis, P. 2009. “Learning aversion and voting rules in collective decision making,” mimeo, Universitat Autonoma de Barcelona.
Loomes G, Sugden R. 1982. Regret theory: an alternative theory of rational choice under uncertainty. The Economic Journal Dec 92(368): 805-824
Man PTY, Takayama S. 2013. A unifying impossibility theorem. Economic Theory 54(2):249-271.
Maccheronia F, Marinacci M, Rustichini A. 2006. Ambiguity aversion, robustness, and the variational representation of preferences. Econometrica. Dec. 74(6): 1447-1498.
Matheny ME, Normand SL, Gross TP, Marinac-Dabic D, Loyo-Berrios N, Vidi VD, Donnelly S, Resnic FS.Evaluation of an automated safety surveillance system using risk adjusted sequential probability ratio testing. BMC Med Inform Decis Mak. 2011 Dec 14;11:75. doi: 10.1186/1472-6947-11-75.
Mueller DC. Public Choice III. 2003. Cambridge University Press. New York.
Navarro AD, Fantino E. The sunk cost effect in pigeons and humans. J Exp Anal Behav. 2005 Jan;83(1):1-13.
Nehring K. The Impossibility of a Paretian Rational: A Bayesian Perspective, Economics Letters. 96(1):45-50. 2007.
Newby-Clark IR, Ross M, Buehler R, Koehler DJ, Griffin D. 2000. People focus on optimistic scenarios and disregard pessimistic scenarios while predicting task completion times. J Exp Psychol Appl. Sep; 6(3):171-82.
Nuzzo R. 2014. Scientific method: Statistical errors. P values, the 'gold standard' of statistical validity, are not as reliable as many scientists assume. Nature 506, 150–152 doi:10.1038/506150a
Othman A, Sandholm T. (2009) How pervasive is the Myerson-Satterthwaite impossibility? Proceedings of the 21st International Joint Conference on Artifical Intelligence, IJCAI'09: 233-238. Morgan Kaufmann Publishers Inc. San Francisco, CA, USA
Pelucchi C, Negri E, Gallus S, Boffetta P, Tramacere I, La Vecchia C. 2009. Long-term particulate matter exposure and mortality: a review of European epidemiological studies. BMC Public Health. Dec 8;9:453.
Politi MC, Clark MA, Ombao H, Dizon D, Elwyn G. Communicating uncertainty can lead to less decision satisfaction: a necessary cost of involving patients in shared decision making? Health Expect. 2011 Mar;14(1):84-91. doi: 10.1111/j.1369-7625.2010.00626.x. Epub 2010 Sep 23.
Portney PR. 2008. Benefit-Cost Analysis. In Henderson DR (Ed.), The Concise Encyclopedia of Economics. Library of Economics and Liberty. Last Retrieved 1 February 2014.
Poundstone, W. 2010. Priceless: The Myth of Fair Value (and How to Take Advantage of It). Scribe Publications.
Pham MT, Avnet T. 2009. Contingent reliance on the affect heuristic as a function of regulatory focus. Organizational Behavior and Human Decision Processes 108: 267–278. Last Retrieved 14 February 2014.
Prelec D, Loewenstein GF. 1991. Decision Making over Time and under Uncertainty: A Common Approach. Management Science, 37(7):770–786
Robards M, Sunehag P. 2011. Near-optimal on-policy control.
Rothman KJ (1990) No adjustments are needed for multiple comparisons. Epidemiology 1: 43–46.
Russo JE, Schoemaker PJH. 1989. Decision Traps: Ten Barriers to Brilliant Decision-Making and How to Overcome Them. Doubleday. New York.
Russo JE, Schoemaker PJH. 1992. Managing overconfidence. Winter. Sloan Management Review 33(2):7-17.
Schönberg T, Daw ND, Joel D, O'Doherty JP. Reinforcement learning signals in the human striatum distinguish learners from nonlearners during reward-based decision making. J Neurosci. 2007 Nov 21;27(47):12860-7.
Saito K. 2011a. A relationship between risk and time preferences.
Saito K. 2011b. Strotz Meets Allais: Diminishing Impatience and the Certainty Effect: Comment. The American Economic Review 101(5): 2271-2275
Sarewitz D. Beware the creeping cracks of bias. Nature. 10 May 2012 485:149
Slovic P, Finucane M, Peters E, MacGregor, DG. 2002. Rational actors or rational fools: implications of the affect heuristic for behavioral economics. The Journal of Socioeconomics, Volume 31, Number 4, 2002 , pp. 329-342(14)
Slovic P, Finucane M, Peters E, MacGregor D. 2004. Risk as analysis and risk as feelings: some thoughts about affect, reason, risk, and rationality. Risk Analysis 24 (2): 311–322.
Smith JE, von Winterfeldt D. Decision Analysis in "Management Science." Management Science 2004 May;50(5):561-574.
Stokey NL. 2009. The Economics of Inaction: Stochastic Control Models with Fixed Costs. Princeton University Press. Princeton, NJ.
Sunstein C. 2005. Moral heuristics. Behavioral and Brain Sciences 28: 531–573
Thaler RH. 1999. Mental accounting matters. Journal of Behavioral Decision Making 12: 183-206
Treasury Board of Canada Secretariat. 1988. Benefit-Cost Analysis Guide (Draft). Last Retrieved 1 February 2014.

Tversky A, Thaler RH. 1990. Anomalies: Preference Reversals. Journal of Economic Perspectives 4(2): 201-211.
Wittmaack K. The big ban on bituminous coal sales revisited: serious epidemics and pronounced trends feign excess mortality previously attributed to heavy black-smoke exposure. Inhal Toxicol. 2007 Apr;19(4):343-50.
Yu JY, Mannor S, Shimkin N. (2009). Markov decision processes with arbitrary reward processes. Mathematics of Operations Research August 34(3):737-757

Chapter 13

Improving Risk Management: From Lame Excuses to Principled Practice
Three classic pillars of risk analysis are risk assessment (how big is the risk and how sure can we be?), risk management (what shall we do about it?), and risk communication (what shall we say about it, to whom, when, and how?). Chapter 1 proposed two complements to these: risk attribution (who or what addressable conditions actually caused an accident or loss?) and learning from experience about risk reduction (what works, how well, and how sure can we be?). Failures in complex systems usually evoke blame, often with insufficient attention to root causes of failure, including some aspects of the situation, design decisions, or social norms and culture. Focusing on blame, however, can inhibit effective learning, instead eliciting excuses to deflect attention and perceived culpability. Productive understanding of what went wrong, and how to do better, thus requires moving past recrimination and excuses.

This chapter, which is a slight update of Paté‐Cornell and Cox (2014), identifies common blame-shifting “lame excuses” for poor risk management. These generally contribute little to effective improvements and may leave real risks and preventable causes unaddressed. We propose principles from risk and decision sciences and organizational design to improve results. These start with organizational leadership. More specifically, they include: deliberate testing and learning – especially from near-misses and accident precursors; careful causal analysis of accidents; risk quantification; candid expression of uncertainties about costs and benefits of risk-reduction options; optimization of trade-offs between gathering additional information and immediate action; promotion of safety culture; and mindful allocation of people, responsibilities, and resources to reduce risks. Together, these principles provide sound foundations for successful risk management.

Why do catastrophes happen? Bad luck is rarely the whole answer
Even excellent risk management decisions usually do not reduce risk to zero. However, failures of large systems and operations, such as the Challenger space shuttle disaster in 1986, the core breaches at the Fukushima Daiichi reactors in 2011, or the fire at the BP Mobile Drilling Unit Deepwater Horizon in 2010, are often rooted in flawed decision-making at high levels of the organization, with disregard or poor use of available information and of the actual effects of the incentive structure (see Chapter 12; Paté-Cornell, 1990; Murphy and Paté-Cornell, 1996). Human, organizational, and management factors that predispose to human errors and operational failures can and should be addressed explicitly in a systems analysis to support risk management decisions and reduce accident risks. These decisions, made in the design, manufacturing, construction, and operation of complex systems, affect their longevity and failure risk as well as their daily performance and productivity. Too often, however, catastrophic failures lead to a tight focus, after the fact, on assigning blame, and to expensive litigation over who knew (or should have known) what and when, and who should have done things differently. People who designed the decision rules or operating practices attract blame and tend to be replaced (Harford, 2011). Although hindsight bias may add asperity to the prosecution, the defense frequently finds that questionable (or ridiculous) arguments have been advanced for why things were done as they were, often in an effort to deflect blame from those who actually made the basic decisions.

Post hoc assignment of blame is prominent in our culture and in our justice system. It provides material for daily news and discussion: political spin on who is to blame for flooding and power outages in the wake of Hurricane Sandy; decisions about whether scientists should be jailed in Italy for opining, following small tremors, that earthquake risks in L’Aquila were modest if not negligible right before a lethal one struck (International Seismic Safety Organization, 2012); and debate and recriminations over why Ambassador Stevens’ calls for help did not prompt additional actions to save his life in Benghazi. More generally, the question is why precursors and near-misses were systematically ignored or misread, as they were, for instance, during drilling of the Macondo well (NRC, 2012; NAE, 2004).

Psychologists, however, have convincingly documented that both prospective risk assessment and retrospective blame assignment are often flawed by various heuristics and biases (e.g., Kahneman, 2011), as discussed in Chapter 12. In assessing risky prospects and projects, several classic fallacies have been identified in the literature. People systematically tend to over-estimate benefits and under-estimate costs (the planning fallacy)3. Optimistic bias leads some to accept risky prospects that they might reject if their expectations and perceptions were more accurate (illusion of control and overconfidence). In retrospect, they may believe that whatever happened was inevitable (hindsight bias). They may prefer to continue with systems and assumptions in which large investments have already been made, rather than acknowledging, in light of new experience, that these are flawed and should be updated (sunk-cost bias). They may also blame bad luck and other people for undesired outcomes while attributing success to their own skill and efforts (self-serving bias), and altogether distort their understanding of what happened and why. Yet, identifying mistakes and their fundamental causes after a failure or a near-miss is key to learning effectively about what went wrong and how to do better in the future (Paté-Cornell, 2009). Therefore, to learn from costly experience how to improve risk management, it is essential to do realistic post-mortems, and not to let the opportunities for learning dissipate in a cloud of evasion and misdirection. Accordingly, this chapter focuses on a set of well-worn “lame excuses” often advanced to justify the decisions and behaviors that preceded catastrophic failures of complex systems. It then proposes some principles to improve risk assessment and management by cutting through these excuses to identify needed changes in design, operations, and culture.

“It’s not our fault”: Some common excuses for bad risk management
Many arguments advanced to deflect the blame for conspicuous failures are based on claims of unpredictability: “it was a black swan,” i.e., of extremely low probability and not reasonably foreseeable; “it was a perfect storm,” i.e., an extremely rare conjunction of conditions; and so forth (Paté-Cornell, 2012). More generally, they are attempts to put the blame elsewhere, claiming that other people, nature, or even supernatural influences (bad luck or an Act of God) were responsible. Excuses in this category are seldom justified. The failure might well be rooted, for example, in flawed procedures of the kind that the theory of principal-agent problems predicts when monitoring is poor (Garber and Paté-Cornell, 2012).

  • “It was not our job: we were not paid for it”, or “it was not our responsibility to report the deterioration of the structure or design errors that threaten its integrity.”

This kind of reasoning can be attributed to poor safety culture and the detachment of the individuals from the proper functioning of an organization or a system. Misplaced faith in the status quo encourages us to accept how things are and how work gets done until something breaks badly enough to force us to recognize that we should have made changes sooner (Hammond et al., 1998).

  • “It was an act of God”, implying that natural forces were involved that are uncontrollable and therefore the blame cannot be put on any human being.

Of course, this is a fallacy if human choices create the exposure to the risk and determine a system’s capacity to survive naturally occurring stresses. Some of the initial claims regarding the accident at the Fukushima Daiichi nuclear power plant belong in that category. Earthquakes of magnitude greater than 8 (and the large tidal waves that came with them) had occurred several times in recorded history (Epstein, 2011), but the siting of the reactor and the initial choice of a 5.7-meter tsunami design criterion were human decisions.

    • “It was a black swan” meaning that it was both unprecedented and unimaginable before the fact.

This one has become a favorite excuse since Taleb’s 2007 book comparing unimaginable (or, at least, unanticipated) events to the discovery of black swans in the 17th century by Dutch sailors who, until then, had seen only white ones (Taleb, 2007). Intended by the author to explain financial crises that only a few had seen coming, it has become a much-used descriptor for events such as the attack on the US on 9/11/2001. Yet, an attack on the World Trade Center had occurred in 1993, and FBI agents had detected suspicious flight training in the preceding months4. With preparation, vigilance, and effort, much more is reasonably foreseeable than might be expected (Russo and Schoemaker, 1990).

Deliberate exercises in applying “prospective hindsight” (i.e., assuming that a failure will occur in the future, and envisioning scenarios of how it could happen) and probabilistic analysis using systems analysis, event trees, fault trees and simulation can be used to overcome common psychological biases that profoundly limit our foresight (Russo and Schoemaker, 1990). These include anchoring and availability biases (Kahneman, 2011; Kahneman and Tversky, 1974), confirmation bias, group-think (Janis, 1984), status quo bias, or endowment effects (Russo and Schoemaker, 1990; Hammond et al., 1998). In the case of 9/11, for example, a similar terrorist attack had been mounted against an Air France flight in 1994 but thwarted by French security forces before it reached Paris. Yet, the experience had faded in the status quo of 2001.
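The fault-tree logic mentioned above can be quantified mechanically once basic-event probabilities are estimated. The sketch below is purely illustrative: the gate structure and all basic-event probabilities are assumed, not taken from any actual system.

```python
# Hypothetical two-branch fault tree: the top event occurs if either
# AND-gated subsystem fails. All numbers are assumed for illustration.

def and_gate(*probs):
    """Probability that all independent input events occur."""
    p = 1.0
    for q in probs:
        p *= q
    return p

def or_gate(*probs):
    """Probability that at least one independent input event occurs."""
    p = 1.0
    for q in probs:
        p *= (1.0 - q)
    return 1.0 - p

# Assumed basic-event probabilities
p_sensor_fail  = 1e-3
p_backup_fail  = 1e-2
p_operator_err = 5e-3
p_alarm_fail   = 2e-2

# Detection fails only if the sensor AND its backup both fail.
p_detection_fail = and_gate(p_sensor_fail, p_backup_fail)
# Response fails if the operator errs AND the alarm fails to catch it.
p_response_fail = and_gate(p_operator_err, p_alarm_fail)

# Top event: either subsystem failure suffices.
p_top = or_gate(p_detection_fail, p_response_fail)
print(f"P(top event) = {p_top:.2e}")  # about 1.1e-04
```

The same gate functions extend directly to larger trees, and replacing the point probabilities with sampled values turns the calculation into a simple uncertainty simulation.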

    • “It was a perfect storm”, i.e., it required such a rare conjunction of unlikely events that it seemed that it could be safely ignored.

This phrase became popular after the publication of a book and the release of a movie describing a convergence of several storms in the North Atlantic in 1991, in which a ship sank and the crew perished. Such conjunctions of unusual conditions, although individually rare, are collectively quite common (Paté-Cornell, 2012); but their probabilities are often underestimated because dependencies are unrecognized or misunderstood. In technical systems, these conjunctions sometimes come from common causes of failure such as external loads (e.g., extreme winds) or human errors that affect the whole system (e.g., flawed maintenance of all engines of an aircraft). They happen regularly in the economic sector and in supply chains, for instance, if difficult market situations and lean inventories are compounded by a natural catastrophe (independent events in this case). They are even more likely to occur in the financial industry or other tightly coupled organizations where the failure of one institution may have devastating effects on related ones. This may occur, for instance, because the failure factors are economically and statistically dependent (risk “contagion”), or because psychological reactions to one event are likely to cause further failures (bank runs).
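The arithmetic behind an underestimated “perfect storm” is easy to illustrate. In the sketch below, two failure events each have a 5% marginal probability; the numbers, including the 2% common-cause probability, are assumed for illustration only.

```python
# Two events, each with 5% marginal probability, that share a common
# cause (e.g., an extreme external load). All numbers are assumed.

p_common = 0.02     # probability of the shared shock
p_marginal = 0.05   # each event's overall (marginal) probability

# Residual independent probability chosen so the marginal stays at 5%:
# P(event) = 1 - (1 - p_common) * (1 - p_resid)
p_resid = 1 - (1 - p_marginal) / (1 - p_common)

# Joint probability if (wrongly) assumed independent:
p_joint_indep = p_marginal ** 2

# Joint probability under the common-cause model: both occur if the
# shared shock occurs, or if both residual causes occur independently.
p_joint_dep = p_common + (1 - p_common) * p_resid ** 2

print(f"assumed independent: {p_joint_indep:.4f}")  # 0.0025
print(f"with common cause:   {p_joint_dep:.4f}")    # 0.0209
```

With these assumed numbers the common cause makes the “perfect storm” roughly eight times more likely than the independence assumption suggests, even though each event’s marginal probability is unchanged.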

    • “It never did that before” (e.g., “the system had not blown up yet in previous near-misses, so, we thought that we were fine”).

Ignoring near-misses and considering them successes is a classic reaction. Indeed, responding effectively to prevent near-misses from developing into full-blown catastrophes may reflect competence in managing hazardous situations and be a justifiable source of pride and confidence5. Yet, interpreting near-misses as justification for complacency can make one miss potentially valuable lessons. This was the case, for example, with tire blowouts on the SST Concorde, which had happened 57 times before a piece of rubber from one of them punctured the fuel tank and caused the death of everyone on board in July 2000 (BEA, 2001)6.
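A simple Bayesian calculation shows why a run of benign near-misses is weak evidence of safety. Assuming (purely for illustration) a uniform Beta(1,1) prior on the probability p that any one blowout escalates to catastrophe, 57 non-catastrophic blowouts still leave a substantial posterior probability that p exceeds 1%:

```python
# Beta-binomial update from near-miss data; uniform prior assumed.
n_near_misses = 57    # blowouts that did not escalate
n_catastrophes = 0

# Posterior over p is Beta(1 + 0, 1 + 57).
alpha, beta = 1 + n_catastrophes, 1 + n_near_misses
posterior_mean = alpha / (alpha + beta)

# For alpha == 1 the Beta survival function is (1 - x)^(alpha + beta - 1),
# so P(p > 1%) has a closed form:
p_exceeds_1pct = (1 - 0.01) ** (alpha + beta - 1)

print(f"posterior mean of p: {posterior_mean:.4f}")  # ~0.0169
print(f"P(p > 1%):           {p_exceeds_1pct:.2f}")  # ~0.56
```

Under these assumptions, 57 benign blowouts leave a posterior mean escalation probability near 1.7% per event and better-than-even odds that it exceeds 1%: the near-misses were a warning, not a clean bill of health.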

    • “Those in charge did not know, and could not reasonably have known, what was about to happen”.

Excusable ignorance is a common plea in organizations that fail to pass messages, especially bad news, from the front lines to decision makers higher in the hierarchy. Indeed, “plausible deniability” is sometimes sought as a way to deflect responsibilities from top decision makers. Clearly, many signals are gathered every day and organizations need some filters to function effectively (Lakats and Paté-Cornell, 2004). Incentives sometimes are such that an agent can rationally take shortcuts to meet a resource constraint rather than bringing the problem to the attention of the principal (Paté-Cornell and Garber, 2012). Ineffective elicitation and use of the information known to the members of an organization is both common and costly when the objective should be to align the goals of the employees and those of the organization. Therefore, changing incentives and procedures of deliberations and decisions can do much to elicit, exploit, and reward information that might otherwise remain hidden (Russo and Schoemaker, 1989).

    • “We did not know that things had changed so much”

Status quo bias lulls us into assuming that things will remain as they are: the environment will not change and our system will remain what it is. This is seldom true. A natural tendency is to fail to monitor the situation and its evolution (including, for instance, markets, competitors, and employees’ performance) and to disregard or misinterpret signals that a new situation is looming (Russo and Schoemaker, 1989). Monitoring this changing business environment is at the core of enterprise risk management and critical in these times of globalization and rapid emergence of new technologies.

    • “It was permitted and convenient, so we did it.”

This was the case, for instance, with the design of the Piper Alpha platform, where, against common sense, the control rooms had been located above the production modules (Paté-Cornell, 1993) for the convenience of operators, who could easily go from one to the other. Yet, an explosion in the production modules could (and eventually did) destroy any possibility of controlling the situation.

    • “We took every reasonable precaution and followed standard operating procedure.”

This would be a convincing excuse if the precautions and standard operating procedures were effectively applied to the situations for which they were intended. Yet, as potential catastrophes begin to unfold, the system, the environment, or the situation may change in ways that make the standard procedures and precautions inadequate. Blind rule-following may be disastrous if the assumptions behind the rules no longer hold. Deterioration, for example, affects automobile safety as well as that of airplanes, especially when maintenance on schedule fails to address some obvious parts such as the fuselage of an aircraft7 and when there is not sufficient latitude for maintenance on demand. A quick shift from standard to crisis operation mode, and creative improvisation to respond to the new and unforeseen situation, may then be essential to avert a disaster (Harford, 2011).

    • “Everybody does it”

The everybody-does-it defense, commonly used by teenagers, implicitly assumes that it is no one’s obligation to examine current status quo practices and their implications, especially in a changing environment (Hammond et al., 1998), and that imitating others justifies one’s actions. Heedless imitation and herd-following behavior can, of course, multiply the consequences of failure if tight technological couplings make imitation easy and reflection more difficult. This is the case, for instance, with computer reactions to financial market situations, where automatic trading platforms and rules amplify the effects of initially minor price fluctuations. Similarly, destructive memes of harmful behaviors (whether teenage hyperventilation, recreational drug use, or copycat crimes) can spread rapidly through social media. Imitative learning is a powerful force of social coherence, but thoughtless imitation, as well as following orders without questioning their ethics and consequences, can destabilize systems, spread destructive habits, and amplify risks.

    • “All signals indicated that we were doing fine.”

Good test results may be falsely reassuring if the tests were performed in the wrong environment, or the sample size was too small. Over-confidence in biomedical circles has become so prevalent that commentators are starting to worry that science is failing us (Lehrer, 2012). For lack of operating experience, engineering and physical models as well as expert opinions may be needed to complete the picture. Besides, “we used only the best/most credible/reliable results” may reflect a self-serving selection bias8.

    • “But everyone agreed that it seemed a good idea at the time!”

A common reaction to failure is that everyone agreed beforehand to the course of action. “Our best experts approved, our reviewers and stakeholders (Congress, clients, funders, public, etc.) loved our analysis, our models were detailed and coherent, the results were perfectly clear; it all seemed to make sense”. In reality, such consensus may reflect “groupthink” and mutual influence among the players (Janis, 1984). That clients receive analytical results that fit their interest may be the result of the incentives or the information that they have given to the analysts. These results are thus directly affected by motivated reasoning, confirmation and interpretation bias, and premature closure.

In practice, full validation of a risk analysis model in the traditional statistical sense may be impossible for new systems or systems that have changed because the statistics at the system level are not yet available. Yet, these models can be justified based on what is known of the functions and dependencies of the various components, and of the external factors (loads, environment) that affect their robustness.

  • “It was an operator error”

Blaming the personnel in charge at the lower level is sometimes convenient to an organization. For instance, more than 60% of recent airline accidents have been blamed on pilot errors (FAA, 2013). Yet, in some cases, a flawed system design may be the root cause of the accident and must be corrected to tolerate some pilot mistakes before they cause a crash. In other cases, the pilots may not have been sufficiently trained, for instance, to understand the signals that they get from electronic monitors or to operate in the absence of these signals. Similarly, in the medical field, some accidents are directly caused by residents who do not have sufficient experience; but the true responsibility may lie with the supervisors if they are 15 minutes away from the operating room when they should be accessible in two minutes. More generally, managers sometimes blame the operators for their own failures when the leadership did not provide the training, the supervision, the information, or the incentives required to face critical situations.

Foundations for Better Risk Management
Valuable lessons about risk reduction can be derived from these accidents and from the excuses that are invoked after the fact. This section proposes some constructive foundations for improved risk management practices. They are selected from the management science and risk management literatures and reinforced by the cases described earlier. As witnessed by the extensive literatures on improving organizational decision-making (Russo and Schoemaker, 1989) and building high-reliability organizations (Weick and Sutcliffe, 2007), we are far from the first to suggest principles and practices for overcoming the foregoing limitations. But we believe that the following recommendations, which emphasize causal understanding, quantitative methods, and deliberate design for learning, can add value to earlier improvement efforts.
Understand the Causes of the Hazard, then its Potential Effects
“Acts of God” such as earthquakes, tsunamis or hurricanes often have a history, and their mechanisms and recurrence times can be at least partly understood. There is no more valuable tool for reducing risk than correctly analyzing and understanding causes (Paté-Cornell, 1993; Cox, 2013). This requires identifying the factors affecting the performance of people and systems, and their technical characteristics, as well as the environment in which they operate. It is essential, in particular, to understand who can control what, and how incentive and information structures affect agents’ decisions, response times, and error rates (e.g., Murphy and Paté-Cornell 1996). In the oil industry, for instance, rewarding production alone will likely discourage interrupting operations when immediate maintenance is needed. A general nonchalant attitude towards safety can be corrected by training, changing incentives, and making top managers aware of the true costs of risk and of opportunities to manage them more effectively. Because risks are often invisible until they have struck, it is an easy and common practice to dismiss them or to leave them to actuaries and insurers to price and manage their financial side. The human costs, however, cannot be truly redressed after the fact.

Risk analysis can make the cumulative impacts of risks on a company and its employees, customers, and partners more vivid. It can quantify, with emphasis on uncertainties, avoidable potential losses as well as possible changes in insurance premiums and cost of capital, and can highlight cost-effective opportunities for enterprise risk management. Other risk factors are simply facts that can only be accounted for, not changed. The realities of shrinking global markets may not be alterable, but some level of diversification, innovation, and decoupling designed to protect a system from cascading failures may go a long way towards reducing risks.

More complex cases are those in which the risk is caused in large part by the activities or the threats of intelligent adversaries such as drug gangs or insurgents. The key to analyzing the risk that they present is to understand who they are, their intent, their capabilities and the types of response that one is willing to implement given the possibilities of deterrence but also escalating conflicts (Paté-Cornell, 2012). The issue here is thus to address not only the symptoms (e.g., immediate threats of attacks) but also the basic problems, although sometimes, as in saving a patient, treating the symptoms may have to come first9.
Characterize Risk Magnitudes and Uncertainties
Once a hazard – a possible source of risk – is identified, the next step is to figure out what one is dealing with and how large the risk might be (Kaplan and Garrick, 1981). Are the probabilities and magnitudes of potential losses large enough, compared to the costs of reducing them, to warrant urgent attention, or are they small enough that waiting to address them implies little possible loss? This is where science reporters often fail as risk communicators, publishing articles exclaiming that exposures have been “linked” to adverse effects without noting the absolute sizes of the risks involved or, often, without even checking whether the claimed “links” are causal (Gardner, 2009). If the risk is uncertain, can this uncertainty be clarified at a cost that is less than the value of the information obtained by additional investigation and the benefits of the improved decisions it would make possible10? If so, acting quickly out of concern about uncertain risks may be less prudent than first collecting better information.

This quantification of the risk and of the associated uncertainties may be a difficult task, depending on the nature and relevance of the available evidence. Quantitative risk assessment (QRA) or probabilistic risk analysis (PRA), developed originally for engineered systems, draws on all available information that can help answer practical risk management and uncertainty reduction questions. These methods are based on both systems analysis and probability, including essential functions, feedback loops, and dependencies caused by external events and common causes of failure (Paté-Cornell, 2009). For a structural system, these external events can be earthquakes or floods that affect several subsystems and components simultaneously. In such cases, the risk analysis rests on an assessment of the probability distributions of loads and capacities, and computation of the chances that the former exceed the latter. The results should include, and if needed display separately, the effects of aleatory as well as epistemic uncertainties11 to accurately characterize the limitations of the analytical results.
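The load-versus-capacity comparison can be sketched with a small Monte Carlo simulation. The distributions and parameters below are purely illustrative assumptions, not drawn from any real structure:

```python
import random

def failure_probability(load_sampler, capacity_sampler, n=200_000, seed=42):
    """Monte Carlo estimate of P(load > capacity) for independent
    load and capacity distributions (hypothetical illustration)."""
    rng = random.Random(seed)
    failures = sum(
        1 for _ in range(n) if load_sampler(rng) > capacity_sampler(rng)
    )
    return failures / n

# Illustrative lognormal load vs. normal capacity (made-up parameters):
load = lambda rng: rng.lognormvariate(2.0, 0.5)   # e.g., a flood or seismic load
capacity = lambda rng: rng.gauss(15.0, 2.0)       # structural capacity

p_fail = failure_probability(load, capacity)
print(f"Estimated failure probability: {p_fail:.4f}")
```

In a real PRA the samplers would come from fitted distributions, and dependencies among loads (common causes) would be modeled explicitly rather than assumed away as independence.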

Realistic PRAs, including those for failures of complex technological systems, must also include human and organizational factors. This analysis can be achieved by starting from a functional and probabilistic analysis of system failures, then considering the potential decisions and actions of the actors directly involved (errors as well as competent moves), and linking these to the environment created by the management (Paté-Cornell and Murphy, 1996). This requires examining in detail the procedures, structure, and culture of the organization, including the information passed along to operators, the resource constraints, and the incentive system.

These risk analyses, imperfect as they are, can be invaluable tools in identifying risks that were not considered or were underestimated before12, and in setting priorities among safety measures.

Identify Possible Risk Reduction Measures and Candidly Assess their Costs, Benefits, And Uncertainties
For risks that are worth acting on now, the next step is to identify the risk mitigation alternatives and the challenges in implementing them, and to assess how much difference they would make. This is an essential step in rational (“System 2”) thinking, which is often entirely missing from emotional and intuitive (e.g., outrage-driven or amygdala-mediated) “System 1” responses to risk (Sanfey and Chang, 2008). Our emotions, often based on recent, widely publicized events, may tell us that a situation is unacceptable and urge us to adopt a change to address the problem (“Ban that product!”). Indeed, the “precautionary principle” of implementing such bans systematically when uncertainties remain has been adopted by some governments (European Commission, 2000). Yet reasonable (if imperfect) calculations of how much difference alternative interventions would actually make are needed to guide risk management actions to achieve desired results.

The challenge, again, is to ensure that these assessments are as objective as humanly possible. Algorithmic techniques such as those in Chapter 2 may help. Separating facts and values may sometimes require that an analyst waste no time working for someone who will disregard fact-based results, or who insists on constraining or influencing them based on values and preconceptions, for instance by forcing some inputs. For example, if the U.S. EPA required its experts to express their uncertainty about “lives saved per microgram per cubic meter” of reduction in fine particulate matter by using Weibull distributions, which are constrained to show a 100% probability of positive life savings (no matter what the data say), then analysts might insist on the flexibility to use other distributions that could also assign positive probability to zero (or negative) values if that is what the data indicate (Cox, 2012). An illustration of the “risk of no risk analysis” is, again, the choice of a surprisingly low tsunami design criterion at the Fukushima Daiichi nuclear reactor, despite a recorded history of such events over more than a thousand years, as mentioned earlier. Insisting that risk management be guided by risk analyses is particularly critical for new nuclear reactors, whose design criteria must reflect the characteristics of each site and the local hazards of external loads, at a time when 68 new nuclear power plants are under construction across the world.
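The constraint described above can be made concrete: a two-parameter Weibull distribution assigns zero probability to non-positive values by construction, while a normal distribution need not. The parameters below are invented for illustration only and do not reflect the EPA's actual estimates:

```python
import math

def weibull_cdf(x, shape, scale):
    """CDF of a two-parameter Weibull; identically zero for x <= 0."""
    if x <= 0:
        return 0.0
    return 1.0 - math.exp(-((x / scale) ** shape))

def normal_cdf(x, mu, sigma):
    """CDF of a normal distribution, which has support on the whole real line."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

# P(effect <= 0): the Weibull model is forced to report 0, whatever the data say.
print(weibull_cdf(0.0, shape=1.5, scale=0.006))   # -> 0.0 by construction
print(normal_cdf(0.0, mu=0.006, sigma=0.004))     # small but positive
```

Choosing the normal (or any distribution with full support) lets the data, rather than the parametric family, decide how much probability to place on a zero or negative effect.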

Assess the Urgency of Immediate Action and the Value of Information
Is collecting (or waiting for) additional information before acting more costly than it is worth? The value of gathering new information depends on the possibility that it will permit better (e.g., higher expected-utility) decision making. It therefore depends on the uncertainties faced by the decision maker, as well as his or her risk attitude (Howard, 1966). When deciding the urgency of action and evaluating whether to wait for additional information, a risk manager should consider:

  1. Is the system stable? If not, how quickly is it deteriorating?

  2. Are the benefits of gathering or waiting for additional information, which might improve the decision, expected to outweigh the costs?

  3. What does one know (and can expect) of new technologies that may allow elimination of the risk altogether, for instance by replacing a hazardous technology at an acceptable cost?

An example of the first consideration – deterioration – is the speed at which one might expect the climate to deteriorate with and without proposed interventions, together with an assessment of the likely impacts (both beneficial and harmful) on human populations in different parts of the globe. Examples of the second consideration – value of information and optimal stopping – include the choice of whether to perform additional medical tests before an operation, whether to engage in more invasive medical tests on a routine basis, or whether to delay a repair in a car or a chemical factory to ensure that the potential risk reduction benefits justify the costs of an immediate fix. An example of the third type – risk reduction by the substitution of a new technology – might be the decision to live with the consequences of coal burning to generate electricity, after closing nuclear plants and before solar energy becomes truly economical, understanding the pace and the costs of such new development and the actual potential for future risk reduction.
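The value-of-information logic behind the second consideration can be sketched as an expected-value-of-perfect-information (EVPI) calculation. The two-action, two-state decision and its payoffs below are hypothetical numbers chosen only to illustrate the bookkeeping:

```python
# Prior probabilities of the uncertain state (illustrative):
states = {"risk_present": 0.3, "risk_absent": 0.7}

# Payoff (utility) of each action in each state (illustrative):
payoffs = {
    "mitigate":   {"risk_present": -10, "risk_absent": -10},  # fixed mitigation cost
    "do_nothing": {"risk_present": -100, "risk_absent": 0},   # loss only if risk strikes
}

# Best expected payoff acting now, without further information:
ev_no_info = max(
    sum(p * payoffs[a][s] for s, p in states.items()) for a in payoffs
)

# Expected payoff if a perfect test revealed the state before acting:
ev_perfect = sum(
    p * max(payoffs[a][s] for a in payoffs) for s, p in states.items()
)

evpi = ev_perfect - ev_no_info
print(ev_no_info, ev_perfect, evpi)
```

Here EVPI is the most one should pay for any study, test, or delay: if gathering the information costs more than this, acting on current knowledge is the better choice.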

Anticipate, Monitor, and Prepare for Rare and Not-So-Rare Events
Not-so-rare events can generally be analyzed relatively easily because there is a base of experience, either with the system itself (e.g., classic earth dams) or with its components and subsystems (e.g., an aircraft that has been in service for decades, so that there is substantial operating experience with its subsystems and components). Rare events that result from the conjunction of known components (“perfect storms”) with or without dependencies may be a bit more difficult to analyze if either the probabilities or the dependencies among them are difficult to establish.

Rare or unknown events for which there is little or no information as to whether or not they can actually occur are especially difficult to manage sensibly. Starting with the most difficult case, genuine “black swans” that one knows nothing about and cannot reasonably anticipate, the best strategy may be to monitor for signals of unusual occurrences (e.g., of new diseases) and to put in place a “resilient” structure of organizational connections, financial reserves and access to human intelligence and knowledge that allows for quick, creative local responses (Paté-Cornell, 2012; Cox, 2012b). For instance, new types of flu occur on average every two years. A system managed by the World Health Organization (WHO) permits monitoring and sharing of information across countries and identification of virus types. Although imperfect, that system allows relatively quick response to new strains of flu viruses such as H1N1. But the slow response to the spread of AIDS illustrates the difficulty of identifying and responding to a new type of pathogen13. Managing the more straightforward case of “perfect storms” is easier in that it involves “anticipating the unexpected” but imaginable (Augustine, 1996), and observing conjunctions of dangerous events such as the convergence of storms, loads on a system, or economic problems.

Deliberately Test and Learn
As detailed in Chapter 12, an avoidable pitfall in organizational risk management is to fail to deliberately acknowledge and test key assumptions, to learn from experience and to capture data and lessons for future reference as opportunities arise (Russo and Schoemaker, 1989). The world is full of “natural experiments” – unplanned but potentially highly informative shocks to systems or changes in conditions over time, as in Chapter 10 – which can be used to test and refine critical assumptions underlying risk assessment and risk management… if we remember to do so. For example, if air pollution in Beijing during a winter inversion soars to dozens of times higher concentrations than are permitted in the U.S., but mortality rates do not increase correspondingly, the common risk assessment assumption that life expectancies decrease in direct proportion to pollution concentrations (Pope et al., 2009; Correia et al., 2012) should be revisited in light of the new data. Nor, outside the domain of human health, is it always necessary to wait for natural experiments. Intelligence and security professionals know that deliberately testing their systems (e.g., by “red teaming,” which grew more popular after 9/11) and trying to bypass or disable safeguards is a key to active identification and elimination or mitigation of exploitable vulnerabilities.
Learn From Near-Misses and Identify Accident Precursors
Many accidents have been preceded by close calls, for instance when only one event in a known accident sequence failed to occur. That these near-misses did not turn into disasters has sometimes been viewed as evidence that the system needs no correction. Proactive risk management, of course, is the best way to avoid disasters. Yet industries and regulators seem to believe at times that they need not intervene because the system has worked and no disaster has occurred – even if only by chance. The experts who claimed after a small tremor that all was safe in L’Aquila (Italy), shortly before a large earthquake struck there, were relying on a recent false alert and failed to communicate to the public that small shocks can also be precursors of large ones. In the case of the 2010 accident at the Macondo well, the regulators as well as the three companies involved did not intervene when they knew that some worrisome near-misses had occurred (presuming that they were doing well enough), and decided to ignore precursors and test results (NAE, 2012), presumably for a variety of immediate benefits.
Establish and Maintain a Culture of Safety
It is possible to deliberately create and maintain a safety culture that reduces accident risks and losses. This requires going beyond the classic ritual statements of “safety first”. A safety culture starts at the head of an organization, with a true commitment to recognize and properly manage unavoidable tradeoffs, and with training those who are closest to operations to make appropriate decisions when needed. The deliberate design and development of highly reliable organizations (HROs) therefore typically emphasizes adopting a vigilant, risk-aware mindset and instilling the following five principles throughout the organization: preoccupation with failure at all levels and by all hands; reluctance to jump to conclusions or simplify interpretations of data and anomalies; sensitivity to operations at all levels; commitment to resilience; and deference to relevant expertise, rather than to authority (Weick and Sutcliffe, 2007).

A key part of a safety culture thus involves the incentives provided by the management. The structure and procedures of organizations such as an oil company reflect an attitude at the top of the corporation that permeates all levels of the organization. Concretely, the incentives, constraints, and directions explicitly communicated to employees shape their decisions, especially when they have little time to react or little information to evaluate their options. This was one of many problems at the Fukushima Daiichi nuclear power plant, where operators had to wait for hours before deciding on their own to flood a crippled reactor. It was also true on the Deepwater Horizon platform, where ignoring negative pressure test results contributed to the already high risks of an accident (NAE, 2012). Economic incentives that encourage motivated reasoning may thus distort risk-taking and risk-management decisions. As pointed out earlier, organizations that reward production levels exclusively, and de facto penalize those who slow down production, put at risk not only their employees but possibly also the general public, from both a safety and a financial point of view.

Put the Right People in the Right Place with the Right Knowledge, Incentives and Resources
Training and learning are two of the most important requirements for effective risk management. Risk analysis can clarify the effectiveness and performance of risk management decisions and their importance in affecting outcomes. The results, in turn, allow assessing where additional resources and training, as well as changes in incentives and responsibilities, are most likely to pay off in reduced risks and improved performance. Having examined what drives the operators of a complex system (e.g., the conductors of high-speed trains), one can also review management procedures, structure, and culture for fitness to meet the needs of both regular operations and responses to crisis. In normal operations, disciplined rule-following can protect us against the temptations, heuristics, and biases that undermine so much human decision-making. These range from succumbing to short-run impulses that we may come to regret, such as hyperbolic discounting (Chapter 12; Lehrer, 2009; Gardner, 2009), to letting fears, doubts, and desires control decisive actions that, upon reflection, no one favors. (Recall that hyperbolic discounting describes “present-biased” preferences in which the same delay in reward is valued at different rates in the present than in the future, e.g., if $10 now is preferred to $20 in one year, yet $20 in six years is preferred to $10 in five years.) On the other hand, when reality turns to crisis or emergency situations, narrow rule-following can lead to blinkered vision and to abdication of the responsibility, creativity, and active cooperation needed for adaptive responses to the unexpected (Harford, 2011). A key challenge in many organizations is to know when to shift from normal procedures to emergency response, which implies that crisis signals have been observed and transmitted in time for quick, effective response.
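The preference reversal in the parenthetical example can be reproduced with a simple hyperbolic discount function V = A / (1 + k·t). The discount parameter k below is an assumed value, chosen only so that the sketch exhibits the reversal described in the text:

```python
def hyperbolic_value(amount, delay_years, k=1.2):
    """Present value under hyperbolic discounting, V = A / (1 + k*t).
    k is a hypothetical discount parameter, not an empirical estimate."""
    return amount / (1.0 + k * delay_years)

now_10  = hyperbolic_value(10, 0)   # 10.0
year_20 = hyperbolic_value(20, 1)   # ~9.09
five_10 = hyperbolic_value(10, 5)   # ~1.43
six_20  = hyperbolic_value(20, 6)   # ~2.44

assert now_10 > year_20    # $10 now preferred to $20 in one year...
assert six_20 > five_10    # ...yet $20 in six years preferred to $10 in five
```

An exponential discounter, by contrast, would rank both pairs the same way, since shifting both rewards five years into the future multiplies both values by the same factor.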

In that context, where operators can face unexpected delays and problems, it is essential to provide people with reasonable resources and deadlines, and to be willing to make adjustments. Otherwise, agents might satisfy their managers by cutting corners in ways that the managers may not even imagine until and unless they see the consequent failures (Garber and Paté-Cornell, 2012). Therefore, when managers set these constraints, they have to ask themselves what the “shadow prices” of the constraints are, i.e., by how much the failure risk would be reduced if a constraint were relaxed by one unit (one more day?), or, on the contrary, whether the constraints can be tightened at a net benefit.
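The shadow-price question can be sketched numerically as a finite difference: how much does estimated failure risk fall if a deadline is relaxed by one day? The risk model below is a made-up decreasing function of schedule slack, used only to illustrate the calculation:

```python
def failure_risk(deadline_days):
    """Hypothetical model: failure risk falls as the schedule is relaxed."""
    return 0.5 / (1.0 + 0.1 * deadline_days)

def shadow_price(deadline_days, step=1.0):
    """Risk reduction obtained by relaxing the deadline by one unit (day)."""
    return failure_risk(deadline_days) - failure_risk(deadline_days + step)

# Marginal benefit of one extra day at a 30-day vs. a 60-day deadline:
print(shadow_price(30))
print(shadow_price(60))
```

In a real project, failure_risk would come from the PRA itself; the point of the sketch is that the shadow price typically shrinks as constraints are relaxed, which is what lets managers decide where one more day (or dollar) buys the most risk reduction.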

Clearly Define Leadership and Responsibilities
Key to the effectiveness of managers is their leadership in providing role models, and setting the tone for the organization’s performance. Leadership in a risk management context implies not only having (or deferring to) relevant knowledge and authority but also establishing clear lines of accountability and building trust from the people involved that their leaders can and will make proper and prudent decisions in difficult situations.

Who is responsible for avoiding accidents and mishaps? There are often several lines of responsibility and accountability, which should be properly defined and coordinated. The feeble defense of “responsible but not guilty” was used, for instance, by a high government official, head of a health ministry in Europe, in 1991, after contaminated blood infected a large number of people. The question, of course, is: what constitutes guilt on the part of a leader who fails to define proper procedures and ensure their application? Another failure of leadership can occur when a conflict of authority emerges from a two-headed structure. For instance, a surgeon and an anesthesiologist who disagree when neither of them has ultimate decision-making power can cause (and have caused) the death of a patient (Paté-Cornell et al., 1997). It may be possible to pinpoint precisely an error at the bottom of the organizational hierarchy that led to an accident sequence. (In the case of the Piper Alpha accident in 1988, a young worker left the repair of a pump unfinished at the end of the day and failed to tag the pump as still needing repair.) But, as in the case of rogue traders, the overall questions of supervision, incentives, and safety culture emanate directly from the leadership of the company and the regulators.

Leadership is thus a key ingredient of a solid system of risk management decision making, in which the decision maker hears the message on time, understands it (and any uncertainties involved), and is able and ready to act when needed. Decision makers must be willing to know the truth, to make difficult choices and trade-offs about what is tolerable and what is not, and to decide when it is time to shift from regular operations to crisis management, with the ability to make quick, well-informed decisions. When the risk is borne by a group of people, this requires a collective decision process able to balance the interests and the safety of different groups, and the overall costs and benefits.
Share Knowledge and Experience Across Organizations
Not all risk management responsibilities can or should be defined within a specific organization. Distributed control of risks, shared among multiple organizations or individuals, also creates a need for legal and institutional frameworks to clearly define and enforce rights and duties. Clarifying whose responsibility it is to avoid risks that arise from joint decisions14 can reduce the average costs of risk management. To provide a rational basis for coordinating liability and incentives to reduce the costs of risk externalities and jointly caused risks in a society of interdependent agents, one might adopt several possible principles. In the economic analysis of law (Posner, 1998), the Learned Hand formula, discussed further in the next chapter, states that parties should take additional care if and only if the expected marginal benefit of doing so exceeds the expected marginal cost (Feldman and Kim, 2005). Similarly, the “cheapest cost avoider” principle states that the party who can most cheaply avoid a jointly created risk should do so.
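The Learned Hand formula reduces to a one-line test: take the additional precaution if its burden B is less than the expected loss it prevents, P × L. The numbers below are illustrative, not drawn from any actual case:

```python
def should_take_precaution(burden, p_accident, loss, risk_reduction=1.0):
    """Learned Hand test: True if the marginal cost of care (B) is less
    than the expected marginal benefit (P * L * fraction of risk removed)."""
    expected_benefit = p_accident * loss * risk_reduction
    return burden < expected_benefit

# A $50k safeguard against a 1% chance of a $10M loss that it fully prevents:
print(should_take_precaution(50_000, 0.01, 10_000_000))    # True: 50k < 100k

# The same safeguard at a $200k burden fails the test:
print(should_take_precaution(200_000, 0.01, 10_000_000))   # False: 200k > 100k
```

The risk_reduction parameter generalizes the test to precautions that only partially eliminate the hazard, which is the marginal form stated in the text.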

Accidents sometimes reveal the existence of information in some parts of an industry that could have saved others. Some organizations successfully permit sharing of that critical information. The Institute of Nuclear Power Operations (INPO) provides a practical example of such an organization (Reilly, 2013). Created in the wake of the Three Mile Island accident, INPO provides a forum where industry managers can discuss existing problems behind closed doors with the support of the regulator (in the US, the NRC). It plays the role of an internal watchdog, regularly rating each power plant. These ratings, in turn, influence the insurance rates of the plants, thus creating strong incentives for excellence in safety. What makes such an organization successful is the combination of peer pressure, a forum for internal discussion of potential problems, blunt assessment of plant performance, and the “teeth” provided by financial incentives. What sometimes makes the model difficult to generalize is competition among the organizations involved and the global nature of some industries, such as the oil market.

Successful risk management is usually a cooperative enterprise. Successful cooperation, in turn, requires moving past blame-casting and excuse-giving to understand the causes and conditions that contribute to catastrophes and improve the system, or conversely, that promote safety in the face of unanticipated challenges. Prominent among the addressable drivers of safety are vigilance and readiness to perceive and respond to anomalies, determination and ability to learn from experience, eagerness to continually probe and update assumptions in light of new information, and capacity to adapt creatively and cooperatively when conditions change. Clear lines of duty and responsibility for risk avoidance, together with discipline and training in following well-conceived procedures and rules for routine safe operation of complex systems, are key contributors to safety cultures that work. At the same time, having the wisdom, incentives, know-how, and experience in team problem-solving required to step outside such boundaries and improvise when needed, is essential for successful risk management in the face of novel threats. These are generally teachable and learnable skills.

We propose that improved practices of risk analysis, quantification, and management should be built on technical and cultural foundations that encompass expertise both in reducing routine risks and in responding to novel ones. Such risk management practices should rely less on blame-casting and excuse-making than in the past. They will need to acknowledge that human error is not necessarily the main driver of failures in an increasingly complex and interconnected world, and that systems should be designed to withstand such errors. Unprecedented hazards, fat-tailed distributions, and risk contagion leading to cascading failures are increasingly recognized as drivers of some of the most conspicuous modern risks, from power outages to epidemics to financial failures. Improved risk management practices should thus increasingly rely on intelligent engagement with our uncertain and changing world. They should build on the key principles we have touched upon: leadership and accountability; robust design (decoupling subsystems whenever possible); vigilant and open-minded monitoring; continual active testing of assumptions and systems; deliberate learning; optimal trade-offs of the costs and benefits of gathering further information before acting; well-trained and disciplined habits of coordination; and the ability to cooperate quickly and effectively in response to new threats. These principles have been valuable foundations for effective risk management when applied in the past. They should become common practice in the future.


We thank Warner North for useful suggestions that led to a shorter, clearer exposition.

