Causal Analytics for Applied Risk Analysis Louis Anthony Cox, Jr


Causal Analytics for Applied Risk Analysis
Louis Anthony Cox, Jr.

Douglas A. Popken

Richard X. Sun

Christine and Emeline
Individual, group, organizational, and public policy decisions are often disconcertingly ineffective. They produce unintended and unwanted consequences, or fail to produce intended ones, even after large expenditures of hope, time, and resources. The difficulty of achieving unambiguous successes, in which costly actions or policies produce large, clear net benefits that even those who initially doubted find compelling after the fact, has been noted in areas as varied as personal financial decisions, corporate business decisions, engineering infrastructure decisions, non-profit initiatives for poverty disruption or delinquency prevention, public health efforts to curb the emergence of antibiotic-resistant super-bugs, and regulation of pollutants to improve public and occupational health.

This book is about how to make more effective decisions – that is, decisions that are more likely to cause preferred outcomes and to avoid undesirable ones – by understanding and fixing what so often goes wrong. We believe that the most common reason for disappointing results from well-intended policies and actions is inadequate understanding of the causal relationships between actions and probabilities of outcomes. Actions guided by traditional statistical analyses of association patterns in observational data, such as regression modeling or epidemiological estimates of relative risk ratios, usually cannot be relied on to achieve their objectives because these traditional methods of analysis are usually not adequate for determining how changing some variables will change others. But that is what decision-makers must know to make well-informed choices about what changes to implement. This book is therefore devoted to causal analytics methods that can provide answers to the crucial causal question of how changing decision variables – the things that a decision-maker or policy-maker can control or choose – changes probabilities of various outcomes. It presents and illustrates models, algorithms, principles, and software for deriving causal models from data and for using them to optimize decisions, evaluate effects of policies or interventions, make probabilistic predictions of the values of as-yet unobserved quantities from available data, and identify the most likely explanations for observed outcomes, including surprises and anomalies.

The first two chapters survey modern analytics methods, focusing mainly on techniques useful for decision, risk, and policy analysis. They emphasize how causal models are used throughout the rest of risk analytics in detecting and describing meaningful and useful patterns in data; predicting outcome probabilities if different courses of action are followed; identifying and prescribing a best course of action for making preferred outcomes more probable; evaluating the effects of current or past policies and interventions; and learning from experience, either individually or collaboratively, how to make choices that increase the probabilities of preferred outcomes. Chapter 2 also introduces the Causal Analytics Toolkit (CAT), a free in-browser set of analytics software tools that allows readers to perform the analyses described or to apply modern analytics methods to their own data sets. Chapters 3 through 11 illustrate the application of causal analytics and risk analytics to practical risk analysis challenges, mainly related to public and occupational health risks from pathogens in food or from pollutants in air. Chapters 12 through 15 turn to broader questions of how to improve risk management decision-making by individuals, groups, organizations, institutions, and multi-generation societies with different cultures and norms for cooperation. They examine organizational learning, social risk management, and intergenerational collaboration and justice in managing risks and hazards.

Throughout the book, our main focus is on introducing and illustrating practical methods of causal modeling and analytics that practitioners can apply to improve understanding of how choices affect probabilities of consequences and, based on this understanding, to recommend choices that are more likely to accomplish their intended objectives. We believe that the analytics and big data revolutions now underway will become much more valuable as methods and software for causal analytics become more widely used to better understand how actions and policies affect outcomes.

This book has grown out of efforts over the past decade to understand and explain how to use data and algorithms to determine as accurately, objectively and reproducibly as possible the effects caused by changes in decisions, actions, or policies. This quest has been inspired, encouraged, and supported by many people and organizations. It is a pleasure to thank them.

Chapters 1 and 2 are based largely on a pedagogical approach developed to quickly teach non-specialists about information-based causal analytics methods as part of professional development and academic graduate courses taught by Tony Cox in 2017. These included a course on Decision Analysis at the University of Colorado at Denver and professional courses at the annual meetings of the Society for Benefit Cost Analysis (SBCA), the Society for Epidemiologic Research (SER), and the American Industrial Hygiene Association (AIHce). Bruce Copley, Dennis Devlin, Dale Drysdale, Susan Dudley, Gary Kochenberger, and Deborah Kellog believed in and enthusiastically supported development of a teaching approach and course materials that sought to make key ideas and methods of modern causal analytics accessible to a wider audience. We thank them.

The approach taken in those courses and in Chapter 2 of this book emphasizes the concepts and principles behind current causal analytics algorithms using a minimum of specialist jargon and mathematical notation, and then makes algorithms themselves readily available through software that can be used without learning the underlying R or Python languages and packages. Course materials are available at these links:



The Causal Analytics Toolkit (CAT) software and an explanation of its goals are available at these links:



Initial funding for development of CAT was provided by The George Washington University Regulatory Studies Center. Subsequent development of its Predictive Analytics Toolkit (PAT) module, discussed in Chapter 2, and a port from an Excel add-in version to a cloud-based version, were supported in part by the American Chemistry Council. We thank Susan Dudley of the GWU Regulatory Studies Center and Rick Becker of the American Chemistry Council for their support and vision in making free, high-quality analytics software available to interested users via CAT.

The applications, ideas and principles in Chapters 3-15 are based mainly on recent journal articles. Material from the following articles has been used with the kind permission of Wiley-Blackwell, the publishers of Risk Analysis: An International Journal.

  • Cox LA Jr, Popken DA. Quantitative assessment of human MRSA risks from swine. Risk Analysis. 2014 Sep;34(9):1639-50 (Chapter 6)

  • Cox LA Jr. Overcoming learning-aversion in evaluating and managing uncertain risks. Risk Analysis. 2015 Oct; 35(10) (Chapter 12). (Thanks to Jim Hammitt and Lisa Robinson for a fascinating workshop at the Harvard Center for Risk Analysis that stimulated this work.)

  • Paté-Cornell E, Cox LA Jr. Improving risk management: from lame excuses to principled practice. Risk Analysis. 2014 Jul;34(7):1228-39. (Chapter 13)

Material from the following articles has been used with the kind permission of their publishers:

  • Cox LA, Popken DA, Kaplan AM, Plunkett LM, Becker RA. How well can in vitro data predict in vivo effects of chemicals? Rodent carcinogenicity as a case study. Regulatory Toxicology and Pharmacology. 2016 Jun; 77:54-64.

  • Cox LA Jr. Socioeconomic and air pollution correlates of adult asthma, heart attack, and stroke risks in the United States, 2010–2013. Environmental Research. 2017 May;155: 92-107. (Chapter 3)

  • Cox LA, Schnatter AR, Boogaard PJ, Banton M, Ketelslegers HB. Non-parametric estimation of low-concentration benzene metabolism. Chemo-Biological Interactions. Sep. 2017. (Chapter 4)

  • Cox LA Jr. 2015. Food microbial safety and animal antibiotics. Chapter 15 in Chen C-Y, Yan X, Jackson CR (Eds). Antimicrobial Resistance and Food Safety: Methods and Techniques. Elsevier, New York. (Chapter 5)

  • Popken DA, Cox LA Jr. Quantifying human health risks caused by Toxoplasmosis from open system production of swine. Human and Ecological Risk Assessment. 2015 Oct 3; 21(7): 1717-1735

  • Cox LA Jr, Popken DA. Has reducing PM2.5 and ozone caused reduced mortality rates in the United States? Annals of Epidemiology. 2015 Mar;25(3):162-73. (Chapter 10)

  • Cox LA Jr. How accurately and consistently do laboratories measure workplace concentrations of respirable crystalline silica? Regulatory Toxicology and Pharmacology. 2016 Nov;81:268-274. (Chapter 11)

  • Cox LA Jr, Cox ED. (2016) Intergenerational Justice in Protective and Resilience Investments with Uncertain Future Preferences and Resources. Chapter 12 in P. Gardoni, C. Murphy, and A. Rowell (Eds). Risk Analysis of Natural Hazards: Interdisciplinary Challenges and Integrated Solutions. Springer. New York, New York. (Chapter 15)

We thank the publishers and coauthors of these works.

Discussions with Ron Josephson of the United States Environmental Protection Agency (EPA) in the context of reviewing research proposals on health effects of air pollution helped to inspire the idea of applying causal analysis methods to determine value of information in causal networks (Chapter 2). We thank Dennis Devlin and Bruce Copley of Exxon-Mobil and Will Ollison of the American Petroleum Institute for stimulating conversations and comments and for their unswerving commitment to discovering objective scientific truth from data to inform more effective decisions and policies. As we have worked to develop and apply software to help automatically discover objective scientific truth about causality from data, we have found that this focus is not always popular and that industry-initiated thought leadership in pursuing more objective and reliable scientific inference is not always welcome. Advocates of expert judgment-based and modeling assumption-based approaches to causality have sometimes greeted with skepticism and even hostility the ideas that computer algorithms can now be far more accurate and objective than human experts in discovering true causal relations in data, and in identifying and rejecting false causal hypotheses; and that modeling judgments and expert interpretations of statistical patterns are neither necessary nor desirable for drawing valid causal inferences from data. We expect this analytics-centric perspective to continue to grow in popularity as causal discovery algorithms prove their value in a wide array of risk analysis applications. Meanwhile, we thank the visionaries who are pushing to make automated, objective, reproducible, algorithmic approaches to causal model discovery and validation a practical reality.

Finally, we thank Douglas Hubbard of Hubbard Decision Research for inviting lectures and discussions of the causal analytics framework in Chapter 2 at the American Statistical Association Symposium on Statistical Inference (Scientific Method for the 21st Century: A World Beyond p < 0.05 in October of 2017); and Seth Guikema of the University of Michigan for inviting the 2017 Wilbert Steffy Distinguished Lecture on Causal Analytics for Risk Management: Making Advanced Analytics More Useful at the University of Michigan Department of Industrial Engineering and Operations Research in November of 2017. The opportunity to prepare and present these lectures and to participate in the very stimulating discussions that followed contributed to the final exposition in Chapters 1 and 2.

Short Table of Contents

Part 1. Concepts and Methods of Causal Analytics

  1. Causal Analytics and Risk Analytics

  2. Causal Concepts, Principles, and Algorithms

Part 2. Descriptive Analytics in Public and Occupational Health

  3. Descriptive Analytics for Public Health: Socioeconomic and Air Pollution Correlates of Adult Asthma, Heart Attack, and Stroke Risks

  4. Descriptive Analytics for Occupational Health: Is Benzene Metabolism in Exposed Workers More Efficient at Very Low Concentrations?

  5. How Large are Human Health Risks Caused by Antibiotics Used in Food Animals?

  6. Quantitative Risk Assessment of Human Risks of Methicillin-Resistant Staphylococcus aureus (MRSA) Caused by Swine Operations

Part 3. Predictive and Causal Analytics

  7. Attributive Causal Modeling: Quantifying Human Health Risks Caused by Toxoplasmosis From Open System Production Of Swine

  8. How Well Can High-Throughput Screening Test Results Predict Whether Chemicals Cause Cancer in Mice and Rats?

  9. Mechanistic Causality: Biological Mechanisms of Dose-Response Thresholds for Inflammation-Mediated Diseases Caused by Asbestos Fibers and Mineral Particles

Part 4. Evaluation Analytics

  10. Evaluation Analytics for Public Health: Has Reducing Air Pollution Reduced Mortality in the United States?

  11. Evaluation Analytics for Occupational Health: How accurately and consistently do laboratories measure workplace concentrations of respirable crystalline silica?

Part 5. Risk Management: Insights from Prescriptive, Learning, and Collaborative Analytics

  12. Improving individual, group and organizational decisions: Overcoming learning-aversion in evaluating and managing uncertain risks

  13. Improving organizational risk management: From Lame Excuses to Principled Practice

  14. Improving institutions of risk management: Uncertain causality and judicial review of regulations

  15. Intergenerational justice in protective and resilience investments with uncertain future preferences and resources

Detailed Table of Contents

Part 1. Concepts and Methods of Causal Analytics

  1. Causal Analytics and Risk Analytics

    1. Why Bother? Benefits of Causal Analytics and Risk Analytics

    2. Who Should Read this Book? What Will You Learn? What is Required?

    3. What Topics Does this Book Cover?

    4. Causality in Descriptive Analytics

      1. Example: Did customer satisfaction improve?

      2. Example: Simpson’s Paradox

      3. Example: Visualizing air pollution-mortality associations in a California data set

      4. Example: What just happened? Deep learning and causal descriptions

    5. Causality in Predictive Analytics

      1. Example: Predictive vs. Causal Inference – Seeing vs. Doing

      2. Example: Non-identifiability in Predictive Analytics

      3. Example: Anomaly detection, predictive maintenance, and cause-specific failure probabilities

    6. Causal Models Used in Prescriptive Analytics

      1. Normal Form Decision Analysis

        1. Example: Identifying the Best Act in a Decision Table

        2. Example: Optimizing Research Intensity

        3. Example: Optimal Stopping in a Risky Production Process

        4. Example: Harvesting Timber

      2. Markov Decision Processes

      3. Improving MDPs: Semi-Markov Decision Processes and Discrete-Event Simulation (DES) Models

      4. Performance of Prescriptive Models

      5. Dynamic Optimization and Deterministic Optimal Control

        1. Example: Optimal Harvesting

      6. Stochastic Optimal Control, Hidden Markov Models, Partially Observable Markov Decision Models (POMDPs)

      7. Statistical Decision Theory

      8. Simulation-Optimization

        1. Example: An influence diagram (ID) DAG model for optimizing emissions reduction

        2. Example: Forecasting Policy Impacts – Invariance, Causal Laws, and the Lucas Critique in Macroeconomics

    7. Causal Study Design and Analysis in Evaluation Analytics

      1. Randomized Control Trials (RCTs)

        1. Example: Invariant CPTs, Generalization, and Transportability of Causal Laws

      2. Quasi-Experiments (QEs) and Intervention Time Series Analysis are Widely Used to Evaluate Impacts Caused by Interventions

        1. Example: Did Banning Coal Burning in Dublin Reduce Mortality Rates?

      3. Counterfactual and Potential Outcome Framework: Guessing What Might Have Been

        1. Example: What Were the Effects of a Public Smoking Ban Policy Intervention on Heart Attack Risks?

      4. Change Point Analysis (CPA) and Sequential Detection Algorithms

      5. A Causal Modeling Perspective on Evaluating Impacts of Interventions using CPTs

      6. Using Causal Models to Evaluate Total Effects vs. Direct Effects

      7. Using MDPs and DES Causal Models to Evaluate Policies and Interventions

    8. Causality in Learning Analytics

    9. Causality in Collaboration Analytics

      1. Causal Models in Game Theory: Normal and Extensive Forms, Characteristic Functions

      2. Causal Models for Multi-Agent Systems

    10. Conclusions

  2. Causal Concepts, Principles, and Algorithms

    1. Multiple meanings of “Cause”

    2. Probabilistic Causation and Bayesian Networks

      1. Technical Background: Probability Concepts, Notation and Bayes’ Rule

        1. Example: Joint, Marginal, and Conditional Probabilities for Answering Queries

        2. Example: Sampling-Based Estimation of Probabilities from a Database Using R

      2. Bayesian Network (BN) Formalism and Terminology

      3. Using BN Software for Probability Predictions and Inferences

        1. Example: A Two-Node BN for Disease Diagnosis using Netica

        2. Example: Bayesian inference in a small BN – The family out problem

      4. Practical Applications of Bayesian Networks

      5. Non-Causal Probabilities: Confounding and Selection Biases

        1. Example: Confounding – Effects of Common Causes

        2. Example: Collider Stratification and Selection Bias – Conditioning on a Common Effect

      6. Causal Probabilities and Causal Bayesian Networks

        1. Example: Modeling Discrimination in College Admissions – Direct, Indirect, and Total Effects of Gender on Admissions Probabilities

        2. Example: Calculating Effects of Interventions via the Ideal Gas Law

      7. Dynamic Bayesian Networks

    3. Other Causal Models Equivalent to BNs

      1. Fault Tree Analysis

        1. Example: A Dynamic Fault Tree Calculator

      2. Event Tree Analysis

      3. Bow Tie Diagrams for Risk Management of Complex Systems

      4. Markov Chains and Hidden Markov Models

      5. Probabilistic Boolean Networks

      6. Time Series Forecasting and Predictive Causation

      7. Structural Equation Models (SEMs), Structural Causation, and Path Analysis Models

      8. Influence Diagrams

        1. Example: Decision-Making with an Influence Diagram – Taking an Umbrella

      9. Decision Trees

      10. Markov Decision Processes (MDPs) and Partially Observable MDPs (POMDPs)

    4. Predictive Causality and Predictive Analytics

      1. Classification and Regression Tree Models

        1. Example: Seeking Possible Direct Causes Using Classification and Regression Tree (CART) Algorithms

        2. Example: Conditional Independence Tests Constrain DAG Structures

      2. The Random Forest Algorithm: Importance Plots and Partial Dependence Plots

      3. Causal Concentration-Response Curves, Adjustment Sets, and Partial Dependence Plots for Total and Direct Effects in Causal Graphs

        1. Including Knowledge-Based Constraints and Multiple Adjustment Sets

        2. Power Calculations for Causal Graphs

      4. Predictive Analytics for Binary Outcomes: Classification and Pattern Recognition

    5. Causal Discovery: Learning Causal BNs from Data

    6. Comparison of Causal Discovery to Associational Causal Concept: Updating the Bradford Hill Considerations

      1. Strength of Association

        1. Example: Confirmation Bias and the Wason Selection Task

      2. Consistency of Association

      3. Plausibility, Coherence, and Analogy of Association

      4. Specificity, Temporality, and Biological Gradient

      5. Methods and Examples of Associational Causation: Regression and Relative Risks

        1. Example of Associative vs. Manipulative Causation in Practice: The CARET Trial

        2. Example of Association Created by Regression Model Specification Error

      6. Relative Risk and Probability of Causation in the Competing Risks Framework

      7. Conclusions on Associational Causation

    7. Comparison of Causal Discovery to Attributive Causal Methods

      1. Example: Attributive Causation is Not Manipulative Causation

      2. Example: 9 Million Deaths per Year Worldwide Attributed to Pollution

    8. Comparison of Causal Discovery to Counterfactual Causal Methods

      1. Example: Attribution of Rainfall to Climate Change, and the Indeterminacy of Counterfactuals

    9. Comparison of Causal Discovery to Structural and Mechanistic Causal Modeling

      1. Example: Dynamic Causal Analysis of the Level of a Single Variable in a Compartment

      2. Example: A CPT for a One-Compartment Model with Uncertain Inputs

      3. Example: Causal Reasoning about Equilibria and Comparative Statics

    10. Historical Milestones in Development of Computationally Useful Causal Concepts

    11. Conclusions
