Comparison of Causal Discovery to Attributive Causal Methods
In addition to associational causation, two other causal concepts with a major impact on modern epidemiology, social statistics, and policy evaluation studies are attributive causation and counterfactual (or potential outcomes) causation. Insofar as these share with associational causation the limitation that they do not provide valid information about how outcome probabilities would change if different actions or interventions were taken, they are not suitable for informing policy decisions about which actions to take. Like associational analyses, they are nonetheless widely used for this purpose. This section extends points already discussed for associational causation to these other two concepts.
Attributive causation is based on the key idea of attributing observed differences in response rates between exposed and unexposed groups, or between more-exposed and less-exposed groups, to the differences in their exposures. Standard measures used in epidemiology that are based on attributing differences in responses to differences in exposures include the following (www.med.uottawa.ca/sim/data/PAR_e.htm):
The attributable risk (AR) of a disease or other adverse outcome among exposed individuals is the difference in incidence rates between exposed and unexposed individuals, measured in units such as cases per person-year, or cases per 1,000, per 10,000 or per 100,000 person-years, in the population. AR is commonly interpreted as the excess incidence rate of disease caused by exposure among exposed individuals. This causal interpretation has essentially the status of a definition in many epidemiology textbooks. Likewise, the attributable number (AN) of cases per year in a population is the attributable risk multiplied by the number of people exposed. It is commonly interpreted as the number of extra cases per year caused by exposure, again without further inquiry into whether exposure actually does cause the cases.
The population attributable risk (PAR) or population attributable fraction (PAF) is derived from the relative risk (RR), i.e., the ratio of disease rates in the exposed and unexposed populations, together with the prevalence of exposure in the population, P, i.e., the fraction of the population that is exposed. It can be expressed via the formula
PAR = P(RR - 1)/[1 + P(RR - 1)].
It is commonly (mis)interpreted in terms of manipulative causation, as the fraction of cases that would be prevented if exposure were removed.
Burden of disease (BoD) calculations extend the PAR formula to allow for multiple possible levels of exposure, each with its own relative risk ratio and prevalence in the population. These methods for BoD calculations have been published and applied by the World Health Organization (WHO), which interprets them as if they indicated manipulative causation, providing illustrative calculations of how “if the risk factor were to be completely removed… the BoD reduction can be calculated from a simplified form of the above formula” involving prevalence rates and relative risks (Prüss-Üstün et al., 2003).
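The attributive measures listed above are simple arithmetic on incidence rates, relative risks, and prevalences, as the following minimal Python sketch shows. All numerical inputs are hypothetical and chosen only for illustration. The function names are not from any library, and the multi-level formula is one common way of generalizing the PAR formula to several exposure levels rather than a transcription of the WHO guidance. Nothing in the calculation tests whether exposure causes the outcome.

```python
def attributable_risk(rate_exposed, rate_unexposed):
    """AR: difference in incidence rates (same units as the inputs)."""
    return rate_exposed - rate_unexposed


def attributable_number(rate_exposed, rate_unexposed, n_exposed):
    """AN: attributable risk multiplied by the number of people exposed."""
    return attributable_risk(rate_exposed, rate_unexposed) * n_exposed


def population_attributable_risk(prevalence, relative_risk):
    """PAR = P(RR - 1) / [1 + P(RR - 1)]."""
    excess = prevalence * (relative_risk - 1.0)
    return excess / (1.0 + excess)


def multilevel_attributable_fraction(levels):
    """Multi-level extension: levels is a list of (prevalence, RR) pairs,
    one pair per exposed level; the unexposed level has RR = 1."""
    excess = sum(p * (rr - 1.0) for p, rr in levels)
    return excess / (1.0 + excess)


# Hypothetical inputs (assumptions for illustration only).
rate_exposed = 0.003    # cases per person-year among the exposed
rate_unexposed = 0.001  # cases per person-year among the unexposed
n_exposed = 50_000      # number of exposed people
prevalence = 0.2        # fraction of the population exposed

rr = rate_exposed / rate_unexposed
ar = attributable_risk(rate_exposed, rate_unexposed)
an = attributable_number(rate_exposed, rate_unexposed, n_exposed)
par = population_attributable_risk(prevalence, rr)
bod = multilevel_attributable_fraction([(0.15, 2.0), (0.05, 4.0)])

print(f"RR = {rr:.2f}, AR = {ar:.4f}, AN = {an:.0f}")
print(f"PAR = {par:.3f}, multi-level fraction = {bod:.3f}")
```

With a single exposed level, the multi-level function reduces to the PAR formula.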
An intuitive motivation for these attributive causal concepts and formulas (and closely related or synonymous ones, e.g., etiologic fractions, population attributable fractions, probability of causation) is an assumption that observed differences in effects (responses) are explained by observed differences in causes (exposures). We will call this the attribution assumption. But it is not necessarily true. Differences in effects between exposed and unexposed individuals might instead have other explanations, such as some or all of those in Table 2.6. By contrast, the main intuitive motivation for the information-based methods of causal discovery emphasized in this chapter (right side of Table 2.4) is the information principle stating that values of causes (exposures) help to predict values of their direct effects (responses) in a DAG model; and that in time series data, changes in the values of causes help to predict changes in the values of their effects. In many settings, the information principle is a much better proxy than the attribution assumption for manipulative causation, the principle that changes in causes change the probability distributions of their direct effects.
Example: Attributive Causation is Not Manipulative Causation
Suppose that teen pregnancy rates are found to be higher among young women in a certain population who are enrolled in a school lunch program than among young women who are not. Solely for purposes of a simple calculation (i.e., the example is intended to be simple rather than realistic), suppose the following characteristics always occur together (i.e., a young woman who has one has all); and that all teen pregnancies in this population occur among young women with these characteristics and not among young women who do not. The characteristics are: (a) Enrolled in school lunch program; (b) Belong to school chess club; (c) Enrolled in a Latin honors class; (d) Smokes cigarettes; (e) Drinks Coke instead of Pepsi. Then the traditional textbook attributable risk and BoD formulas just described would attribute 100% of teen pregnancies in this population to the school lunch program. They would also attribute 100% of the teen pregnancies in this population to each of the other characteristics that cluster with enrollment in the school lunch program, i.e., to membership in the chess club, the Latin honors class, cigarette smoking, and drinking Coke instead of Pepsi. Each of these factors would have a probability of causation of 100% for causing teen pregnancy. This illustrates the distinction between the meaning of causation as defined by the World Health Organization and other authorities based on relative risk ratios and attribution formulas, and the usual meaning of causation. Following current practice, these attributable risk calculations could be used to inform policy makers that requiring young women to drink Pepsi instead of Coke would prevent teen pregnancies in this population; likewise, cancelling the school lunch program or the chess club or the Latin honors class or ending cigarette smoking would each be predicted to prevent 100% of teen pregnancies in this population.
This illustrates the difference between policy recommendations based on attributive measures of causation and policy recommendations based on manipulative or mechanistic causation, which would recognize that the attributable risk calculations have no implications for how or whether any of these possible interventions would change teen pregnancy rates in the population. Attributable risk calculations based on relative risks correspond to using only the first of the Hill considerations, strength of association, without considering others such as biological plausibility or coherence with knowledge of how outcomes are caused.
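The arithmetic of the teen pregnancy example can be checked with a short Python sketch. The population size and counts are hypothetical. Because the incidence rate among the unexposed is zero, the relative risk is infinite, so the sketch uses the algebraically equivalent form of the PAR formula, (overall rate - unexposed rate) / overall rate, which stays well defined.

```python
# Five characteristics that, by assumption, always occur together.
FACTORS = ["school_lunch", "chess_club", "latin_honors", "smokes", "drinks_coke"]


def make_population(n_total=1000, n_cluster=100, n_pregnant=20):
    """Hypothetical population: the first n_cluster women have all five
    characteristics, and all pregnancies occur within that cluster."""
    population = []
    for i in range(n_total):
        in_cluster = i < n_cluster
        person = {factor: in_cluster for factor in FACTORS}
        person["pregnant"] = i < n_pregnant
        population.append(person)
    return population


def attributable_fraction(population, factor):
    """(overall rate - unexposed rate) / overall rate, equivalent to
    P(RR - 1) / [1 + P(RR - 1)] when RR is finite."""
    overall_rate = sum(p["pregnant"] for p in population) / len(population)
    unexposed = [p for p in population if not p[factor]]
    unexposed_rate = sum(p["pregnant"] for p in unexposed) / len(unexposed)
    return (overall_rate - unexposed_rate) / overall_rate


population = make_population()
fractions = {f: attributable_fraction(population, f) for f in FACTORS}
for factor, fraction in fractions.items():
    print(f"{factor}: {fraction:.0%} of pregnancies attributed")
```

Every factor receives the same attribution because the formula sees only the exposure and outcome columns, and the five exposure columns are identical.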
Example: 9 Million Deaths Per Year Worldwide Attributed to Air Pollution
A more realistic example of the same methodological points arises from epidemiology. In October of 2017, international headlines announced that “Pollution kills 9 million people each year.” Accompanying articles warned that it “costs trillions of dollars every year” and “threatens the continuing survival of human societies.” Underlying these sensational announcements was a Burden-of-Disease (BoD) study published in the Lancet and communicated to the press via carefully designed infographics, jointly published with the advocacy network Global Alliance on Health and Pollution (GAHP), urging more funding for pollution research, advocating clean energy, and linking air pollution to climate change (Lancet, 2017). The technical basis for the study was the uncontroversial observation that mortality rates are higher in some countries and regions than in others. Higher mortality rates tend to cluster with conditions such as lower GDP per capita, higher poverty, poorer nutrition, lower education, higher illness rates, higher pollution levels (e.g., from use of indoor dung-burning stoves), lower savings rates, and poorer healthcare. The BoD study attributed the higher mortalities observed in such circumstances to higher pollution; it could equally well have attributed them to any other factor in the cluster, much as in the teen pregnancy example. Such attributive causation does not imply that reducing pollution without addressing the other factors that cluster with it would have any effect on deaths per year.
Indeed, since the number of deaths per year is the number of live births per year one lifetime ago, exposure to pollution cannot permanently increase average deaths per year by 9 million (or any other amount) unless it increases previous birth rates correspondingly – an effect not usually ascribed to pollution. What exposure might do is to change life lengths. But if life length is likened to a pipeline, with the number of people exiting each year (dying) equal to the number entering it (being born) one life length ago, then it is clear that changing the length of the pipeline does not permanently change the number of people exiting from it per year. (If pollution were to cause everyone to die a year earlier than they otherwise would have, for example, then the excess deaths that occur this year instead of next year would be offset by the fewer deaths that would have occurred this year but that occurred last year instead.) Headlines reflecting attributive causation, such as “Pollution kills 9 million people each year,” should not be misinterpreted as implying that there would be any fewer total deaths per year if pollution were eliminated. Nor should claims that reducing pollution has generated trillions of dollars per year in public health benefits be misunderstood as implying that those same benefits would not have occurred without the reduction in pollution. Attribution-based causal claims are akin to accounting decisions – to which factor or factors do we choose to assign responsibility for outcomes? – rather than facts about the world.
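The pipeline argument can be illustrated with a small cohort simulation. This is a deliberately simplified sketch: births per year are constant, everyone has the same life length, and the one-year shortening of life applies from a given change year onward. The parameter values are hypothetical.

```python
from collections import Counter


def deaths_per_year(births_per_year, life_length, change_year, years):
    """Deaths in each of years 0..years-1 when, from change_year onward,
    everyone dies one year earlier than they otherwise would have."""
    deaths = Counter()
    for birth_year in range(-life_length, years):
        baseline_death = birth_year + life_length
        if baseline_death < change_year:
            death_year = baseline_death  # died before the change took effect
        else:
            # Dies one year early, but not before the change begins.
            death_year = max(change_year, baseline_death - 1)
        deaths[death_year] += births_per_year
    return [deaths[year] for year in range(years)]


# Hypothetical inputs: 1000 births per year, 80-year lives, change in year 10.
series = deaths_per_year(births_per_year=1000, life_length=80,
                         change_year=10, years=20)
print(series)
```

Under these assumptions, deaths double in the single year when the change takes effect and then return to the birth rate: shortening the pipeline produces a transient, not a permanent increase in deaths per year.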
As illustrated in these examples, attributive causation and BoD calculations attribute differences in observed response rates between exposed and unexposed individuals to the observed differences in their exposures (possibly after matching on covariates such as demographic characteristics), whether or not changing exposures would change response rates. The attribution assumption can be justified in cases where the causal agent that produces responses is known, unique, and can be measured, as for many foodborne diseases: risk of salmonellosis depends on how much Salmonella is ingested and can be attributed to the source(s) of that exposure. Chapters 5-7 examine this approach in more detail. But in general, the attribution assumption may fail when unique causal agents do not exist, or are unknown or unmeasured.
Comparison of Causal Discovery to Counterfactual Causal Methods
Enough dissatisfaction has accumulated with the attributive approach to inspire development of an alternative based on counterfactual comparisons of observed to unobserved outcomes. The intuitive motivation is the following counterfactual causation principle: Differences between observed response rates under real conditions and model-predicted response rates under hypothetical (counterfactual) conditions are attributed to the differences between real and modeled conditions. Thus, differences in causes (exposures) are assumed to create differences in their effects (responses) compared to what they otherwise would have been. The most common application of this principle is as follows:
Use a regression model to predict what the response rate among exposed people would have been had conditions been different, e.g., had they not been exposed, or had they been exposed to lower levels, of a hazard;
Compare this counterfactual response rate to the observed response rate under actual exposure conditions; and
Attribute the difference to the difference between modeled exposure conditions (e.g., no exposure) and actual exposure conditions.
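The three steps above can be sketched in Python on synthetic data. The data-generating process is a hypothetical assumption made for illustration: an unmeasured confounder drives both exposure and response, and exposure has no effect at all on response. The regression model omits the confounder, so the counterfactual prediction it produces is wrong, and the procedure attributes a sizable effect to exposure.

```python
import random
from statistics import fmean


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    mean_x, mean_y = fmean(x), fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


random.seed(1)
n = 5000
# Hypothetical data: the confounder raises both exposure and response.
confounder = [random.gauss(0, 1) for _ in range(n)]
exposure = [5.0 + z + random.gauss(0, 0.5) for z in confounder]
response = [2.0 + 1.5 * z + random.gauss(0, 1) for z in confounder]
true_effect = 0.0  # exposure does not appear in the response equation

# Step 1: use a regression model to predict responses under lower exposure.
slope, intercept = fit_line(exposure, response)
counterfactual_exposure = [x - 1.0 for x in exposure]
predicted_counterfactual = fmean(intercept + slope * x
                                 for x in counterfactual_exposure)

# Step 2: compare with the observed response under actual exposure.
observed = fmean(response)

# Step 3: attribute the difference to the difference in exposure.
attributed = observed - predicted_counterfactual
print(f"attributed effect = {attributed:.2f}, true effect = {true_effect:.2f}")
```

The attributed effect equals the fitted slope, which reflects the confounder rather than any causal influence of exposure.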
Such counterfactual modeling can also be used to address what-if questions that manipulative causation cannot answer, such as “How would my salary today be different if my race, age, or sex were different?” Assuming that the regression model used describes what would have happened under other conditions allows modelers to answer questions about the effects of counterfactual conditions that cannot be achieved (and that may have no clear meaning) in the real world.
A well-known fundamental challenge for such counterfactual or potential outcomes modeling is that what would have been under counterfactual conditions is never observed. Thus, to apply this approach, what would have happened – e.g., the mortality rate among exposed people under different exposure conditions – must be guessed at or assumed. Without a well-validated causal model, guessing with high accuracy and confidence what would have happened under different conditions may be difficult or impossible. Thus, the counterfactual modeling approach requires having a well-validated causal model before it can give trustworthy estimates of the counterfactual responses on which its estimates of causal impacts of exposure depend. In essence, it simply begs the question of how to obtain such a model.
A recent technical advance, doubly robust estimation of average causal effects in populations, combines two potential outcomes methods, outcome regression modeling and propensity score modeling (Funk et al., 2011). It requires only one of the two models to be correctly specified in order to yield unbiased effects estimates. However, the problem of correctly specifying even one model remains. Even state-of-the-art potential outcomes algorithms such as targeted maximum likelihood estimation (TMLE), which have the double robustness property, still have substantial error rates and problematic performance in practice in many real data sets (Pang et al., 2017).
The example of the Dublin coal-burning ban discussed earlier illustrates some of the pitfalls of the counterfactual approach. The original study (Clancy et al., 2002) compared observed mortality rates following the ban to mortality rates in the same population before the ban. It attributed the quite significant difference between them to the effects of the ban. The authors concluded that the ban had caused a sizable reduction in mortality rates, based on the modeling assumption that, in the absence of the ban, mortality rates would have remained at their original levels. By contrast, the follow-up investigation a decade later (Dockery et al., 2013) compared observed changes in mortality rates from before to after the ban in the affected population to changes in the mortality rates over the same period among people living outside the affected area. This time, the authors concluded that the ban had caused no detectable effect in all-cause mortality, based on the alternative modeling assumption that, in the absence of the ban, mortality rates would have fallen by as much in the population inside the area affected by the ban as they fell in the population outside it. Changing counterfactual assumptions about what would have happened to mortality rates in the absence of a ban completely changed the conclusions about the effects caused by the ban.
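The dependence of the conclusion on the counterfactual assumption can be made concrete with a few lines of Python. The mortality rates below are hypothetical numbers chosen for illustration; they are not the values reported by Clancy et al. or Dockery et al. The first function assumes that, without the ban, mortality would have stayed at its pre-ban level; the second assumes it would have followed the trend observed outside the affected area.

```python
def before_after_effect(inside_before, inside_after):
    """Counterfactual assumption: rates would have stayed at the pre-ban level."""
    return inside_after - inside_before


def difference_in_differences(inside_before, inside_after,
                              outside_before, outside_after):
    """Counterfactual assumption: rates inside would have changed by as much
    as rates outside the affected area changed."""
    return (inside_after - inside_before) - (outside_after - outside_before)


# Hypothetical mortality rates (deaths per 1,000 people per year).
inside_before, inside_after = 10.0, 9.0
outside_before, outside_after = 10.5, 9.5

effect_a = before_after_effect(inside_before, inside_after)
effect_b = difference_in_differences(inside_before, inside_after,
                                     outside_before, outside_after)
print(f"before-after estimate: {effect_a:+.1f} per 1,000 per year")
print(f"difference-in-differences estimate: {effect_b:+.1f} per 1,000 per year")
```

The same observed data yield a reduction of one death per 1,000 per year under the first assumption and no effect under the second.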
Many technical developments over the past four decades have been introduced to try to make the counterfactual approach logically sound and useful to practitioners, as discussed further in Chapter 14. Innovations include variations on the theme of stratifying or matching individuals in the exposed and unexposed groups on observed covariate values in hopes of making it more likely that differences in responses between the matched groups are caused by differences in exposure; and various types of conditioning on observed covariate values using regression models to play a role similar to matching or stratification in “adjusting” or “controlling” for non-exposure differences between the groups. However, none of these efforts overcomes the basic problem that counterfactual responses are speculative. Unless a valid causal model is obtained by other means, or randomized experiments are carried out, as in the case of the CARET trial, what the correct counterfactual values for responses are remains unknown.
Unfortunately, counterfactual modeling efforts have introduced many confusions and mistakes in attempted causal analyses. For example, it has gradually been understood that attempting to “control” for the effects of covariates by matching or stratifying on their values or conditioning on them in a regression model may create an artificial exposure-response association due to collider bias. Model specification errors in potential outcome regression models are almost certain to lead traditional potential outcomes methods to false-positive results, mistakenly rejecting the null hypothesis of no effect even when it is true, if sample sizes are large enough (the “g-null” paradox). Even if these problems could be resolved, the question that counterfactual causal analysis methods usually attempt to answer based on data and assumptions is how outcomes would have been different if exposures had been different (or if intervention or other conditions had been different) in specified ways. But this is not the same as the question that decision makers and policy analysts need answered, which is how future outcome probabilities will be different if different interventions are undertaken now. What responses would have been if different interventions had been made in the past is in general not the same as what they will be if those interventions are undertaken now, unless the answers are calculated using invariant and stationary causal laws (e.g., causal CPTs). But this again requires a valid causal model.
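Collider bias is easy to reproduce in simulation. In the following Python sketch, which uses hypothetical synthetic data, exposure and response are generated independently, and a third variable (the collider) is caused by both. Restricting attention to one stratum of the collider, as matching or stratifying on it would do, induces a clear negative association between exposure and response even though neither affects the other.

```python
import random
from statistics import fmean


def correlation(x, y):
    """Pearson correlation coefficient."""
    mean_x, mean_y = fmean(x), fmean(y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / (sxx * syy) ** 0.5


random.seed(2)
n = 20000
# Exposure and response are independent by construction.
exposure = [random.gauss(0, 1) for _ in range(n)]
response = [random.gauss(0, 1) for _ in range(n)]
# The collider is a common effect of both.
collider = [x + y + random.gauss(0, 0.5) for x, y in zip(exposure, response)]

# "Control" for the collider by looking only at the stratum collider > 1.
stratum = [(x, y) for x, y, c in zip(exposure, response, collider) if c > 1.0]

corr_all = correlation(exposure, response)
corr_stratum = correlation([x for x, _ in stratum], [y for _, y in stratum])
print(f"correlation in full data: {corr_all:+.2f}")
print(f"correlation within collider stratum: {corr_stratum:+.2f}")
```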
Counterfactual and potential outcome methods do not solve the challenge of developing valid causal models from data, but they can use valid causal models, if they are available, to estimate causal impacts of interventions and to answer what-if questions about how average responses would have differed if exposures had been different. Unfortunately, misspecified models can also be used to answer the same questions, but the answers may be wrong, and the errors they contain may be unknown because they are based in part on conjectures about what would have been instead of being based entirely on observed data. For practical purposes, therefore, we recommend using methods and principles of causal discovery such as those on the right side of Table 2.4 to learn valid causal models from data. Counterfactual analyses and interpretations can then be undertaken with these models if desired to address retrospective evaluation questions, such as how interventions (e.g., a coal burning ban) probably changed health outcomes, and to address prospective questions such as how a ban now would affect probabilities of future health outcomes.
Example: Attribution of Rainfall to Climate Change, and the Indeterminacy of Counterfactuals
In August of 2017, Hurricane Harvey dumped more than 40 inches of rain on parts of eastern Texas in 4 days, leading to over $200 billion of flooding damage in 2017 U.S. dollars. Speculation soon began about how much less rain might have fallen in the absence of man-made climate change. Simulation models were used to suggest numerical answers attributing a fraction of the unusual rainfall, such as 15%, to warming over the previous century, meaning that 15% less rain would have been expected had the warming not occurred. However, the correct answer depends on why, in the counterfactual scenario, warming would not have occurred. Positing the onset of a civilization-destroying ice age by 1990 that prevented further warming might have different implications for what would have happened to the rainfall in Texas in August of 2017 “had the warming not occurred” than if the explanation were instead earlier and more widespread adoption of nuclear power or other energy sources that reduced greenhouse gas emissions. This illustrates the indeterminacy of counterfactuals: assuming that some aspect of the world (temperature rise) were different does not by itself explain why it would have been different, and yet this may be crucial for determining what else would have happened (rainfall over Texas in August of 2017), or the probabilities of what else might have happened (probability distribution for rainfalls of different sizes).
Even if well-validated causal models are available to provide credible answers to what would have happened – or, more accurately, what the probability distribution of outcomes would have been – for different input assumptions or scenarios, which specific counterfactual input assumptions or scenarios should be used is typically under-determined by conditions that stakeholders ask about. Asking how much less rain would probably have fallen during Harvey had the warming in the previous century not occurred does not specify a causally coherent scenario about how Harvey would have occurred, and how it would have differed (e.g., producing more or less rainfall than that observed) without that warming. Indeed, it is highly nontrivial, and well beyond current weather and climate simulation capabilities (and perhaps mathematically impossible, if the assumed equations and conditions are not consistent), to identify initial conditions a century ago and plausible changes in decisions since then that would have produced both no warming in the previous century and also a Harvey-like storm over Texas in August, but with significantly less rainfall. Applying simulation models to hypothetical what-if input assumptions to produce corresponding probability distributions for outputs may yield conclusions that represent no more than user-selected inputs and modeling assumptions presented in the guise of discoveries.
To obtain valid counterfactual findings about how the real world would have been different if past inputs had been different, it is important to use validated causal models and causally coherent input assumptions – that is, initial conditions and input scenarios that explain how hypothesized conditions (such as an August 2017 storm over Texas) are created by the application of causal laws or validated models to the stated initial conditions and input scenarios. If a suitable Bayesian network model were available, for example, one could enter assumed findings, such as “Warming over past century = 0” and “Storm occurs over eastern Texas in late August of 2017 = TRUE” and then let the model compute the conditional probability distribution for “Amount of rainfall from storm” (or report that the assumed findings are mutually inconsistent, if that is the case). To avoid inconsistencies and ambiguities, hypothetical counterfactual conditions or constraints such as “A storm occurs in the same time and location, but without the previous century’s warming” should not be assumed, but should be derived (if possible) from causal models of the (random) consequences of explicitly stated initial conditions and input scenarios. Such specificity about explanations for assumed counterfactual conditions is not provided in statistical counterfactual models. It is seldom provided in simulation modeling. Without such specificity, attributions of excess rainfall or other quantities to partially specified causes, such as presence or absence of prior warming (without further explanation and detailed modeling), may have no clear meaning.
Comparison of Causal Discovery to Structural and Mechanistic Causal Modeling
Long before the advent of causal Bayesian networks and other causal graph models, scientists and engineers successfully described causality in a variety of dynamic physical, socioeconomic, and biological systems using mathematical models, especially systems of ordinary differential equations (ODEs) or partial differential equations (PDEs) with algebraic constraints, to describe how the rates of change of some variables depended on the values of other variables. The equations represented invariant causal laws, such as that the rate of flow of a material across a boundary was proportional to the difference in its concentrations on the two sides of the boundary, or that the rate of change of material in a compartment was the difference between its rate of flow into the compartment (influx) and its rate of flow out of the compartment (efflux). The basic insight into causality was that initial conditions, subsequent exogenous inputs, and invariant causal laws – that is, structural equations that can be applied to the initial conditions and exogenous inputs, but whose form does not depend on them – determine both how the state of a system evolves over time, and also the changes in outputs caused by an exogenous change in the initial conditions or by changes in inputs. Equations 1.7 and 1.8 in Chapter 1 express this idea in mathematical notation and extend it to include stochastic systems in which probability distributions over states evolve over time based on initial conditions and inputs.
Example: Dynamic Causal Analysis of the Level of a Single Variable in a Compartment
Setting: One of the simplest examples of dynamic causal analysis involves a single compartment with an inflow of u units of material per unit time; an outflow of y units per unit time; and a state x indicating the number of units of material in the compartment. The compartment could be a bathtub filling with water, a cell in a tissue exposed to a chemical, or a population with births plus immigrations as inflows and deaths plus emigrations as outflows. Suppose that the outflow is proportional to the current contents, y = kx, and that the inflow is constant. The inflow rate u and the outflow rate per unit of content, k, are exogenously determined inputs to the model.
Problem: Find the causal impacts on the steady-state equilibrium level in the compartment of the following two exogenous changes in inputs: (a) Cut u in half. (b) Cut k in half.
Solution: The ODE describing this one-compartment system is dx/dt = u-kx. In steady state equilibrium, the inflow and outflow are equal: u = kx. Hence the steady-state level in the compartment is x* = u/k, and the answers to the questions are that (a) Cutting u in half cuts the steady-state level in half; and (b) Cutting k in half doubles the steady-state level.
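The analytical solution can be checked numerically. The following Python sketch integrates dx/dt = u - kx with a simple forward Euler scheme until the level settles, for a baseline and for each of the two changes in inputs. The values of u and k, the step size, and the time horizon are hypothetical choices made for illustration.

```python
def steady_state_level(u, k, x0=0.0, dt=0.01, t_end=200.0):
    """Integrate dx/dt = u - k*x by forward Euler and return the final level."""
    x = x0
    for _ in range(int(t_end / dt)):
        x += dt * (u - k * x)  # inflow minus outflow over one time step
    return x


# Hypothetical baseline inputs: u = 10 units per unit time, k = 0.5 per unit time.
baseline = steady_state_level(u=10.0, k=0.5)
half_inflow = steady_state_level(u=5.0, k=0.5)         # (a) cut u in half
half_outflow_rate = steady_state_level(u=10.0, k=0.25) # (b) cut k in half

print(f"baseline x* = {baseline:.3f} (u/k = {10.0 / 0.5:.3f})")
print(f"(a) half u: x* = {half_inflow:.3f}")
print(f"(b) half k: x* = {half_outflow_rate:.3f}")
```

The simulated levels agree with x* = u/k: halving u halves the steady-state level, and halving k doubles it.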
Discussion: In this explicit dynamic model, it is clear what is determined by what: the exogenous inputs k and u determine the rate of change in x (via dx/dt = u-kx), and hence the value of x over time given an initial value of x; and x and k at any moment determine the outflow y (via y = kx). The structure of the model can be diagrammed as follows, where arrows here indicate that the value of each variable is derived from the values of the variables that point into it.
Such a structure implies a partial causal ordering of its variables. The direct causes (parents) of a variable are the variables from which its value is derived, i.e., those that point into it. Nodes with no inward-pointing arrows represent exogenous inputs. This concept extends to even very large dynamic simulation models consisting of ODEs and algebraic formulas, allowing a causal ordering of variables based on the structure of derivations of their values from the exogenous inputs (Simon and Iwasaki, 1988).
Today, system dynamics modeling offers software and guidelines for representing dynamic systems by ODEs and mathematical formulas. Modern system dynamics modeling software lets users easily draw diagrams with compartments (boxes) representing dynamic quantities (state variables), thick arrows representing dynamic flows (influxes and effluxes) into and out of the compartments, and thin information arrows showing how the values of some variables, such as inflow and outflow rates, depend on the values of others.
Figure 2.42 System dynamics modeling using the free Insight Maker software (https://insightmaker.com/node/3778)
Figure 2.42 shows a portion of a system dynamics modeling tutorial using the free in-browser software Insight Maker. In this simple susceptible-infected-recovered (SIR) model of infectious disease dynamics in a population, the three possible states (compartments) for individuals at any moment are called Healthy (i.e., susceptible), Infected, and Immune (i.e., recovered and no longer susceptible). Initial conditions specify the number of people or the fraction of the population in each state. In this model, 100% of the population is initially in the Healthy state. Exogenous inputs are the Infection Rate (expressed in units of expected infections per healthy person per year) and the Recovery Rate (expressed in units of expected recoveries per infected person per year). More intuitively, the recovery rate can be viewed as the reciprocal of average recovery time, so that a recovery rate of 2 would correspond to an average recovery time of half a year. The equations describing causality specify that the flow from Infected to Immune is equal to the size of the infected population times the recovery rate; likewise, the flow from Healthy to Infected is the product of Healthy and Infection Rate. These laws could be written explicitly (and in earlier continuous simulation modeling languages, they had to be written explicitly), e.g., as the system of ODEs
d(Healthy)/dt = -Infection_Rate*Healthy
d(Infected)/dt = Infection_Rate*Healthy - Recovery_Rate*Infected
d(Immune)/dt = Recovery_Rate*Infected.
However, it is more parsimonious, and is supported by current system dynamics software packages, for the user to simply specify the equations for each of the two flows:
Infection = Infection_Rate*Healthy
Recovery = Recovery_Rate*Infected.
The software can then complete the ODE specification using the conservation law that the rate of change for any compartment at any time is the difference between its influx and efflux rates at that time.
Once a system dynamics model has been fully specified by specifying its initial conditions (i.e., initial contents in each compartment), exogenous inputs, and equations for flows, it can be run to calculate the time courses of all variables. The software uses numerical integration algorithms to automatically solve for, or simulate, the values of all variables over any user-specified time interval. For the SIR model in Figure 2.42, the percentages of the total population in each of the three states are shown over a time interval of 20 years. These trajectories are generated and displayed automatically by the Insight Maker solver. Special techniques of numerical integration for “stiff” systems can be applied if different variables have transients on very different time scales, but such details of the underlying solvers are typically hidden from users, who need only to specify models and run them to get outputs.
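What the software does behind the scenes can be sketched in a few lines of Python. The sketch below is not Insight Maker code. It takes the two flow equations, Infection = Infection_Rate*Healthy and Recovery = Recovery_Rate*Infected, applies the conservation law to each compartment, and integrates by forward Euler over 20 years. The rate values and step size are hypothetical assumptions, and a production solver would normally use a higher-order method.

```python
def simulate_sir(infection_rate, recovery_rate, healthy=100.0, infected=0.0,
                 immune=0.0, years=20.0, dt=0.001):
    """Return a list of (time, healthy, infected, immune) tuples,
    with compartment contents in percent of the population."""
    trajectory = [(0.0, healthy, infected, immune)]
    steps = round(years / dt)
    for step in range(1, steps + 1):
        # Flow equations supplied by the modeler.
        infection = infection_rate * healthy
        recovery = recovery_rate * infected
        # Conservation law: rate of change = influx - efflux.
        healthy += dt * (-infection)
        infected += dt * (infection - recovery)
        immune += dt * recovery
        trajectory.append((step * dt, healthy, infected, immune))
    return trajectory


# Hypothetical exogenous inputs (per year).
trajectory = simulate_sir(infection_rate=0.5, recovery_rate=0.25)
for t, h, i, r in trajectory[::5000]:
    print(f"year {t:5.1f}: healthy {h:6.2f}%, infected {i:6.2f}%, immune {r:6.2f}%")
```

Because every efflux from one compartment is an influx to another, the three percentages sum to 100 at every time step.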
Dynamic models often contain cycles. Indeed, standard practice in system dynamics simulation modeling is to begin the formulation of a model by using qualitative causal loop diagrams showing signed arrows linking variables into networks of overlapping positive or negative feedback loops to understand which changes in variables tend to reinforce each other via positive loops, and which tend to counter-balance or stabilize each other via negative feedback loops. For example, the infection rate in Figure 2.42 might depend on the fraction of the population infected, Infected, creating a reinforcing causal loop between the number of people already infected and the rate of infection of susceptible people. Thus, dynamic structural models, consisting of ODEs and algebraic equations organized to allow simulation of the values of all quantities over time, given initial conditions and exogenous inputs, are not always equivalent to causal DAG models (Simon and Iwasaki, 1988). They provide time-dependent outputs and detailed descriptions of transients that are typically abstracted away in BNs, DBNs and other DAG descriptions.
Detailed dynamic causal simulation modeling methods include the following:
System dynamics modeling, such as the SIR model in Figure 2.42;
Agent-based models (ABMs, also supported by Insight Maker). In an ABM, local interactions among agents are described by rules or equations and the dynamic evolution of aggregate population variables emerges from these interactions.
Networks of ODEs. These are networks in which each variable changes its own level at a rate that depends on the levels of its neighbors. Important special cases include chemical reaction networks, “S-systems” (Savageau and Voit, 1987) of parametric ODEs used to describe metabolic networks and other biological networks, and related dynamic networks used in systems biology (Machado et al., 2011). Networks of ODEs are especially useful for studying how steady-state equilibrium levels of variables (if they exist) adjust in response to changes in initial conditions or exogenous inputs; how many different equilibria there are; the stability and basins of attraction for different equilibria; the sizes and durations of exogenous disturbances needed to move the network from one equilibrium to another; and whether there are stable periodic solutions or chaotic solutions.
Other network simulation models. There are many other network models in which the state of each element evolves based on the states of its neighbors, although networks of ODEs are among the most flexible and useful classes of such models. For example, Boolean logic networks model each variable as having only two values, conventionally represented by 0 (“off”) or 1 (“on”). Each variable’s value in each period depends on the values of its parents in the previous period; the dependence can be deterministic (as in classical “switching networks” of Boolean elements) or probabilistic. Other network formalisms, such as Petri nets (in which a finite number of tokens at each node move among a finite number of places according to specified rules) and networks of finite-state machines (i.e., finite automata) have been subjects of much theoretical analysis and some applications in systems biology, e.g., in modeling gene regulatory networks (Lähdesmäki, 2006).
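As a small illustration of the last class of models, the following Python sketch simulates a deterministic Boolean logic network with synchronous updating, in which each variable's next value depends only on its parents' values in the previous period. The three variables A, B, and C and their update rules are hypothetical, chosen only to show the mechanics; they do not represent any particular biological network.

```python
# Minimal sketch of a deterministic, synchronously updated Boolean network.
# The variables A, B, C and their rules are hypothetical illustration choices.

rules = {
    "A": lambda s: not s["C"],         # A is switched on when C was off
    "B": lambda s: s["A"],             # B copies the previous value of A
    "C": lambda s: s["A"] and s["B"],  # C requires both A and B to have been on
}

def step(state, rules):
    """Compute every variable's next value from the previous-period state."""
    return {name: int(rule(state)) for name, rule in rules.items()}

def simulate(initial, rules, n_steps):
    """Return the list of states visited, starting from `initial`."""
    states = [dict(initial)]
    for _ in range(n_steps):
        states.append(step(states[-1], rules))
    return states

states = simulate({"A": 1, "B": 0, "C": 0}, rules, n_steps=5)
for t, s in enumerate(states):
    print(t, s)
```

Because C inhibits A while A and B jointly activate C, the network contains a negative feedback loop, and from this starting state it cycles through five states before returning to where it began. A probabilistic variant would replace each rule with a draw from a conditional distribution given the parents' values.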
When enough knowledge about a system is available to create a well-validated dynamic simulation model, perhaps with some of its input values or initial conditions sampled from probability distributions representing uncertainty about their values, it can be used to answer questions about how changes in inputs have affected outcome probabilities (for retrospective evaluation and attributive studies) or will affect them (for probabilistic forecasting, decision support, and policy optimization studies). Such mechanistic modeling provides a gold standard for causal modeling of well understood systems. DAG models can simplify and summarize the input-output relations calculated from these more detailed dynamic models of causal processes, e.g., by using conditional probability tables (CPTs) or other conditional probability models (e.g., CART trees, random forest ensembles, Monte Carlo simulation steps) in a BN to summarize the conditional probabilities of different steady-state output levels for different combinations of exogenous inputs and initial conditions, as determined by more detailed dynamic simulation modeling. Perhaps more importantly for many applications, DAG structures and BNs can also be generated directly from appropriate cross-sectional or time series data, as previously illustrated, even if a more detailed dynamic simulation model is not available.
The analogous task of learning ODE-algebraic or ODE network simulation models directly from input and output observations can be very challenging. A vast engineering literature on dynamic system identification algorithms deals largely with special cases such as linear time-invariant (LTI) systems or single-input, single-output (SISO) systems (Iserman and Münchhof, 2011). BNs, SEMs, and other DAG models and causal graphs provide less detailed representations of causal processes than dynamic simulation models, but are relatively easy to fit to data using non-parametric methods such as those described previously (e.g., using the bnlearn and CAT software). Although they usually cannot answer questions about the dynamic trajectories followed by variables as they adjust to changes in inputs, these models can provide very useful information about how outcomes (or their probabilities, in stochastic models) change as inputs are varied.
Example: A CPT for a One-Compartment Model with Uncertain Inputs
Returning to the one-compartment model with input rate u and output y = kx when the compartment contains x units, recall that for any given initial value of x and for any specified values (or histories, if they are time-varying) of the exogenous inputs u and k, integrating the ODE dx/dt = u(t) - k(t)x(t) provides the time course for x(t). This eventually approaches the steady-state equilibrium value x* = u/k if both u and k are held fixed. Now, suppose that the future value of k is uncertain, being equally likely to be 0.2, 0.3, or 0.5. Then any choice of value for the fixed input u will induce a conditional probability distribution for the resulting steady-state value of x: it is equally likely to be 5u, 3.33u, or 2u. If u has only a few discrete possible values, or if it is discretized to a small grid of possible values, then a CPT displaying probabilities of 1/3 for each of the values 5u, 3.33u, or 2u for each value of u can be assembled. If u is continuous, then a simulation step that samples from these three values with equal probabilities, given any input value u, will represent this conditional probability relation without needing to store an explicit CPT.
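The two representations described in this example can be sketched in a few lines of Python. The sketch below builds an explicit CPT for a discretized grid of u values and, separately, a sampling step for continuous u. The function names, the particular grid of u values, and the use of exact fractions are assumptions made for illustration; only the three equally likely values of k and the relation x* = u/k come from the example.

```python
# Minimal sketch: CPT and simulation step for the steady state x* = u / k,
# with k equally likely to be 0.2, 0.3, or 0.5. Names and the u grid are
# hypothetical illustration choices.
from fractions import Fraction
import random

K_VALUES = (Fraction(1, 5), Fraction(3, 10), Fraction(1, 2))  # 0.2, 0.3, 0.5

def steady_state_cpt(u_grid):
    """Map each u in the grid to {steady-state x*: probability}."""
    p = Fraction(1, len(K_VALUES))
    return {u: {Fraction(u) / k: p for k in K_VALUES} for u in u_grid}

def sample_steady_state(u, rng=random):
    """Simulation step for continuous u: draw k at random, return u / k."""
    return u / float(rng.choice(K_VALUES))

# Explicit CPT for a small grid of input values (integers, so exact fractions).
cpt = steady_state_cpt([1, 2, 3])
for u, row in cpt.items():
    cells = ", ".join(f"P(x*={float(x):.2f}) = {p}" for x, p in row.items())
    print(f"u = {u}: {cells}")

# Sampling alternative for a continuous input value; no table is stored.
draws = [sample_steady_state(2.5) for _ in range(5)]
print(draws)
```

For u = 1 the table rows are x* = 5, 3.33, and 2, each with probability 1/3, matching the values 5u, 3.33u, and 2u stated above. The sampling function plays the role of a Monte Carlo node in a BN: it encodes the same conditional distribution without enumerating rows.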
Example: Causal Reasoning about Equilibria and Comparative Statics
Economists, medical doctors, chemists, and other scientists and engineers make extensive use of causal models that study how the equilibrium in a system changes as exogenous inputs or constraints vary. A demand curve plotting quantity demanded against price shows the equilibrium quantity demanded at each price, without describing the adjustment process or detailing the transients needed to move to the new level when price changes. If the demand curve as a whole shifts up for a certain good (perhaps due to a supply failure for a competing good), while its supply curve remains fixed, then an economist can calculate the new, higher price and the quantity that will be produced and consumed at that price without modeling the transients that accomplish the change. Because it ignores transients, such comparison of equilibria before and after an exogenous change is called comparative statics: it is highly useful for predicting and quantifying the changes that will be caused by exogenous changes in policies, although dynamic models are needed in order to study their timing. Similarly in chemistry and other fields, including economics, Le Chatelier’s Principle states that exogenously varying one constraint or factor among many that jointly determine equilibrium (typically in a negative feedback loop) will shift the equilibrium to oppose, or partly offset, the change. In biology, mechanisms that maintain homeostasis play a similar role, and medicine and physiology make heavy use of comparative statics in diagnosing what underlying changes might have led to observed symptoms of disrupted homeostasis. In general, reasoning about structure – which variables are derived from which others – and function in dynamic systems can draw on a rich mix of well-developed applied mathematical modeling tools for predicting equilibria and, if needed, adjustment transients in both deterministic and stochastic dynamic systems.
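The demand-shift calculation can be illustrated with a minimal Python sketch. It assumes, purely for illustration, linear demand Qd = a - b*P and linear supply Qs = c + d*P; the functional forms, the coefficient values, and the function name are hypothetical and are not taken from the text. Setting Qd = Qs gives the equilibrium price P* = (a - c)/(b + d), and comparing equilibria before and after an upward shift in the demand intercept a is the comparative statics step.

```python
# Minimal sketch of comparative statics with hypothetical linear curves:
#   demand: Qd = a - b * P      supply: Qs = c + d * P
# Equilibrium (Qd = Qs) gives P* = (a - c) / (b + d) and Q* = c + d * P*.

def equilibrium(a, b, c, d):
    """Return (price, quantity) at which linear demand equals linear supply."""
    price = (a - c) / (b + d)
    quantity = c + d * price
    return price, quantity

# Before the exogenous change (illustrative coefficients).
p_before, q_before = equilibrium(a=100.0, b=2.0, c=10.0, d=1.0)

# After the demand curve shifts up (a rises); the supply curve is unchanged.
p_after, q_after = equilibrium(a=130.0, b=2.0, c=10.0, d=1.0)

print(f"before: P* = {p_before}, Q* = {q_before}")
print(f"after:  P* = {p_after}, Q* = {q_after}")
print(f"change: dP = {p_after - p_before}, dQ = {q_after - q_before}")
```

With these illustrative numbers the equilibrium moves from a price of 30 and quantity of 40 to a price of 40 and quantity of 50. Nothing in the calculation describes how long the adjustment takes or what path it follows, which is exactly the information that comparative statics leaves to dynamic models.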
The causal analytics enabled by Bayesian networks and other causal graph models abstracts the structures from such more detailed mathematical models and allows relatively simple but useful modeling of probability relations among levels of variables using much less detailed information.