Fig. 4.7. Plots of simulated variable Y = max(0, 10X + error) vs. X (left) and of the simulated DSM ratio Y/X vs. X (right), where error ~ U[-10, 10]). The declining simulated DSM ratios on the right are compatible with the simulated linear metabolism on the left; they reflect the arithmetic fact that small denominators are associated with large ratios and large denominators are associated with small ratios.
Tying up some loose ends: Joint frequency distributions of variables and intra-individual variability of DSM ratios Price et al. (2012) and the Berkeley group (Kim, 2006b, Rappaport et al., 2010, 2013) have debated how to best to define the “background” levels of metabolites to subtract from exposure-dependent increases; whether it is better to use arithmetic or geometric means; and how to estimate levels of benzene exposure from urinary benzene measurements when no measurements in air are available. In light of the wide variability in measured metabolite concentrations at each air benzene level in Figure 4.6 (with substantial variability persisting even if the data are log-transformed), the following comments consider the implications of such distributions for previously debated issues.
Table 4.5 displays selected benzene metabolite concentrations in urine from workers at Factory 3, the control factory without occupational benzene exposure for different levels of urinary benzene (UB). It is clear that there is substantial variability in metabolite concentrations for these occupationally unexposed workers; indeed, these distributions of values overlap substantially with the corresponding distributions of metabolite levels for exposed workers from Factory 1 (CA interquartile range from 10.8 to 30.2 for Factory 1 workers compared to 7.4 to 18.2 for Factory 3; HQ interquartile range from 9.7 to 27.0 for Factory 1 compared to 4.5 to 10.6 for Factory 3; MA interquartile range from 4.4 to 24.7 and MA 10th and 90th percentiles of 2.2 and 44.4 for Factory 1 compared to 0.61 to 1.8 for Factory 3. Only for MA is the interquartile range, i.e., the interval between the lower and upper quartiles of the frequency distribution of values, clearly higher for workers in Factory 1 than the values shown in Table 4.5 for workers from Factory 3.) This suggests that simply subtracting off any single number to represent the background level of a metabolite is an over-simplification: background levels of metabolites are better modeled as following distributions – or, more accurately, a joint frequency distribution – in the population.
Table 4.5. Mean values of each benzene metabolite concentration CA (M), HQ (M), and MA (M) and number of person-days of data, N, for each level of urinary benzene concentration (UB in nM) for workers in the control Factory 3.
UB rounded to nearest nM
More generally, the variables may be viewed as having a joint frequency distribution. Figure 4.8 shows the structure of a Bayesian network (BN) that represents this empirical joint distribution via products of marginal frequencies of values at input nodes (those with only outward-pointing arrows, such as Gender and AB) and conditional frequencies of values, given the values of their inputs, at all other nodes.
Fig. 4.8. Bayesian network (BN) learned by bnlearn from data for benzene-exposed workers in Factories 1 and 2. An arrow between two variables (in either direction) indicates that they are statistically dependent, i.e., informative about each other.
Roughly speaking, the BN structure in Figure 4.8 identifies two variables as informative about each other, even after conditioning on other variables, if and only if they are linked by an arrow (in either direction, by Bayes’ rule). Otherwise, they are conditionally independent of each other. The BN in Figure 4.8 was discovered automatically from the Tianjin data by the bnlearn package in R, accessed via the CAT add-in for Excel. (Smoking was excluded because of missing data values.) Figure 4.8 sheds light on how best to estimate air benzene levels (AB) from other measured variables. It implies that AB should be estimated from MA and UB levels, as well as from the factory at which a worker is employed. UB levels depend on CA (which depends on MA and factory) as well as AB. Attempts to infer AB from UB should therefore adjust for levels of other variables, especially MA. That the factory at which a worker is employed is identified as an independent predictor of MA and CA as well as AB suggests that different factories might have different practices (perhaps leading to more or less dermal exposure) or that other local differences might provide explanations for metabolite levels that are unrelated to the occupational inhalation exposures we have focused on.
Figure 4.9 visualizes the result that both UB and MA are informative about AB by estimating contours of equal AB concentrations for different combinations of UB and MA. The non-parametric regression method used in Figure 4.9 to estimate contours from data on combinations of MA, UB, and AB values for exposed workers is the default spline surface-fitting algorithm in the commercial Statistica package, but other non-parametric regression methods (distance-weighted least squares) and polynomial regression yield similar conclusions: AB depends on MA as well as UB. A high level of MA paired with a low level of UB still predicts a high level of AB. If AB depended only on UB, then the contours in Figure 4.8 would be vertical lines, which is not the case.
Table 4.6 gives a more quantitative assessment of the relative importance of different variables for predicting AB using a standard non-parametric machine learning algorithm (random forest, implemented via the R package randomForest with all options at their default values, accessed from Excel via the CAT add-in). The metric of importance in Table 4.6 is the estimated increase in mean squared prediction error caused by dropping each variable. By this criterion, MA is the single most important predictor, and calibration models in the literature that depend only on Factory and UB omit the three most important variables (MA, CA, and HQ), any or all of which would allow more accurate predictions of air benzene than UB.
Fig. 4.9. Air benzene levels (ppm) are best predicted for workers at each factory from MA (M) (vertical axis) as well as UB levels (nM) (horizontal axis). AB is likely to be high if MA or UB is high. Calibration models that assume that AB depends only on UB would imply vertical contours, which are not observed. Contours are estimated by regression splines in Statistica® (all settings at default values).
Table 4.6. Estimated importances of different variables for predicting air benzene (AB) using a random forest ensemble of classification and regression trees. MA is the single most useful predictor, reducing mean squared prediction error by about 4 times as much as UB.
From most to least important, the relative importances of these potential causes are as follows:
A variable's importance is measured here as the increase in mean squared error in predicting < AB > if the variable is dropped.
A different way to quantify this is to examine how much of the total variance in air benzene is explained by each subset of predictors using the randomForest package via the Sensitivity Plots tab of CAT. The results are that Factory alone explains 29% of the total variance; UB and Factory together explain 40%; MA and Factory explain 56%; MA and UB explain 64%; and UB and MA and Factory explain 65%. As implied by the structure of the Bayesian network in Figure 4.7, including the other variables as predictors, such as Gender, Age, CA and HQ, does not significantly further improve prediction of AB once UB, MA and Factory are known. Only about 65% of the variance in AB can be explained by the other variables, and including MA, UB, and Factory achieves this. For comparison, a main-effects multiple linear regression model using the same variables explains about 62% of the variance in AB (R2 = 63%, adjusted R2 = 62%), so the additional flexibility of the random forest non-parametric model ensemble to represent nonlinearity and interactions among variables contributes relatively little improvement compared to a straightforward linear model.
Given the identified importance of MA as a predictor of AB (Table 4.6 and Figure 4.9) and HQ and CA (Figure 4.8), it is worth examining how AB, CA, and HQ vary with MA. Figure 4.10 uses scatter plots of these variables on the vertical axes (AB in the upper panel, HQ and CA in the lower panel) to visualize the answer. Even ignoring the less important predictors UB and Factory, it is clear that the nonparametric regression curves through these scatter plots are approximately linear for MA < 100 M, corresponding roughly to AB < 10 ppm. For HQ, this linearity extends over the entire range of observed MA values. There is no evidence of any departure from linearity for these metabolites at benzene concentrations less than 10 ppm, and no evidence of low-dose supra-linearity in production of HQ or CA.
A final question of great scientific and public healthy interest is the extent to which different individuals consistently metabolize benzene differently. If a DSM ratio such as UB/AB or a metabolite ratio such as MA/PH is exceptionally high or low for an individual on the first day it is measured, is it likely to be exceptionally high or low again the next time it is measured? Without presenting details, we can easily summarize the empirical answer: comparing the ratios on different days for individuals for whom measurements were collected more than once shows that the answer is no. The ratios on one day are approximately independent of ratios on successive days of measurement. The wide scatter in values of ratios appears to be due not mainly to inter-individual heterogeneity, with some individuals being consistently more or less
Fig. 4.10. Metabolite MA (M) (horizontal axes) is approximately linearly related to air benzene (AB) in ppm (vertical axis, upper panel) and to metabolites HQ and CA (vertical axis, lower panel) for all AB values < 10 ppm and MA values < 100 M. Nonparametric (lowess) regression curves are shown for each scatter plot.
efficient metabolizers than others, but rather to day-to-day random variation in measurement results. This emphasizes again the importance of accounting for the distributions of these random variables in understanding and quantifying the effects of benzene exposures on benzene metabolites.
DISCUSSION AND CONCLUSIONS Figures 4.2 to 4.10 provide a fresh look at data on low-dose metabolism of benzene (below 10 ppm of benzene in air) in workers from Tianjin China using plots and non-parametric analyses to visualize how metabolite levels vary with benzene concentrations in air. Our main conclusions, as they relate to previous discussions and debates about how to interpret the Tianjin metabolite data, are as follows.
The data provide no reason to reject the hypothesis that benzene metabolism is linear at benzene concentrations below 10 ppm (Figures 4.2, 4.3, 4.4, 4.5, and 4.10).
Price et al. and the Berkeley group (e.g., Rappaport et al., 2013) appear to agree that the Berkeley group’s previously published conclusions about supra-linear metabolism at low concentrations of air benzene are sensitive to assumptions about how “background” exposure levels should be defined, while disagreeing on which definition is best justified.
The Berkeley group appears to be correct that average dose-specific metabolism (DSM), i.e., concentration of metabolite per ppm of benzene in air, is a decreasing function of its denominator, ppm of benzene in air (Figure 4.6). The larger its denominator, the smaller the DSM ratio. Specifically, Figure 4.6 confirms the finding of Kim et al. (2006b) that “Mean trends of dose-specific levels (micromol/L/ppm benzene) of E,E-muconic acid, phenol, hydroquinone, and catechol all decreased with increasing benzene exposure… [and most] of the reductions in dose-specific levels occurred below about 3 ppm for each major metabolite.”
However, this arithmetic relationship between DSM ratios and their denominators does not imply “that benzene metabolism is highly nonlinear with increasing benzene exposure above 0.03 ppm, and that current human toxicokinetic models do not accurately predict benzene metabolism below 3 ppm” (Kim et al., 2006b) or “provide extremely strong statistical evidence that benzene is metabolized to PH and MA via two enzymes rather than one enzyme, and that the putative high-affinity enzyme is active primarily below 1 ppm” (Rappaport et al., 2010). Instead, this pattern is entirely consistent with linear metabolism (Figures 4.6 and 4.7).
To us, it seems plausible that the reason the Berkeley group’s putative high-affinity enzyme (Kim et al., 2006b, Rapport et al., 2010) still has not been found after over a decade of continued research and publications is that it does not exist: the declining DSM pattern illustrated in Figure 4.6 is readily explained by the algebra of random variables and the presence of substantial measurement error variance (i.e., DSM ratios are negatively correlated with their denominators even if mean metabolite production is proportional to air benzene), not on any underlying toxicological phenomenon.
Our main conclusion is thus that there is actually no necessary conflict between the claim that low-dose metabolism appears to be linear (or at least that the Tianjin data provide no clear reason to think otherwise) and that DSM ratios are decreasing with air benzene: both can be true. We have also briefly examined other aspects of the Tianjin data, concluding that the distributions of levels of metabolites other than MA among workers in Factory 1 (occupationally exposed to benzene) and Factory 3 (not occupationally exposed to benzene) overlap substantially, and that unmeasured levels of benzene in air can be reconstructed much better from measured levels of MA (ideally combined with information on UB an factory) than from measured levels of UB.
The analyses we have presented emphasize the value of plotting and inspecting relationships among variables in raw data without imposing modeling assumptions and toxicological interpretations upon them. Examining the data via non-parametric methods, including interaction plots, scatter plots, non-parametric regression, Bayesian network-learning algorithms, and random forest model ensembles, has shown that metabolism of benzene at concentrations below 10 ppm is linear to an excellent approximation (e.g., Figure 4.10 and Table 4.4). Declining DSM levels as a function of air benzene are not inconsistent with this linear metabolism (e.g., Figure 4.6). We hope that these observations will help to resolve some of the controversies in recent literature on low-dose metabolism of benzene. We join the Berkeley group (Thomas et al., 2014) in encouraging other practitioners to apply non-parametric descriptive analytics methods, as well as data visualization, to help avoid and resolve confusions and controversies that arise from the use of models and modeling assumptions of unknown validity. The descriptive analyses in this chapter illustrate the potential value of non-parametric methods in explaining and resolving controversies in the interpretation of benzene metabolism data.