Evaluation Analytics for Occupational Health: How Well Do Laboratories Assess Workplace Concentrations of Respirable Crystalline Silica?

Introduction
Chapters 8 and 10 introduced important themes of evaluation analytics: discovering, through independent replication of previous work (Chapter 8) and by applying new methods such as modern predictive and causal analytics algorithms to previously collected observational data (Chapter 10), whether published claims are reproducible and whether predicted effects caused by changes in exposures have actually occurred. This chapter provides an example of evaluation analytics in a context where experimentation is possible. It illustrates how a designed experiment with samples having known properties can be used to evaluate how consistently and accurately the laboratory system used to assess compliance of workplaces with occupational safety standards for respirable crystalline silica (RCS) performs in correctly classifying exposure concentrations as above or below a desired level. In this context, causation appears to be clear: concentrations of RCS in air lead to concentrations on air filters sent to laboratories. However, as we shall see, there is enough unexplained noise or random variation in the process that even control samples with no RCS are sometimes mistakenly identified as carrying significant positive loads of RCS (Cox et al., 2015). Thus, the causes of laboratory-reported values include substantial contributions from measurement errors.
Respirable crystalline silica (RCS), which consists of minute quartz particles (sand), causes increased risk of silicosis in people or animals exposed to sufficiently high concentrations for sufficiently long durations. Chapter 9 discussed some of the key causal mechanisms involved. To reduce or eliminate the risk of silicosis in occupationally exposed workers, regulators, employers, employees, and labor organizations have worked together since the 1960s to greatly reduce exposure concentrations of RCS in the air of many workplaces. Figure 11.1 shows that, as hoped, silicosis mortalities in the U.S. have declined dramatically, by over 90% since the 0.10 mg/m3 PEL was established to protect worker health. Evidence from clinical, toxicological, epidemiological, and industrial hygiene studies (Cox, 2011), as well as the historical record in Figure 11.1, suggests that this reduction has been effective in protecting worker health.
Despite this dramatic progress, silica exposures in some workplaces in recent decades have remained far above the current (and former) PEL levels. Such lack of compliance with the 0.10 mg/m3 PEL can harm human health. As noted by the Centers for Disease Control and Prevention (CDC) (MMWR 64(23), June 19, 2015):
Figure 11.1. Annual silicosis-associated mortalities have decreased by over 90% since the 0.10 mg/m3 PEL was established.
Source: U.S. Centers for Disease Control and Prevention, 2016
"Results indicate that despite substantial progress in eliminating silicosis, silicosis deaths continue to occur. Of particular concern are silicosis deaths in young adults (aged 15-44 years). These young deaths likely reflect higher exposures than those causing chronic silicosis mortality in older persons, some of sufficient magnitude to cause severe disease and death after relatively short periods of exposure. A total of 12 such deaths occurred during 2011-2013, with nine that had silicosis listed as the underlying cause of death."
From a risk management perspective, it is natural to wonder whether it is possible to further reduce silicosis mortalities and morbidities through improved compliance monitoring and enforcement and/or by reducing the current PEL. Would a lower PEL prevent more deaths, or, to the contrary, would rigorously enforcing the current 0.1 mg/m3 limit achieve all the human health benefits available from reducing exposures? To find out, it is important to consider how accurately workplace exposures are currently monitored and enforced and the possibilities for improving compliance with PELs. A regulation that tells employers “Do not exceed concentration C” can only be effective if it is possible to determine, with useful reliability, when a workplace is in compliance. As permitted concentrations become lower, determining compliance can become more difficult, requiring increasingly accurate and precise laboratory measurements. Enjoining employers to comply with standards for which compliance cannot be determined reliably risks ineffective action, diverting limited resources to false positives (acting to reduce workplace exposures based on false findings that they are too high) or failing to act due to false negatives (i.e., laboratory findings mistakenly indicating that no action is needed). Such errors and potential for useless activity arise whenever occupational exposures are reduced until they are comparable to the “noise,” or random measurement error and variability, in laboratory results – or, conversely, whenever the random variability in laboratory results is large enough to obscure the effects that they seek to detect. The science-policy question of how best to set PEL concentrations to protect worker health then becomes complicated by the practical reality that compliance with target concentration levels cannot easily be determined.
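The tradeoff just described can be illustrated with a small simulation. The following Python sketch estimates how often a single noisy laboratory result lands on the wrong side of a 0.10 mg/m3 limit; the lognormal error model and the 30% coefficient of variation are illustrative assumptions, not measured properties of any laboratory.

```python
import math
import random

random.seed(1)

PEL = 0.10  # mg/m3, permissible exposure limit
CV = 0.30   # assumed coefficient of variation of a single lab result

def measure(true_conc, cv=CV):
    """One noisy lab result: multiplicative lognormal error with mean 1."""
    sigma = math.sqrt(math.log(1 + cv ** 2))
    mu = -sigma ** 2 / 2  # makes the expected error factor equal to 1
    return true_conc * math.exp(random.gauss(mu, sigma))

def misclassification_rate(true_conc, n=50_000):
    """Fraction of single results on the wrong side of the PEL."""
    wrong = sum((measure(true_conc) > PEL) != (true_conc > PEL)
                for _ in range(n))
    return wrong / n

for c in (0.05, 0.08, 0.10, 0.12, 0.20):
    print(f"true {c:.2f} mg/m3 -> misclassified {misclassification_rate(c):.1%}")
```

Concentrations far from the limit are classified reliably under these assumptions, but near 0.10 mg/m3 a single measurement approaches a coin flip, which is precisely the "noise comparable to the signal" problem described above.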
The remainder of this chapter examines how well today’s workplace RCS concentrations can be determined from the laboratory results that are currently used in determining and enforcing compliance. A 2015 article described an experiment in which filters with known loads of RCS in different matrices (and also as pure crystalline silica samples) were sent to five different commercial laboratories to determine how accurately and reliably they measured these known loads (Cox et al., 2015). The overall results were striking: the laboratories did not consistently discriminate between concentrations that differed by a factor of 2, and even filters with no crystalline silica load were sometimes misidentified as containing substantial RCS concentrations. However, these findings came from an artificial experiment, leaving open the possibility that laboratories perform better in practice on real samples. The analysis in this chapter tests that possibility by quantifying the variability in results from commercial laboratories, including laboratories accredited by the American Industrial Hygiene Association (AIHA) Industrial Hygiene Proficiency Analytical Testing (IH-PAT) program (www.aihapat.org). These include laboratories used by numerous employers and other organizations to determine whether workplace exposures comply with PELs.
Data and Methods
Employers and other organizations often send workplace air sampling filters to accredited laboratories to determine current workplace air concentrations of RCS. The laboratories, in turn, may use several different analytical methods, such as X-ray diffraction, infrared spectroscopy, and colorimetry, to estimate the quantity of RCS on a received filter; of these, X-ray diffraction is widely considered one of the most accurate and reliable analytic methods. All data discussed in this chapter were derived by X-ray diffraction. The laboratory returns to the submitting entity the estimated quantity of RCS (mg) on the filter from a given volume of air sampled. If the values ("lab results") returned are sufficiently high, interventions to reduce workplace air concentrations of RCS may be triggered.
To maintain IH-PAT accreditation, laboratories must meet AIHA-specified criteria for proficiency in estimating quantities of RCS (and other substances) sent to them for analysis. This is done as follows. Four times per year (each time being called a "round"), AIHA sends each participating lab four spiked sample filters, prepared from continuously agitated homogeneous suspensions with four different known concentrations; thus the filters received by different laboratories, called samples, contain approximately the same known loadings of RCS. Figure 11.2 shows a photograph of such samples (front row, clearly identifying the interfering substances) as well as typical real-world samples (back row), which display only unique sample numbers. As explained in AIHA methods documentation, the RCS analytes measured in these accreditation tests consist of "Free silica (quartz) on four 5.0 µm 37-mm PVC (polyvinyl chloride) filter samples containing differing silica concentrations and include a background matrix, on a rotating basis of coal mine dust, talc, calcite, or a combination" (AIHA, 2016).
Each laboratory analyzes the IH-PAT-prepared samples and reports the estimated quantities of RCS back to IH-PAT. The reported results from different laboratories for corresponding samples in the same round should be the same if there are no errors or variability in the process. In practice, as discussed further in the Results section, there is considerable variability in the estimated quantities of RCS for these matched samples (although less than in earlier decades), consistent with previous literature (Harper et al., 2014; Shulman et al., 1992; Abell and Doemeny, 1991). The key statistical criterion for determining a laboratory's proficiency is that its results must fall within three standard deviations of the IH-PAT "assigned value" (or "reference mean"). A laboratory is rated non-proficient if it has failing scores in 2 of the last 3 consecutive rounds (i.e., 2 of the last 12 consecutive test samples). To determine the assigned value for a given sample in a given round, IH-PAT first identifies a subgroup of the participating labs, termed reference labs, based on prior analytical performance. Using the reference labs' reported results, IH-PAT Winsorizes any outliers (replacing them with less extreme values; see https://cran.r-project.org/web/packages/robustHD/robustHD.pdf) and calculates arithmetic means and standard deviations of the reference lab results.
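The assigned-value calculation just described can be sketched in Python as follows. The exact IH-PAT Winsorization rule is not reproduced here; this minimal version replaces the k most extreme values on each side with the nearest retained value, and the reference-lab results shown are hypothetical.

```python
import statistics

def winsorize(values, k=1):
    """Replace the k smallest and k largest values by the nearest retained values."""
    xs = sorted(values)
    lo, hi = xs[k], xs[-k - 1]
    return [min(max(v, lo), hi) for v in values]

def assigned_value(reference_results):
    """Winsorized mean, SD, and the +/- 3 SD acceptance band."""
    w = winsorize(reference_results)
    mean = statistics.mean(w)
    sd = statistics.stdev(w)
    return mean, sd, (mean - 3 * sd, mean + 3 * sd)

def is_acceptable(lab_result, reference_results):
    """Proficiency check for one sample: within 3 SD of the assigned value."""
    _, _, (lo, hi) = assigned_value(reference_results)
    return lo <= lab_result <= hi

# Hypothetical reference-lab results (mg) for one spiked sample:
ref = [0.101, 0.095, 0.110, 0.098, 0.104, 0.092, 0.160, 0.099]
mean, sd, band = assigned_value(ref)
print(f"assigned value {mean:.4f} mg, SD {sd:.4f}, band {band[0]:.4f}-{band[1]:.4f}")
```

Note how the single high outlier (0.160 mg) is pulled in before the mean and SD are computed, so that one aberrant reference lab does not inflate the acceptance band for everyone else.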
Figure 11.2. Real-world (back row) and AIHA-prepared (front row) filters sent to laboratories for analysis.
As described by OSHA (https://www.osha.gov/dsg/etools/silica/faq/faq.html, accessed 6/30/2016):
"The PAT program is designed to help consumers select laboratories that are proficient. In the PAT program analyses of quartz, the "true" values against which a laboratory's results are compared are based on results from reference laboratories that are a subset of the participating laboratories. Assuming that the PAT samples were made from accurately delivered consensus reference material and that the participants all used the same techniques, instrumentation and methodology, and that the samples are not otherwise flawed so as to introduce bias, the best accuracy that can be achieved by consensus analyses is limited by the standard error of the precision of that analysis [SD/(n)½, where SD is the standard deviation in the results among the n reference labs]. …The current method of PAT quartz sample generation is by aerosol generation using "5 micron" Min-U-Sil 5 without cyclones. In addition to any errors in the generation process, this "total dust" approach introduces a sampling error that may not duplicate the sampling error associated with the use of a cyclone.
In the PAT program, these generation and sampling errors are recognized as significant and are evaluated in statistical tests conducted on sub-batches and batches of PAT samples by the contract laboratory that prepares them. …The results obtained by participants in the PAT program therefore include both the analytical error the participating laboratories introduce and an unknown but potentially large amount of error introduced in the generation and sampling of the aerosol. These latter errors may vary batch to batch.”
Multiple years of data on the estimated quantities of RCS returned by different laboratories in response to the spiked samples sent out via the AIHA-PAT program are available on-line in .pdf format at www.regulations.gov/#!documentDetail;D=OSHA-2010-0034-4188 from an AIHA-PAT program submission to OSHA; they are also available as Excel files from the present author. Table 11.1 shows the layout of the data used in all subsequent analyses. The "OrgId" code is a numerical code that uniquely identifies each laboratory (based on our re-coding of the original much longer codes). Data are shown from 26 AIHA-PAT accredited labs (one per row) for the most recent 2 rounds for which data are available (four sample columns per round), since the most recent data are assumed to be most representative and relevant for current testing conditions. Round 194 took place in July of 2013. (Including more years of data reinforces our findings, but risks losing relevance; Harper et al., 2005, discuss key changes over time in the IH-PAT program.) The numbers in Table 11.1 represent estimated quantities (mg) of RCS returned by each lab (row) for each sample and round (column); empty cells indicate missing values, e.g., because a lab dropped out of the AIHA-PAT program.
Table 11.1. Data layout for AIHA-PAT data
In analyzing these data, we emphasized exploratory and descriptive data analysis and non-parametric methods to avoid introducing potentially erroneous and biased modeling assumptions. The following sections present results of statistical plots of conditional empirical cumulative frequency distributions, nonparametric (smooth) regression, and Spearman's rank correlations to compare and visualize laboratory-specific results vs. approximate true values, i.e., the reference values.
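As one concrete, deliberately simplified illustration of the smoothing step, the Python sketch below fits a Nadaraya-Watson kernel regression of laboratory estimates on reference values. The smoother choice, bandwidth, and data points are all stand-ins, not the actual smoother or data used in the figures that follow.

```python
import math

def kernel_smooth(x, y, grid, bandwidth=0.02):
    """Gaussian-kernel (Nadaraya-Watson) estimate of E[y | x] at each grid point."""
    fitted = []
    for g in grid:
        w = [math.exp(-0.5 * ((xi - g) / bandwidth) ** 2) for xi in x]
        total = sum(w)
        fitted.append(sum(wi * yi for wi, yi in zip(w, y)) / total)
    return fitted

# Hypothetical (reference value, lab estimate) pairs, in mg:
x = [0.05, 0.05, 0.10, 0.10, 0.14, 0.14, 0.20, 0.20]
y = [0.04, 0.07, 0.08, 0.13, 0.09, 0.17, 0.16, 0.24]
grid = [0.05, 0.10, 0.15, 0.20]
print([round(v, 3) for v in kernel_smooth(x, y, grid)])
```

Because each fitted value is a weighted average of nearby observations, no parametric curve shape is imposed; the data alone determine how the average laboratory estimate varies with the reference value.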
Results

Figure 11.3 shows individual laboratory estimates of RCS amounts for each reference (estimated true) value. Each small circle represents one result returned by a laboratory. Reference values are on the x axis and the laboratory's corresponding estimated values are on the y axis. If all estimates were perfectly correct and there were no errors in the process, all data points would fall on the line shown (the line of perfect calibration), which equates estimated (y axis) and reference (x axis) values. In practice, individual laboratories return estimated RCS levels that vary widely around the reference values, as indicated by the vertical spread of the individual laboratory results around the line. For example, the spiked samples with a reference value (i.e., estimated true value) of just under 0.14 mg elicited individual laboratory estimates ranging from about 0.06 to over 0.18 mg. Since there are only 26 participating accredited labs in this study, this wide range implies that an employer or other entity who receives a laboratory report of, say, 0.06 or 0.09 mg (the two lowest individual laboratory estimated values for this reference value) cannot be reasonably sure (e.g., 95% confident) that the true value is less than 0.10 mg.
Figure 11.3. Individual laboratory values (circles) are widely spread around reference mean values (line). All values are in mg.
Conversely, the wide vertical scatter of estimates around the reference values that are below 0.10 mg implies that receiving a lab result of 0.12 mg, or even 0.18 mg, does not establish that the true value exceeds 0.10 mg.
Figure 11.4 provides a different perspective on this variability in laboratory results by showing laboratory-estimated values (rounded to the nearest 0.01 mg) on the x axis and corresponding reference values on the y axis. A nonlinear regression (nonparametric smoother) curve is fit to this scatterplot. Very high sample values returned by laboratories tend to over-estimate the reference values (e.g., a sample value of 0.24 mg returned by a laboratory corresponds to an average reference value of about 0.20 mg, as estimated by the regression curve); conversely, very low individual laboratory values tend to under-estimate corresponding reference values. There is very substantial variability in the reference values corresponding to a single estimated sample value, as shown by the vertical range of results (small circles) for specific sample values on the x axis.
Figure 11.4. Estimated true values (y axis) are widely spread around sample values (x axis). All values are in mg.
The small circles in Figures 11.3 and 11.4 show that, collectively, laboratory results are quite variable, despite the care exercised by the AIHA-PAT program in preparing spiked samples that should all yield closely similar values in the absence of laboratory error. It is natural to wonder whether this might be due to a few individual laboratories that are consistently higher or lower than the rest. To find out, we used Spearman's rank correlations to test whether laboratories that gave higher (or lower) estimates than most others in one quarter also tended to do so over time. The test was carried out by computing the 28 (= (8*7)/2) pairwise ordinal correlations between rankings of labs based on sizes of RCS estimates (for the same reference value) for the 8 sample columns in Table 11.1. Six of the 28 Spearman's rank correlations differed significantly from zero at the conventional p = 0.05 significance level, and all six were positive (with numerical values between 0.39 and 0.73). This provides significant evidence that laboratories that give relatively high or low results compared to others in one quarter are more likely to do so again in another quarter. However, the effect is relatively small, and the wide range of variability in sample RCS estimates shown in Figures 11.3 and 11.4 reflects variability that is more pervasive than one or a few laboratories.
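The rank-correlation calculation can be sketched as follows. The three-lab table below is hypothetical, and the self-contained Spearman function (ranking labs within each sample column, then taking the Pearson correlation of the rank vectors for each pair of columns) stands in for whatever library routine might be used in practice.

```python
from itertools import combinations

def rank(values):
    """1-based ranks, with ties sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    ra, rb = rank(a), rank(b)
    n = len(ra)
    ma, mb = sum(ra) / n, sum(rb) / n
    cov = sum((p - ma) * (q - mb) for p, q in zip(ra, rb))
    va = sum((p - ma) ** 2 for p in ra)
    vb = sum((q - mb) ** 2 for q in rb)
    return cov / (va * vb) ** 0.5

# Hypothetical table: rows = labs, columns = sample columns (mg).
table = [[0.09, 0.11, 0.08],
         [0.12, 0.14, 0.11],
         [0.10, 0.12, 0.10]]
columns = list(zip(*table))
for (i, a), (j, b) in combinations(enumerate(columns), 2):
    print(f"columns {i} vs {j}: rho = {spearman(list(a), list(b)):.2f}")
```

A consistently positive rho across column pairs would indicate labs that are persistently high or low relative to the others, as found (weakly) in the actual data.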
Discussion

The practical implications of the variability in laboratory estimates of RCS quantities are potentially important to employers, employees, and regulators who rely on such results to determine compliance and the need for interventions. If a laboratory returns an estimated value of 0.06 mg for a submitted air sample filter, for example, then Figure 11.4 shows that the corresponding reference values (and hence the true values that they approximate) range from slightly above 0.06 mg to about 0.14 mg (based on the empirically observed range of reference values that generated estimated values of 0.06 mg when sent to the 26 accredited laboratories). One of the four distinct reference values shown for an estimated value of 0.06 mg is 0.14 mg, and all four are higher than the estimated value of 0.06 mg, so a returned value of 0.06 mg provides little confidence that the true value is less than approximately 0.10 mg. Similarly, a laboratory estimate of 0.13 mg could correspond to a true value anywhere between about 0.06 mg and 0.20 mg, while an estimate of 0.18 mg could correspond to a true value anywhere between about 0.08 and 0.20 mg. Thus, no laboratory result between 0.06 and 0.18 mg can be relied on by the employer, employee, or regulator to confidently (e.g., with 95% confidence, or even 90% confidence) discriminate between true (or reference) values above and below 0.10 mg. To the contrary, returned values between 0.06 mg and 0.18 mg (spanning most of the design range for spiked sample values, which runs from 0.05 to 0.20 mg) convey very little information about the probable true value, since any returned value in this range is compatible with a wide range of true values.
That laboratories provide such relatively uninformative estimates for individual samples in no way contradicts or undermines the fact that, on average, higher sample values do indeed correspond to higher reference values, as shown by the nonparametric regression curve in Figure 11.4. However, individual employers and employees cannot get the benefits of this useful aggregate relation, since they receive only the individual results of submitted sample filters, and these individual results are too variable to provide trustworthy indications of whether the sampled workplaces are above or below a given limit. Acting to reduce RCS exposures (e.g., by increasing dust controls or administrative controls and use of respirators), or failing to take such measures, on the basis of laboratory-measured values does not provide a reliable approach to taking action when, and only when, appropriate.
Conclusions

Various authoritative agencies attribute great value to the information provided by laboratory analyses of RCS. For example, the U.S. Occupational Safety and Health Administration states that "Analytical results on the quartz content of the air samples are necessary to evaluate whether the OSHA PEL is exceeded" (www.osha.gov/dsg/etools/silica/faq/faq.html). AIHA states that "The purpose of proficiency testing is to provide interested parties with objective evidence of a laboratory's capability to produce data that is both accurate and repeatable for the activities listed in its scope of accreditation. A laboratory's competence can be demonstrated through favorable proficiency testing data. This is important to clients, potential customers, accreditation bodies, and other external entities." Likewise, the International Organization for Standardization (ISO), for which the IH-PAT program provides conformity assessment, states that "The need for ongoing confidence in laboratory performance is not only essential for laboratories and their customers but also for other interested parties, such as regulators, laboratory accreditation bodies and other organizations that specify requirements for laboratories" (ISO/IEC, 2010). These statements about the importance of trustworthy laboratory performance, however true, stop short of addressing the fundamental challenge revealed by the data in this study: the variability in laboratory results is large enough compared with a 0.10 mg/m3 limit that results returned by laboratories do not reliably indicate whether workplace RCS concentrations are above or below that limit.
The current (as of October, 2013) AIHA-PAT program considers laboratories proficient as long as their results fall within an interval of six standard deviations (three in each direction) of the mean for reference laboratories for at least 75% of the silica samples at least 2/3 of the time (i.e., in at least 2 of each 3 consecutive rounds): “IHPAT participant results are rated acceptable or unacceptable for each unique analyte sample number. … A passing score is 75% or more acceptable results for an analyte group. A participant is rated proficient for the applicable IHPAT analyte group if the participant has a passing score for the applicable IHPAT analyte group in two (2) of the last three (3) consecutive PT rounds.” (www.aihapat.org/Programs/IHPAT/Documents/IHPAT%20Scheme%20Plan%20R2.pdf , p.16) For the IH-PAT program data in Table 11.1, Figures 11.3 and 11.4 show that this is not a sufficiently demanding criterion to assure usefully accurate and reliable results.
Previous research has noted that large differences among laboratories contribute to variability in PAT program estimates of the RCS content in samples (Shulman et al., 1992), although variability has declined since the introduction of the PAT program in 1972, in part due to changes in RCS samples and in laboratory procedures (ibid.; Harper et al., 2014). Maciejewska (2006) found that regular use of quality control methods for free silica determination was positively associated with proficiency of laboratories, suggesting that the variability of PAT estimates for RCS can potentially be reduced by such methods. Thus, although Figures 11.3 and 11.4 show that current (2013) variability in laboratory estimates is too great to discriminate reliably among reference concentrations in the range of approximately 0.06 mg to 0.18 mg (and hence to assess compliance with PELs for workplaces with concentrations between about half and about double a PEL corresponding to 0.10 mg), it is plausible that this variability could be reduced by stricter quality control for laboratories making RCS determinations, as OSHA's 2016 final rule reducing the PEL from 0.10 to 0.05 mg/m3 requires; demonstrating that such a reduction has actually been accomplished appears to be a prerequisite for obtaining reliable results (Lee et al., 2016). More stringent statistical quality requirements may also be essential for obtaining more useful results. For example, instead of requiring only that laboratories come within three standard deviations of the reference value on 75% or more of samples in at least 2 of 3 consecutive rounds, AIHA might adopt the NIOSH criterion that "the method must provide results that are within ±25% of the expected ("true") values at least 95 times out of 100" (Ashley, 2015). Such accuracy would require greatly reducing the variability shown in Figures 11.3 and 11.4, where far more than 5% of results lie further than 25% away from the expected values (given by the regression curves).
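The NIOSH-style criterion quoted above is straightforward to state as a computation. In this Python sketch, the (true value, lab result) pairs are hypothetical, chosen only to echo the ranges discussed in this chapter:

```python
def meets_niosh_criterion(pairs, tolerance=0.25, required_fraction=0.95):
    """True if at least the required fraction of results fall within
    +/- (tolerance * true value) of the true value."""
    within = sum(abs(est - true) <= tolerance * true for true, est in pairs)
    return within / len(pairs) >= required_fraction

# Hypothetical (true value, lab result) pairs in mg:
pairs = [(0.10, 0.09), (0.10, 0.13), (0.14, 0.06), (0.14, 0.15),
         (0.05, 0.05), (0.20, 0.24), (0.20, 0.19), (0.10, 0.18)]
print(meets_niosh_criterion(pairs))
```

Applied to data resembling Figures 11.3 and 11.4, where many results miss their reference values by well over 25%, a criterion of this form would fail, which is the point of the comparison above.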
This might be accomplished either by having individual laboratories make greater use of quality control measures for RCS determinations or by modifying compliance determination rules to make greater use of averages of RCS values assessed by multiple laboratories, to adjust for the fact that average values are relatively reliable, but individual sample values are currently too variable to meet accuracy criteria such as NIOSH’s.
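The statistical rationale for averaging is the familiar square-root law: the standard error of a mean of k independent results is the single-result standard deviation divided by the square root of k. The 0.03 mg single-lab SD below is an illustrative assumption, not a measured value:

```python
import math

single_lab_sd = 0.03  # mg; illustrative single-laboratory SD (assumed)

for k in (1, 2, 4, 9):
    se = single_lab_sd / math.sqrt(k)
    print(f"average of {k} lab result(s): standard error {se:.4f} mg")
```

Averaging nine independent results would cut the standard error to one third of the single-result value, though at the cost of sending each filter (or parallel filters) to multiple laboratories.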
Our findings also have potentially important implications for monitoring and enforcement of the new (2016) OSHA limit of 0.05 mg/m3 and the new action level of 0.025 mg/m3. In general, as quantified in earlier studies (Cox et al., 2015), current laboratory procedures do not reliably quantify crystalline silica levels that differ by a factor of 2 (e.g., 0.10 vs. 0.05 mg/m3), and performance does not improve at lower exposure levels, so the information obtained from laboratories today is inadequate for monitoring and enforcing compliance even with the 0.10 mg/m3 PEL (Lee et al., 2016), and a fortiori for the new, lower PEL (Cox et al., 2015; Lee et al., 2016). Until laboratory practices and/or statistical protocols for determining compliance are radically improved to give demonstrably more accurate results, the current high prevalence of false-positive and false-negative conclusions about compliance implied by Figure 11.4 of this study can be expected to inhibit effective allocation of resources to improve and protect worker health. Compliance with the recent 0.10 mg/m3 or the current 0.05 mg/m3 PEL cannot be determined reliably without substantial improvement in current laboratory and statistical practices. Until these improvements are made and credibly demonstrated, enforcement activities that are based primarily on random noise or error in laboratory results will provide neither incentives nor capability to continue reducing out-of-compliance levels of exposure that could threaten worker health.
References

Abell MT, Doemeny LJ. Monitoring the performance of occupational health laboratories. Am Ind Hyg Assoc J. 1991 Aug;52(8):336-9.
AIHA (2016). Scope of Accreditation to ISO/IEC 17043:2010. AIHA Proficiency Analytical Testing Programs. www.a2la.org/scopepdf/3300-01.pdf
Ashley K. NIOSH Manual of Analytical Methods 5th Edition and Harmonization of Occupational Exposure Monitoring. Gefahrst Reinhalt Luft. 2015;2015(1-2):7-16.
Cox LA Jr, Van Orden DR, Lee RJ, Arlauckas SM, Kautz RA, Warzel AL, Bailey KF, Ranpuria AK. How reliable are crystalline silica dust concentration measurements? Regul Toxicol Pharmacol. 2015 Oct;73(1):126-36.
ISO/IEC (2010). ISO/IEC 17043:2010(en) Conformity assessment — General requirements for proficiency testing www.iso.org/obp/ui/#iso:std:iso-iec:17043:ed-1:v1:en
Harper M, Sarkisian K, Andrew M. Assessment of respirable crystalline silica analysis using Proficiency Analytical Testing results from 2003-2013. J Occup Environ Hyg. 2014;11(10):D157-63.
Lee RJ, Van Orden DR, Cox LA, Arlauckas S, Kautz RJ. Impact of muffle furnace preparation on the results of crystalline silica analysis. Regul Toxicol Pharmacol. 2016 Jun 16.
Maciejewska A. [Analysis of the competences of workplace inspecting laboratories for the determination of free crystalline silica (FCS), based on proficiency testing results].Med Pr. 2006;57(2):115-22. Polish.
Shulman SA, Groff JH, Abell MT. Performance of laboratories measuring silica in the Proficiency Analytical Testing program. Am Ind Hyg Assoc J. 1992 Jan;53(1):49-56.