TGM for the Implementation of the Hawai'i State Contingency Plan
Section 4.1
SAMPLING THEORY AND VARIABILITY OF CONTAMINANT CONCENTRATIONS IN SOIL

4.1 SAMPLING THEORY AND VARIABILITY OF CONTAMINANT CONCENTRATIONS IN SOIL

4.1.1 LARGE-SCALE AND SMALL-SCALE VARIABILITY

The term "large-scale" is used in this document to describe variability in mean contaminant concentrations between distinctly different areas of a site, such as the "spill area" DUs, "exposure area" DUs, and "perimeter area" DUs described in Section 3. The identification and characterization of such areas is often an objective of an environmental investigation. The term "small-scale" is used to describe variability in mean contaminant concentrations below the designated scale of interest. This includes variability over short distances around discrete soil samples or individual increments as well as within an individual discrete sample or increment collected. Small-scale variability can be highly random in nature and unrelated to large-scale trends of interest. While it is important to capture and represent small-scale variability in a sample collected to represent a DU, the precise nature of small-scale variability within a DU is ultimately unknowable and not pertinent to the objectives of the investigation.

The concentration of a contaminant in soil will vary based on the mass of soil tested. A single value would be reported if the entire DU mass of soil within a targeted exposure area could be collected, extracted and analyzed as a single sample. The value reported represents the true mean concentration for the volume of soil as a whole. The concentration of the contaminant will vary above and below the true mean if smaller subsets of the soil are tested.

For example, a single mean contaminant concentration will represent a targeted Spill Area or Exposure Area DU (Figure 4-1). If the DU were divided into four subareas for independent testing, the concentration of the targeted contaminant can be expected to be higher in some soil volumes (red blocks) and lower in others (yellow blocks; see Figure 4-1). Variability can be expected to increase as the area is divided into smaller and smaller soil volumes for testing. This distributional heterogeneity ultimately extends down to the scale of individual, adjacent molecules, with the concentration of the contaminant being 100% in one molecule and 0% in the other. At this extremely small scale, the simple question of the "maximum" concentration of a contaminant in soil is therefore very straightforward; it is either 100% (if present) or 0% (if absent).
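The effect of subdividing a DU can be illustrated with a minimal simulation. The sketch below is not part of the HDOH field study. It assumes a hypothetical DU made up of 4,096 small soil volumes whose concentrations follow a lognormal distribution; the distribution, its parameters and the function name subdivision_spread are illustrative assumptions only. The code reports the lowest and highest block mean obtained as the DU is split into progressively more blocks.

```python
import random
import statistics


def subdivision_spread(n_cells=4096, levels=(1, 4, 16, 64, 256, 1024), seed=0):
    """Return {number_of_blocks: (lowest_block_mean, highest_block_mean)}.

    The DU is modeled as a row of small soil volumes ("cells"). The cell
    concentrations are HYPOTHETICAL lognormal values (mg/kg), not site data.
    """
    rng = random.Random(seed)
    cells = [rng.lognormvariate(0.0, 1.5) for _ in range(n_cells)]
    result = {}
    for n_blocks in levels:
        size = n_cells // n_blocks
        # Mean concentration of each contiguous block of cells.
        means = [
            statistics.fmean(cells[i * size:(i + 1) * size])
            for i in range(n_blocks)
        ]
        result[n_blocks] = (min(means), max(means))
    return result


for n_blocks, (low, high) in subdivision_spread().items():
    print(f"{n_blocks:5d} blocks: block means range from {low:.3f} to {high:.3f} mg/kg")
```

Because each level of subdivision nests within the previous one, the range of block means can only stay the same or widen as the blocks become smaller, while the single mean for the undivided DU does not change.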

Figure 4-1. Variability of Mean Contaminant Concentration within Progressively Smaller Areas and Volumes of Soil within an Initially Designated DU

At some scale contaminant concentration variability will become random and unrelated to larger-scale trends. Variability at the scale of the mass of soil typically collected as a "discrete" sample (e.g., 100-200 g) and tested by the laboratory (e.g., 1-30 g) clearly falls into this category (Figure 4-2).

Figure 4-2. Mass of Soil Typically Tested by a Laboratory
A) Metals (1 gram), B) Volatile organic compounds (5 grams; approximate mass to fill a soda bottle cap), and C) Organic chemicals (10-30 grams)

Keep in mind that the true size of a discrete sample is the actual extraction and analysis mass removed from the original field sample at the laboratory. For example, the standard commercial lab subsample masses are: 0.5 grams for Hg; 1 gram for metals; 5 grams for VOCs; 10 grams for dioxins; and 30 grams for TPH, pesticides, and PAHs. For comparison, the cap of a soda bottle holds approximately 5 grams of soil, which is the size of a laboratory subsample tested for VOCs (Figure 4-2).

This scale of variability was demonstrated in a field study carried out by the HEER Office in 2014 (HDOH, 2015b). Hundreds of discrete soil samples were tested at each of three study sites. Figure 4-3 depicts a study area sampled within a former radio broadcasting facility known to be heavily contaminated with polychlorinated biphenyls (PCBs; Study Site C). A 6,000 ft² area was selected for characterization as a hypothetical Exposure Area DU. Multi Increment sample replicate data indicated a mean PCB concentration for the area of 104 mg/kg (95% UCL 346 mg/kg). The high Relative Standard Deviation for the replicate data (138%) indicates significant heterogeneity and a need to increase the number of increments used and/or subdivide the original DU into smaller DUs for more precise characterization.
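The relationship between the replicate statistics quoted above can be checked with simple arithmetic. The sketch below assumes that the replicate set was a triplicate (n = 3) and that the 95% UCL was calculated with a one-sided Student's t statistic. Neither assumption is stated in this section, although together they reproduce the reported value of approximately 346 mg/kg. The function names are illustrative.

```python
import math

# One-sided 95% Student's t critical values, keyed by degrees of freedom (n - 1).
T_95 = {2: 2.920, 3: 2.353, 4: 2.132}


def sd_from_rsd(mean, rsd_percent):
    """Standard deviation implied by a mean and a Relative Standard Deviation."""
    return mean * rsd_percent / 100.0


def student_t_ucl95(mean, sd, n):
    """One-sided 95% upper confidence limit of the mean (Student's t)."""
    return mean + T_95[n - 1] * sd / math.sqrt(n)


mean_pcb = 104.0   # mg/kg, as reported in the text
rsd = 138.0        # percent, as reported in the text
n_replicates = 3   # ASSUMPTION: triplicate samples (not stated in the text)

sd = sd_from_rsd(mean_pcb, rsd)
ucl = student_t_ucl95(mean_pcb, sd, n_replicates)
print(f"Implied standard deviation: {sd:.1f} mg/kg")
print(f"Student's t 95% UCL:        {ucl:.0f} mg/kg")
```

With so few replicates the t statistic is large, so a high RSD translates directly into a UCL several times greater than the mean.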

Figure 4-3. Study Site C in 2014 HDOH Field Investigation of Discrete Sample Variability

Soil types: A) Native soil, B) Mixed fill and native soil, C) Fill. Electrical equipment was formerly stored in the area underlain by fill material. Dashed lines indicate hypothetical division of original study site area into smaller DUs for more detailed characterization.

The site history, as well as discrete soil sample data collected as part of the study, suggests an overall higher concentration of PCBs in the eastern half of the study site where electrical equipment was formerly stored. This area is observable in the field by the presence of reddish fill material. This information could be used to divide the original study site into smaller DUs for more detailed characterization, if needed, for decision making purposes (see dashed lines in Figure 4-3).

Figure 4-4. Example "Inter-Sample" Variability of PCB Concentrations in Soil
Data for processed, discrete samples collected within a 50 cm radius of a grid point (HDOH, 2015).

Figure 4-5. Example "Intra-Sample" Variability of PCB Concentrations in Soil
Data for ten subsamples tested from a single discrete soil sample (HDOH, 2015).

An attempt to use discrete soil sample data to better characterize these areas could be highly misleading. As depicted in Figure 4-4 and Figure 4-5, concentrations of PCBs in discrete samples collected within a few feet of each other ("inter-sample" variability) as well as concentrations of PCBs reported within individual samples ("intra-sample" variability) could vary by more than an order of magnitude (HDOH, 2015). The variability was spatially random and unrelated to larger-scale trends.

Figure 4-6. Photomicrograph of Possible PCB-Infused Nugget of Silty Soil
Interpreted to represent a remnant drop of waste transformer oil that sank into the soil (HDOH, 2015).

This variability increases as the scale of measurement decreases. Microscopic evaluation identified what appear to be "fossilized" drops of PCB-infused transformer oil in soil from Study Site C (Figure 4-6; HDOH, 2015). Although not directly tested as part of the study, it is conceivable that the concentration of PCBs in the nuggets could approach the original porosity of the soil following biodegradation of the mineral oil carrier, or several tens of percent.

Similar nugget effects for munitions, lead paint and other contaminants have been documented for soil (see ITRC, 2012). Figure 4-7 depicts a photomicrograph of arsenic-contaminated soil from Hawai‘i. Electron microprobe analysis of the soil indicates that arsenic is concentrated in micrometer-scale "nuggets" of iron hydroxide randomly dispersed within the soil. The concentration of arsenic within the iron hydroxide nuggets is orders of magnitude greater than in the surrounding soil matrix (Cutler et al., 2006, 2011).

Figure 4-7. Arsenic-Infused Nuggets of Iron-Hydroxide in Volcanic Soil
Soil impacted by spraying of arsenic-based pesticides (photo courtesy of William Cutler).

4.1.2 IMPLICATIONS OF RANDOM, SMALL-SCALE VARIABILITY

The implications of ubiquitous random contaminant concentration variability in soil at the scale of a traditional discrete sample are significant. Discrete sampling methods are based on the premise that an individual sample can be assumed to represent the immediately surrounding area and that variability between individual samples is predictable and reflective of larger-scale trends of interest:

The PCB level is assumed to be uniform within [a contamination zone/spill area] and zero outside it (USEPA, 1985);
To apply this [discrete sampling] method… [it must be assumed that] any sample located within the contaminated zone will identify the contamination (USEPA, 1987);
When there is little distance between points it is expected that there will be little variability (in contaminant concentrations) between points (USEPA, 1989b).

The mass of soil to be collected as a discrete sample only need meet the mass required by the laboratory for analysis, including quality control (default 100 grams per sample recommended; USEPA, 1987). The concept of "data quality" was then shifted to the laboratory with the main source of error presumed to be associated with analytical error.

As discussed in the HDOH field study reports, these critical and ultimately erroneous assumptions were not evaluated in sufficient detail in the field or in the laboratory prior to publication of these and other guidance documents. Decision making error based on the use of discrete sample data is high and even unavoidable in several critical stages of site investigation, including (HDOH 2015b):

  • Comparison of individual data points to soil action (or screening) levels;
  • Estimation of the lateral and vertical extent of contamination;
  • Preparation of isoconcentration maps;
  • Design of remedial actions for removal of contaminated soil;
  • Estimation of contaminant mass for in situ treatment;
  • Estimation of mean contaminant concentration for use in a risk assessment.

Comparison of individual, discrete sample points to risk-based action levels can be highly unreliable. As documented in the HDOH field study, it is inevitable that concentrations will at some point vary both above and below the target action level. This will result in a high risk of "false negatives" and the potential that contamination posing a significant risk to human health and the environment might go undetected (see HDOH 2015b). Indeed, this is the likely cause of large contaminant concentration variations for some co-located discrete samples, and of "failed" confirmation samples when discrete soil data are used to guide remedial actions.
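The risk of a "false negative" from a single discrete sample can be illustrated with a short simulation. The sketch below is hypothetical and does not use data from the HDOH field study. It assumes that concentrations at the scale of a discrete sample follow a lognormal distribution, with an arbitrary true mean of 2 mg/kg, an arbitrary action level of 1 mg/kg and an arbitrary spread parameter (sigma = 1.5). The function names are illustrative.

```python
import math
import random


def lognormal_mu(true_mean, sigma):
    """Log-scale location giving a lognormal with the requested arithmetic mean."""
    return math.log(true_mean) - sigma ** 2 / 2.0


def simulated_false_negative_rate(true_mean, action_level, sigma,
                                  n_samples=100_000, seed=1):
    """Fraction of simulated discrete samples that fall below the action level."""
    rng = random.Random(seed)
    mu = lognormal_mu(true_mean, sigma)
    below = sum(
        1 for _ in range(n_samples)
        if rng.lognormvariate(mu, sigma) < action_level
    )
    return below / n_samples


def analytical_false_negative_rate(true_mean, action_level, sigma):
    """Same quantity from the lognormal cumulative distribution function."""
    z = (math.log(action_level) - lognormal_mu(true_mean, sigma)) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# HYPOTHETICAL inputs: true DU mean is twice the action level.
rate = simulated_false_negative_rate(true_mean=2.0, action_level=1.0, sigma=1.5)
exact = analytical_false_negative_rate(true_mean=2.0, action_level=1.0, sigma=1.5)
print(f"Simulated fraction of samples below the action level: {rate:.3f}")
print(f"Analytical fraction:                                   {exact:.3f}")
```

Under these assumed inputs, roughly six in ten discrete samples would fall below the action level even though the true mean for the area is twice that level. The fraction depends entirely on the assumed distribution and should not be read as a site-specific estimate.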

Both the HDOH Environmental Action Levels (EALs; HDOH 2016) and the USEPA Regional Screening Levels (RSLs; USEPA, 2014) are intended for comparison to the mean concentration of a contaminant within a defined exposure or spill area. They are not intended for direct comparison to individual, discrete sample points. This was discussed in early risk assessment guidance but not fully appreciated in field investigation guidance being developed during the same time period (USEPA, 1992b):

For Superfund assessments, the concentration term (C) in the equation [of risk-based screening level models] is an estimate of the arithmetic average concentration for a contaminant based on a set of site sampling results [i.e. for an exposure area].

The unreliability of a single discrete soil sample to approximate mean contaminant concentrations for comparison to screening levels and decision making was similarly recognized but not fully appreciated in early risk assessment guidance (USEPA 1992):

Sampling data from Superfund sites have shown that data sets with fewer than 10 samples per exposure area provide poor estimates of the mean concentration.

This concern about unreliable data includes the use of small numbers of discrete soil samples to estimate the extent of chemical contamination above levels of potential concern.
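The sensitivity of an estimated mean to the number of discrete samples can be explored with a minimal simulation. The sketch below is hypothetical. It assumes a lognormal population with a true mean of 100 mg/kg and sigma = 1.5, draws many independent sample sets of a given size, and reports the 5th and 95th percentiles of the resulting mean estimates. All parameter values and the function name are assumptions chosen for illustration.

```python
import math
import random
import statistics


def percentile_range_of_means(n_samples, n_trials=2000, true_mean=100.0,
                              sigma=1.5, seed=2):
    """Return the 5th and 95th percentiles of the estimated mean.

    Each trial draws n_samples HYPOTHETICAL discrete sample results from a
    lognormal population and averages them.
    """
    rng = random.Random(seed)
    mu = math.log(true_mean) - sigma ** 2 / 2.0
    estimates = []
    for _ in range(n_trials):
        samples = [rng.lognormvariate(mu, sigma) for _ in range(n_samples)]
        estimates.append(statistics.fmean(samples))
    estimates.sort()
    return estimates[n_trials * 5 // 100], estimates[n_trials * 95 // 100]


for n in (3, 10, 30, 50):
    low, high = percentile_range_of_means(n)
    print(f"n = {n:2d}: 90% of estimated means fall between "
          f"{low:6.1f} and {high:6.1f} mg/kg (true mean 100)")
```

The interval narrows as the number of samples in each set increases, which is consistent with the concern quoted above regarding small data sets.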

Random, small-scale variability of contaminant concentrations in soil above and below an action level or geostatistical isoconcentration contour is expressed on maps by seemingly isolated "hot spots" and "cold spots" within a contaminated area (refer to HDOH 2015b). These "spots" are real only in the sense that they reflect the variability (i.e., "noise") of contaminant concentrations in the soil at the scale of the discrete sample tested.

Large clusters of discrete data points consistently above a target level might serve as gross indicators of larger-scale contaminant patterns of interest. Such conclusions should be verified by the designation of DUs and collection of Multi Increment sample data, however, as discussed in the next section.

The implications of random small-scale variability of contaminant distribution and concentrations in soil for investigation of contaminated sites can be summarized as follows:

  • Soil action (screening) levels apply to the mean concentration of a contaminant over a targeted area (e.g., spill area or exposure area), not to individual discrete points within that area (refer to HDOH 2016).
  • The objective of an environmental site investigation of soil is to determine if the mean concentration of a contaminant in a sufficiently large area (and volume) exceeds some critical threshold that could indicate a potential risk to human health and the environment.
  • The appropriate area and volume of soil for decision making is determined as part of the Decision Unit designation process (e.g., spill area or exposure area DUs; see Section 3).
  • Determining the range of contaminant concentrations within a DU at some pre-specified small scale (e.g., mass of a typical laboratory subsample) is not practical, necessary, or relevant for the purposes of an Environmental Hazard Evaluation (see Section 13).
  • The mean concentration of contaminants of concern for these areas (and volumes) of soil can be most reliably estimated through the use of Multi Increment sampling methods.

The cause of decision error associated with the use of discrete sample data is ultimately simple – the sample mass collected and tested is too small to overcome random small-scale variability of contaminant concentrations in soil. This fact is both predicted and addressed by sampling theory and the use of Multi Increment sample data to characterize well-thought-out DUs.

4.1.3 USE OF SAMPLING THEORY AND MULTI INCREMENT SAMPLING TO IMPROVE SAMPLE REPRESENTATIVENESS

Sampling theory dictates that the representativeness of a sample is controlled by four primary factors (after Pitard, 1993, 2005, 2009; Minnitt et al., 2007; ITRC 2012; see also US Navy, 2015): 1) Random fluctuations in the distribution of the target analyte in soil ("distributional heterogeneity"), 2) Sample collection methods, 3) Sample processing methods and 4) Analytical error. Decision units and Multi Increment sampling methods are used to minimize and evaluate these potential sources of error. Field sampling and processing error, as well as laboratory subsampling error, are likely to far outweigh error attributable to the analytical method used to test subsamples of soil extracted from bulk samples.

Uncertainty associated with the first factor is referred to as "Fundamental Error." Although Fundamental Error can never be completely eliminated, its effect can be minimized by careful sampling design and processing of samples for analysis. The mass of soil necessary to represent a targeted area can be predicted by sampling theory. Factors include the range and shape of particle sizes present in the sample and the desired precision of the data (e.g., parts-per-hundred versus parts-per-billion).
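The dependence of Fundamental Error on particle size and sample mass can be sketched numerically. The example below assumes a simplified form of Gy's formula that is commonly quoted in the sampling literature, FE² ≈ C·d³/m, where d is the maximum particle size in centimeters, m is the sample mass in grams and C is a lumped constant. The default value C = 22.5 g/cm³ is an assumption that is not taken from this section and may not suit a given soil or contaminant. The function names are illustrative.

```python
import math

# ASSUMPTION: lumped constant (g/cm^3) in a simplified form of Gy's formula.
# It is not specified in this guidance and may differ for a given soil.
DEFAULT_C = 22.5


def fundamental_error(mass_g, d_max_cm, c=DEFAULT_C):
    """Relative Fundamental Error (as a fraction) for a given sample mass."""
    return math.sqrt(c * d_max_cm ** 3 / mass_g)


def mass_for_target_fe(target_fe, d_max_cm, c=DEFAULT_C):
    """Sample mass (grams) needed to reach a target relative Fundamental Error."""
    return c * d_max_cm ** 3 / target_fe ** 2


# Soil sieved to remove particles larger than 2 mm (0.2 cm).
print(f"Mass for 15% FE at 2 mm:   {mass_for_target_fe(0.15, 0.2):.1f} g")
print(f"FE of a 1 g subsample:     {fundamental_error(1.0, 0.2):.0%}")
print(f"Mass for 15% FE at 0.5 cm: {mass_for_target_fe(0.15, 0.5):.0f} g")
```

With these assumptions, a maximum particle size of 2 mm and a target Fundamental Error of 15% correspond to a mass of 8 grams, and a 1 gram subsample of the same material would carry a Fundamental Error of roughly 42%. Because the required mass scales with the cube of the particle size, coarser material requires a much larger mass for the same precision.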

As discussed in the next section, the estimated sample mass required is then collected from a large number of points within the targeted DU area. Each point represents an "increment," with individual increments combined to form a bulk "Multi Increment" sample. Bulk Multi Increment samples are typically air dried and sieved at the laboratory to remove particles larger than 2 mm. To maintain representativeness, the processed sample is then subsampled in the laboratory using a sectorial splitter or by Multi Increment sampling in the same manner as it was collected in the field, and this subsample is tested for target contaminants of concern. A modified approach using the collection of increments in methanol or freezing of individual increments is used for volatile organic compounds. Field and laboratory replicates are used to test the precision of the resulting data.