TGM for the Implementation of the Hawai'i State Contingency Plan
Section 4.3
USE OF DISCRETE SAMPLES

4.3 USE OF DISCRETE SAMPLES

A "discrete sample" refers to the collection of a small mass of soil, typically 100-200g, from a single point within an area targeted for investigation. Discrete samples have traditionally been used to help identify the lateral and vertical extent of contamination. The use of discrete soil sample data is not recommended for final decision making purposes as part of an environmental investigation (HDOH, 2015,b; Brewer et al. 2016; see Subsection 4.1.2). Random, small-scale variability of contaminant distribution and concentration in soil limits the reliability of discrete sample data for estimating the extent of contamination that could pose an unacceptable risk to human health and the environment.

It is also important to note that neither the HDOH Environmental Action Levels for soil (HDOH 2016; refer to Subsection 4.1 and Section 13) nor the USEPA Regional Screening Levels (USEPA, 2014) are intended for direct comparison to individual, discrete sample data points. Action/screening levels for direct exposure, for example, assume random contact with soil throughout the Decision Unit (DU) over many years. Comparison of the mean contaminant concentration in designated Exposure Area DUs to the action level is therefore appropriate (refer to Section 3; see also USEPA, 1987, 2013b). The concentration of a contaminant at any given discrete sample point within a DU, whether above or below an action or screening level, is not relevant to the overall risk posed by contamination for the DU as a whole (see also HDOH 2015b).
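The difference between screening individual data points and screening a DU-wide mean can be illustrated with a minimal Python sketch. The arsenic results, the 20 mg/kg action level and the function name are hypothetical and are used for illustration only. The sketch shows the logic of the comparison; it does not imply that a small number of discrete samples provides a reliable estimate of the mean.

```python
from statistics import mean

def compare_du_mean(concentrations_mg_kg, action_level_mg_kg):
    """Compare the mean concentration for one DU to an action level.

    Returns the DU mean and True if that mean exceeds the action level.
    """
    du_mean = mean(concentrations_mg_kg)
    return du_mean, du_mean > action_level_mg_kg

# Hypothetical arsenic results (mg/kg) from a single exposure-area DU.
results = [4.0, 35.0, 12.0, 9.0, 15.0]
du_mean, exceeds = compare_du_mean(results, action_level_mg_kg=20.0)

# One data point (35 mg/kg) is above the action level, but the
# comparison is made on the DU mean rather than on any single point.
print(f"DU mean = {du_mean:.1f} mg/kg, exceeds action level: {exceeds}")
```

In this hypothetical case a single result above the action level does not, by itself, indicate that the DU as a whole exceeds it.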

Existing discrete sample data and grids of discrete samples can, however, be useful for designation of DUs for a more intensive, Multi Increment sample investigation. For new projects, consider the collection of a large mass of soil from multiple locations around a sample collection point (Figure 4-25 A&B). Such “large-mass” discrete samples will help improve the representativeness of the resulting data for the associated grid point. For example, collect 1-2kg of soil (recommended MI sample mass, minimum 300g; Subsection 4.2.3) from multiple (e.g., 5-10+) points within a few feet of the grid point in order to reduce Fundamental Error and capture random, small-scale variability of contaminant concentrations over short distances (see Subsection 4.1.2). Individual masses of soil should be collected in a similar manner as described for MI increments, including proper shape, depth and mass (Subsection 4.2.5.2). Bulk samples to be screened in the field should be tested multiple times until a representative mean can be determined, for example through use of a portable XRF (Section 8.4.1). Samples submitted to a laboratory for testing should be processed and tested following standard MI procedures to ensure that representative data are obtained, including testing of a minimum 10g mass (Subsection 4.2.6). Note that the latter requirement could negate the cost-benefit of implementing a discrete sample grid approach to screen a site in comparison to the collection of MI samples from reasonably small DUs. If samples are not processed for testing then this limitation should be noted in the report and additional care taken in interpretation of the data.

Figure 4-25 A&B. Collection of large-mass discrete soil samples from multiple locations around a single sampling point in order to improve data representativeness (A: USGS 2016; B: see ERM 2008).

This approach reduces the susceptibility of traditional discrete soil samples to random error and improves the ability to identify larger-scale contaminant patterns of interest. Note that these types of samples are sometimes informally referred to as "composites" in USEPA and other field investigation guidance (e.g., USEPA 1989, USGS 2014, USGS 2016). Use of the term “composite” is discouraged for projects overseen by HDOH, however, due to potential confusion with more formal use of the term to indicate the intentional mixing of soil from what would otherwise be considered separate DUs (refer to Subsection 4.4.11).

Discrete soil sample data can in theory be used to estimate mean contaminant concentrations for a targeted DU area provided that samples are collected in a manner consistent with sampling theory (e.g., proper size, shape, mass, etc.) and the data can be demonstrated to be reproducible. As discussed below, however, this is unlikely to be cost-effective in comparison to the use of Multi Increment sample data to estimate mean contaminant concentrations.

4.3.1 INTERPRETATION AND PRESENTATION OF ISOCONTOUR MAPS

Isocontour maps (e.g., concentration, thickness, etc.) based on discrete sample data should not be used for decision making purposes without adjustment to reflect additional site knowledge and professional judgment. This is due to the unreliability of small-scale patterns and the reduced accuracy of isocontours based on traditional discrete soil (and sediment) sample data as discussed above (HDOH 2015b, Brewer et al. 2016). Specific errors often encountered in unadjusted, isocontour maps include:

  • Artificial "hot spots" and "cold spots" caused by random, small-scale variability of contaminant concentrations at the scale of a discrete sample;
  • Erroneous "zero" isocontours around the perimeter of contaminated areas due to a lack of outward data points;
  • Inherent lack of precision of isocontour placement.

Unrecognized, these errors can lead to a false sense of precision in computer-generated isocontour maps and lead to erroneous decisions regarding the need to continue or halt site investigations or remedial actions (HDOH 2015b; see also Subsection 4.1). This includes calls for remediation of isolated "hot spots" based on single or small numbers of discrete samples and premature termination of site investigations or remedial actions due to false "cold spots" in the discrete sample data.

Isocontour maps should be adjusted to reflect site knowledge and professional judgment not reflected in computer-generated maps. To the knowledge of HDOH, such adjustments are not possible in existing computer programs and must be done by hand. Boundaries between apparent large-scale patterns should be dashed to indicate that their placement is approximate. Small-scale heterogeneity within larger-scale patterns generated by small numbers of discrete sample points should not be presented on final maps included in the report.

For example, Figure 4-26 depicts a nine-acre site formerly used for storing and mixing pesticides. The northern area of the site was known to be heavily contaminated with arsenic based on previous collection of both discrete and Multi Increment samples. The exact area of elevated arsenic was uncertain based on previous testing although the area of the former mixing shed was most suspect. No obvious signs of contamination were recognizable in the field.

A significant number of large-mass, discrete surface soil samples (0-6 inches) were collected from a 50-foot grid across the site (ERM 2008). Each discrete sample was collected from multiple points around each grid point in order to help address random, small-scale heterogeneity and increase data representativeness (see Figure 4-25B). Samples were analyzed using a portable XRF. A subset of samples was analyzed in a laboratory for comparison. As can be seen in the figure, the XRF helped to identify at least one large spill area of arsenic-contaminated soil in the northern part of the site. Smaller clusters of discrete samples with higher reported levels of arsenic might or might not be reflective of actual conditions in the field. False patterns of higher and lower levels of contamination can be produced by samples that are too small to capture and smooth out random heterogeneity of contaminant distribution in soil (see Subsection 4.1; HDOH 2015,b).

Three distinct areas of arsenic contamination are apparent in the figure (see Figure 4-26). The concentration of arsenic in the majority of discrete samples collected from Area A is below a screening level of 20 mg/kg, with occasional "outliers" that exceed this value. Arsenic is randomly above or below 20 mg/kg in any given discrete soil sample collected from Area B. Arsenic is above 20 mg/kg in the majority of discrete samples collected from Area C, with random "outliers" below this value.
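The grouping into three large-scale areas can be sketched in terms of the fraction of discrete results that exceed the screening level. The following Python sketch is illustrative only: the arsenic results are hypothetical, and the 25% and 75% cutoffs are assumptions chosen for the example, not HDOH criteria. In practice the grouping also relies on site history and professional judgment.

```python
def exceedance_fraction(results_mg_kg, screening_level=20.0):
    """Fraction of discrete results above the screening level."""
    above = sum(1 for c in results_mg_kg if c > screening_level)
    return above / len(results_mg_kg)

def classify_area(results_mg_kg, screening_level=20.0, low=0.25, high=0.75):
    """Label a group of grid points by its exceedance fraction.

    The 'low' and 'high' cutoffs are illustrative assumptions only.
    """
    frac = exceedance_fraction(results_mg_kg, screening_level)
    if frac <= low:
        return "A"   # mostly below the level, occasional high outliers
    if frac >= high:
        return "C"   # mostly above the level, occasional low outliers
    return "B"       # results fall on either side of the level

# Hypothetical arsenic results (mg/kg) for three groups of grid points.
area_a = [6, 9, 11, 8, 31, 7, 12, 10]
area_b = [14, 27, 19, 33, 22, 9, 41, 16]
area_c = [85, 140, 62, 18, 210, 97, 75, 120]

for name, data in (("a", area_a), ("b", area_b), ("c", area_c)):
    print(name, exceedance_fraction(data), classify_area(data))
```

The label describes the group of grid points as a whole; the position of any individual exceedance within the group carries little weight.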

Figure 4-26. Unadjusted Isoconcentration Map from Discrete Sample Arsenic Data at a Nine-Acre Former Pesticide Storage Site
Red-shaded areas denote total arsenic concentrations >20 mg/kg. Most contaminated area corresponds to former pesticide mixing area denoted by red circle on 1979 aerial (modified from ERM 2008). Three large-scale areas of arsenic distribution hypothesized (HDOH 2015b): A) Arsenic below 20 mg/kg in majority of discrete sample-size masses of soil; B) Arsenic above and below 20 mg/kg in any given, discrete soil sample and C) Arsenic above 20 mg/kg in majority of discrete sample masses of soil. Small-scale patterns are interpreted to be artifacts of random, small-scale heterogeneity and may or may not be reproducible (see HDOH 2015b).

Figure 4-27. Adjusted Arsenic Isoconcentration Map for a Former Pesticide Storage Site
The adjusted map more accurately reflects the resolution of arsenic distribution in soil across the site that can be reliably extracted from the discrete sample data.

As discussed below, such maps can subsequently be used to help designate Decision Units and carry out a more reliable and higher resolution Multi Increment sample characterization of the site. Preliminary maps such as these could also be used to carry out initial remediation actions, for example removal of soil from the heavily contaminated area, followed up with a DU-Multi Increment investigation to assess the need for additional actions. This assessment requires significant experience and professional judgment on the part of decision makers.

The appearance of seemingly isolated "hot spots" and "cold spots" within larger-scale, distinct areas most likely reflects random, small-scale contaminant distribution and may or may not represent true areas of higher or lower contamination that can be mapped (see Subsection 4.1; HDOH 2015b). If grid points were moved over a few feet and new samples collected and analyzed, then a similar large-scale pattern would appear, but small-scale "hot spots" and "cold spots" within these areas would be located in different places. This type of field error is an artifact of the individual sample being too small to capture and overcome random, small-scale heterogeneity of the contaminant in the soil. Designing remedial actions based on single samples or even small sets of discrete sample data is highly unreliable and is not recommended or acceptable for final decision making purposes.
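The effect of shifting the grid can be illustrated with a simple simulation. The Python sketch below is a hypothetical model, not a reconstruction of the site data: each zone is assigned an assumed median arsenic concentration, and each discrete result is drawn from a lognormal distribution around that median to stand in for random, small-scale heterogeneity. The zone medians, the spread (sigma) and all function names are assumptions made for illustration.

```python
import random

SCREENING_LEVEL = 20.0   # mg/kg

def simulate_survey(zone_medians, points_per_zone, sigma, rng):
    """Simulate one grid survey with one discrete result per grid point."""
    results = {}
    for zone, median in zone_medians.items():
        # lognormvariate(0, sigma) has a median of 1, so each draw
        # scatters around the assumed zone median.
        results[zone] = [median * rng.lognormvariate(0.0, sigma)
                         for _ in range(points_per_zone)]
    return results

def exceedance_by_zone(results):
    """Fraction of grid points above the screening level in each zone."""
    return {zone: sum(c > SCREENING_LEVEL for c in values) / len(values)
            for zone, values in results.items()}

def hot_points(results):
    """Set of (zone, grid index) pairs above the screening level."""
    return {(zone, i)
            for zone, values in results.items()
            for i, c in enumerate(values)
            if c > SCREENING_LEVEL}

rng = random.Random(1)   # fixed seed so the sketch is repeatable
zone_medians = {"A": 5.0, "B": 20.0, "C": 80.0}   # hypothetical, mg/kg

first = simulate_survey(zone_medians, 100, 1.0, rng)
second = simulate_survey(zone_medians, 100, 1.0, rng)   # grid shifted a few feet

print("First survey: ", exceedance_by_zone(first))
print("Second survey:", exceedance_by_zone(second))

# Compare the locations of apparent "hot spots" within zone A only.
hot_a_first = {i for zone, i in hot_points(first) if zone == "A"}
hot_a_second = {i for zone, i in hot_points(second) if zone == "A"}
print("Zone A hot spots found at the same grid index in both surveys:",
      len(hot_a_first & hot_a_second))
```

Under these assumptions the zone-level exceedance fractions are similar between the two surveys, while the grid points flagged as isolated "hot spots" in the low-concentration zone generally differ.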

Large-scale patterns reliably identified by grids of discrete soil samples can, however, be used in conjunction with other available information to designate DUs for the collection of Multi Increment samples. Figure 4-27 presents an adjusted map of arsenic distribution in soil that more accurately reflects the resolution of arsenic distribution across the site that can be extracted from the discrete sample data.

4.3.2 DESIGNATION OF DECISION UNITS

In spite of the limitations noted above, tight grids of discrete sample data utilizing field screening tools can provide useful screening level data to help identify large-scale areas of contamination, and help guide a more thorough DU-MIS investigation (refer to Subsection 4.2). Examples of field screening tools include portable X-Ray Fluorescence (XRF) instruments and immuno-assay tests. Field screening tools need to be reliable for the application employed, and those handling the tools for site investigations should have experience with their use. Additional information on use of field screening methods is provided in Section 8.

Continuing with the example presented above, Figure 4-28 depicts hypothetical DUs designated for the former industrial facility based on a combination of historical information, the results of the discrete soil sample study, proposed redevelopment for one-acre residential lots, and optimization of potential remedial actions (for example only; not included in original report).

Figure 4-28. Example DUs Designated for a Former Pesticide Storage Site
Based on a combination of historical information, the results of the discrete soil sample study, proposed redevelopment for one-acre, residential lots and optimization of potential remedial actions.

One-acre DUs are designated in the lower area of the site, where historical information and discrete sample data suggest minimal contamination (Area A in Figure 4-28). The DUs reflect hypothetical exposure areas for the planned residential redevelopment of the site and the lowest recommended "resolution" for site characterization (see Section 3.4). It is anticipated that remediation will not be required within this area. The DUs designated for Area B in Figure 4-28 are intentionally scaled smaller. This reflects the increased chance that some degree of remediation may be required for this area and a desire to increase the resolution of the data. This is done by reducing the sizes of DUs in order to optimize remediation and minimize potential removal of otherwise clean areas of soil that are inadvertently included with otherwise contaminated areas. This approach is also emphasized in Area C, where both historical information and discrete sample data verify the presence of significant contamination and the need for remedial actions. The use of small DU areas and volumes ensures an adequate resolution of data for preparation of the most cost-effective remedial action plan possible. Refer to Section 3.4 for additional information on DU designation for investigation and remedial purposes.

4.3.3 ESTIMATION OF MEAN CONTAMINANT CONCENTRATIONS IN RISK ASSESSMENTS

Discrete soil sample data have traditionally been used to estimate the mean contaminant concentration for targeted exposure areas in environmental site assessments and remedial actions (e.g., USEPA 1987, 2013b). The reliability of this approach was called into question by the HEER Office in 2006, due to the inability to verify the field representativeness of a single data set. Multi Increment sampling methods provide significant advantages for estimation of contaminant means in comparison to discrete sample data, including:

  • Consideration of sampling theory to determine the mass of soil required to collect a representative sample and method of sample collection and analysis;
  • Improved coverage of the targeted area (number of increments collected far greater than typical number of discrete samples);
  • Systematic and standardized approach for sample collection in order to minimize bias in the field (e.g., size, shape and mass of individual increments);
  • Reduced number of samples required for analysis; generally greater statistical precision of replicate samples (e.g., lower RSDs);
  • Samples processed and subsampled at laboratory in order to ensure representative data;
  • Replicate sample data provide additional information on field representativeness of samples and precision of data.

Nonetheless, mean contaminant concentrations for DUs can be estimated using discrete sample data provided that a systematic approach is used to collect and process the samples in accordance with sampling theory, including sample shape and mass (refer to Subsections 4.1 and 4.2), and that the data can be demonstrated to be representative of actual field conditions through evaluation of replicate samples. Such quality control measures in the field are critical to the overall quality and representativeness of the resulting data, and go beyond simple consideration of the number of samples collected and the variance between individual data points. The HEER Office should be contacted to discuss the collection and use of discrete samples in a risk assessment for a specific site.

An evaluation of the representativeness of a discrete sample data set should be carried out in the same manner as for Multi Increment samples (see Subsection 4.2). The accuracy of an estimated mean contaminant concentration for a DU is evaluated in terms of precision, or reproducibility, and bias, or systematic over- or underestimation (ITRC 2012). This is illustrated in Figure 4-29.

Figure 4-29. Four Possible Relationships between Bias and Precision (after ITRC 2012)
The center point of the target is the true mean. The mean estimated from a single set of discrete samples, or a single Multi Increment sample, represents one point on the target. The gray area around the point represents uncertainty in the estimate.

In order for an estimated mean to be accurate, the data set must be both unbiased and precise. Statistical analysis of a single set of discrete sample data only evaluates the precision of the estimated 95% UCL in terms of the variance of the data set provided and the statistical method used to evaluate the data. The number of discrete samples included in a data set can be increased in order to decrease the variance of the estimated mean and provide an acceptable degree of precision.
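As an illustration of this type of statistical precision, the Python sketch below calculates a one-sided 95% UCL on the mean using Student's t distribution, which is only one of several methods in common use. The data are hypothetical, and the critical value is supplied by the caller (2.132 is the standard one-sided 95% value for four degrees of freedom). The function name is an assumption made for the example.

```python
from math import sqrt
from statistics import mean, stdev

def ucl95_student_t(data, t_crit):
    """One-sided 95% UCL on the mean: mean + t * s / sqrt(n).

    t_crit is the one-sided 95% Student's t critical value for
    n - 1 degrees of freedom, taken from a statistical table.
    """
    n = len(data)
    standard_error = stdev(data) / sqrt(n)
    return mean(data) + t_crit * standard_error

# Hypothetical arsenic results (mg/kg) from five discrete samples.
results = [10.0, 12.0, 14.0, 16.0, 18.0]
ucl = ucl95_student_t(results, t_crit=2.132)   # 4 degrees of freedom
print(f"mean = {mean(results):.1f} mg/kg, 95% UCL = {ucl:.2f} mg/kg")
```

The calculation uses only the scatter within the single data set provided. The standard error shrinks as the number of samples increases, but the result carries no information about bias or about whether a second, independent set of samples would give a similar mean.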

Precision calculated in this manner reflects only one aspect of potential error, however. The complete precision of the data set in terms of field representativeness cannot be evaluated from a single set of discrete samples. This can only be evaluated through the collection and comparison of replicate sets of samples, as done for Multi Increment samples (see Subsection 4.2.7; see also ITRC 2012). Complete replicate sets of discrete samples are rarely, if ever, collected to test the quality of the estimated mean.
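Comparison of replicates is commonly summarized as a relative standard deviation (RSD). The minimal Python sketch below assumes that each value is a mean estimated from an independent, complete replicate set of samples collected from the same DU. The replicate values are hypothetical and the function name is an assumption made for illustration.

```python
from statistics import mean, stdev

def relative_standard_deviation(replicate_means):
    """RSD, in percent, of replicate estimates of the DU mean."""
    return 100.0 * stdev(replicate_means) / mean(replicate_means)

# Hypothetical DU means (mg/kg) from three independent replicate sets.
replicates = [18.0, 20.0, 22.0]
rsd = relative_standard_deviation(replicates)
print(f"RSD = {rsd:.1f}%")
```

A low RSD indicates that the estimated mean is reproducible. It does not, by itself, rule out a bias that is shared by all of the replicate sets.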

Past USEPA guidance has indicated that a minimum of 20 to 30 discrete samples is required to adequately represent contaminant heterogeneity within a targeted area (USEPA, 1992b):

Data sets with 20 to 30 samples provide fairly consistent estimates of the mean (i.e., there is a small difference between the sample mean and the 95 percent UCL).

Replicate Multi Increment data reviewed by the HEER Office, including a field study carried out in 2014 (HDOH 2015b, b), as well as statistical simulations included in the ITRC ISM document (ITRC 2012), suggest that error in terms of field representativeness could still be substantial when a relatively small number of discrete samples (e.g., < 30) is used to characterize a targeted DU (see also Subsection 4.2.2).
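The general effect of sample number on the error in an estimated mean can be explored with a simple simulation. The Python sketch below is a hypothetical model and does not reproduce the HDOH field study or the ITRC simulations. It assumes that discrete results behave as independent draws from a skewed (lognormal) distribution with an assumed spread (sigma = 1.0), and it ignores spatial structure and laboratory error. All names and parameter values are assumptions made for illustration.

```python
import random

def mean_absolute_relative_error(n_samples, trials, sigma, rng):
    """Average |estimated mean - true mean| / true mean over many surveys.

    Point concentrations are lognormal and scaled to a true mean of 1.0,
    so the result can be read directly as a fraction of the true mean.
    """
    mu = -0.5 * sigma ** 2   # lognormal mean = exp(mu + sigma^2 / 2) = 1.0
    total = 0.0
    for _ in range(trials):
        draws = [rng.lognormvariate(mu, sigma) for _ in range(n_samples)]
        estimate = sum(draws) / n_samples
        total += abs(estimate - 1.0)
    return total / trials

rng = random.Random(2014)   # fixed seed so the sketch is repeatable
errors = {n: mean_absolute_relative_error(n, 1000, 1.0, rng)
          for n in (10, 30, 300)}

for n, err in errors.items():
    print(f"n = {n:3d}: typical error about {100 * err:.0f}% of the true mean")
```

Under these assumptions the typical error decreases as the number of samples increases, and it would be expected to be larger for more heterogeneous (more skewed) contaminant distributions.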

If discrete sampling is proposed for use at a site overseen by the HEER Office, specific approaches to address both precision and bias in the data should be discussed in the SAP (refer to Subsection 4.1). This should include a review of sample collection approaches in terms of sampling theory (e.g., number, size, shape, mass, etc.). Note that the mass of a discrete sample has been primarily dictated by the needs of the laboratory for analysis (default 100 grams per sample recommended; USEPA 1987), rather than sampling theory. This issue should likewise be addressed in the SAP.

"Outlier" discrete sample data points (e.g., comparatively very high concentrations) should not be omitted from a data set in order to force the data set to fit a geostatistical model (USEPA 1989, 2006b, g; see also HDOH 2015b); (Note that this conflicts with recommendations in the USEPA Pro UCL guidance; USEPA 2013b). The true mean is the concentration of the target contaminant that would be reported if the entire DU volume of soil could be tested as a single "sample." "Outliers" simply reflect a high distributional heterogeneity of contaminant concentrations in the soil at the scale a discrete sample and are an artifact of the sampling approach employed. The omission of supposed outlier data points from calculations distorts the representativeness of the data set and generates a technically unsupportable mean. For comparison, MIS increments that fall on small but obviously contaminated areas of a DU would not be excluded from the bulk Multi Increment sample. All discrete sample data must be included in an estimate of the mean, with the precision of the data set as a whole statistically evaluated. If additional sample points are required to improve precision then the samples should be collected using Multi Increment sampling approaches.