## Abstract

Geochemical analysis of geological materials introduces errors at virtually every stage of sample preparation and analysis. Determining the actual analytical error (that error introduced during the analysis of prepared sub-samples of geological materials) is commonly difficult because many forms of analysis destroy the sub-sample. As a result, duplicate analysis cannot be undertaken to measure analytical error directly, and analytical error cannot be isolated from sub-sampling error. However, using replicate analyses of sub-samples of two different masses, and solving a system of three equations in three unknowns, the actual ‘analytical’ error can be deduced and distinguished from the sub-sampling error. This provides a means to estimate sub-sampling and analytical error magnitudes and to determine whether increasing sub-sample mass will result in an efficient reduction in overall error in geochemical analyses. It also provides a means to quantify sub-sampling error in reference materials so that they can be properly used in geochemical analysis to monitor and quantify analytical error.

- sampling error
- analytical error
- instrumental error
- digestion error
- geochemical analysis
- replicate
- duplicate

## INTRODUCTION

The various errors made during the sample preparation and analysis steps involved in determining element concentrations in geological materials are additive as variances (Stanley & Smee 2007; Stanley 2006*a*), and are important to document in a variety of geological applications. Duplicate samples of geological materials are commonly collected and analysed for this purpose, and allow determination (through the calculation of the differences of variances; Stanley & Smee 2007) of the sub-sampling errors that occur at each sampling step of a sample preparation protocol (e.g. initial sampling, sub-sampling after crushing or sieving, sub-sampling after pulverization). Knowing the magnitudes of these errors allows identification of how to most efficiently reduce the overall error, if necessary.

Unfortunately, duplicate samples collected in the last stage of sample preparation cannot generally be used to isolate the sub-sampling error (introduced during collection of the sub-sample of the material to be analysed) from subsequent analytical errors introduced during the dissolution and/or instrumental analysis of the element concentrations. This is because in some analytical procedures (e.g. fire assay), the sub-sample is destroyed during analysis, and thus cannot be analysed in replicate to estimate the overall analytical error. In other analytical procedures, the sub-sample is destroyed during digestion (e.g. fused disk X-ray fluorescence, or total or partial reagent digestion followed by instrumental analysis such as AAS (atomic absorption spectroscopy), ICP-AES (inductively coupled plasma atomic emission spectroscopy), and ICP-MS (inductively coupled plasma mass spectrometry)). Thus, although the resulting solution (or glass) can be analysed in replicate to estimate the instrumental error, the sub-sample cannot be re-digested and so digestion errors cannot be estimated. As a result, the entire analytical error (digestion plus instrumental error) remains unquantified.

In contrast to the above analytical methods, pressed pellet X-ray fluorescence analysis and delayed neutron activation analysis (DNAA) can be used to estimate analytical error because the sub-sample is not destroyed during analysis and replicate measurements to quantify analytical error can be collected. Although this can be routinely done in pressed pellet X-ray fluorescence analysis, determining analytical error is not commonly undertaken in DNAA because re-analysis of the sub-sample can only take place after a sufficient time period to allow virtually complete decay of radio-isotopes that could interfere with the second replicate analysis. Depending on the element to be determined and sample matrix, this time period can be quite substantial, and this serves as an impediment to determination of the analytical errors in delayed neutron activation analysis.

## ESTIMATING ANALYTICAL ERROR

Given the above problem of measuring analytical error in geological materials, the following procedure is proposed that allows deduction of both the sub-sampling error of a geological material of given mass, and the analytical error for the geochemical analysis procedure employed. This method requires the availability of a sufficiently large mass of pulverized material (the larger the error, the larger the mass; cf. Appendix A), which is analysed a number of times using two different sub-sample masses. A proportion of the analyses are undertaken using small sub-samples, and the remaining analyses are undertaken using large sub-samples. The appropriate number of analyses undertaken for each sub-sample mass depends on the ratio of the large and small sub-sample masses (Appendix A), and the magnitudes of the sub-sampling errors in the material analysed and the analysis errors in the analytical method employed.

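Once the replicate analysis results are obtained, estimates of the overall variances for the large and small sub-sample mass replicate determinations ($\sigma_{LO}^{2}$ and $\sigma_{SO}^{2}$) are calculated. Each of these variances is equal to the actual sub-sampling error variance introduced when the appropriate sub-sample mass was collected from the pulp ($\sigma_{ls}^{2}$ and $\sigma_{ss}^{2}$) plus the analytical error variance introduced during the subsequent analysis ($\sigma_{a}^{2}$). For the small sub-samples:

$$\sigma_{SO}^{2} = \sigma_{ss}^{2} + \sigma_{a}^{2} \tag{1}$$

and for the large sub-samples:

$$\sigma_{LO}^{2} = \sigma_{ls}^{2} + \sigma_{a}^{2} \tag{2}$$

These large and small sub-sampling error variances ($\sigma_{ls}^{2}$ and $\sigma_{ss}^{2}$) are unknown, but can be determined by first equating Equations 1 and 2 via the common (but unknown) analytical variance ($\sigma_{a}^{2}$):

$$\sigma_{SO}^{2} - \sigma_{ss}^{2} = \sigma_{LO}^{2} - \sigma_{ls}^{2} \tag{3}$$

Then, by substituting the fundamental relationship that exists between the large and small sub-sampling errors:

$$M_{L}\,\sigma_{ls}^{2} = M_{S}\,\sigma_{ss}^{2} \tag{4}$$

with known sample masses $M_{L}$ and $M_{S}$ (Stanley 2007), into Equation 3, we obtain:

$$\sigma_{SO}^{2} - \sigma_{ss}^{2} = \sigma_{LO}^{2} - \frac{M_{S}}{M_{L}}\,\sigma_{ss}^{2} \tag{5}$$

Re-arranging yields:

$$\sigma_{ss}^{2}\left(1 - \frac{M_{S}}{M_{L}}\right) = \sigma_{SO}^{2} - \sigma_{LO}^{2} \tag{6}$$

and:

$$\sigma_{ss}^{2} = \frac{M_{L}\left(\sigma_{SO}^{2} - \sigma_{LO}^{2}\right)}{M_{L} - M_{S}} \tag{7}$$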

Thus, using the estimates of $\sigma_{LO}^{2}$ and $\sigma_{SO}^{2}$ ($s_{LO}^{2}$ and $s_{SO}^{2}$) and the large and small sub-sample masses ($M_{L}$ and $M_{S}$), an estimate of the small sub-sampling error variance ($s_{ss}^{2}$) can be determined, and then substituted into Equation 1 to allow calculation of an estimate of the analytical error variance ($s_{a}^{2}$). Finally, substitution of this analytical variance estimate into Equation 2 allows calculation of an estimate of the large sub-sampling error variance ($s_{ls}^{2}$). Equations that allow direct calculation of estimates of the three component variances can be derived, after some algebraic manipulation:

$$s_{ss}^{2} = \frac{M_{L}\left(s_{SO}^{2} - s_{LO}^{2}\right)}{M_{L} - M_{S}} \tag{8}$$

$$s_{ls}^{2} = \frac{M_{S}\left(s_{SO}^{2} - s_{LO}^{2}\right)}{M_{L} - M_{S}} \tag{9}$$

and:

$$s_{a}^{2} = \frac{M_{L}\,s_{LO}^{2} - M_{S}\,s_{SO}^{2}}{M_{L} - M_{S}} \tag{10}$$
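The closed-form estimates (Equations 8 to 10) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `decompose_variances` and its argument names are hypothetical, and the formulas assume only the additive variance model and the inverse relationship between sub-sampling variance and sub-sample mass described above. The input values are the QUA-1 variances and mean masses quoted in the Example section.

```python
def decompose_variances(s2_large, s2_small, mass_large, mass_small):
    """Split observed replicate variances into sub-sampling and analytical parts.

    s2_large, s2_small: observed overall variances for large and small sub-samples.
    mass_large, mass_small: sub-sample masses (any unit, as long as both agree).
    Returns (small sub-sampling, large sub-sampling, analytical) variances.
    """
    if mass_large <= mass_small:
        raise ValueError("mass_large must exceed mass_small")
    dm = mass_large - mass_small
    s2_ss = mass_large * (s2_small - s2_large) / dm              # small sub-sampling
    s2_ls = mass_small * (s2_small - s2_large) / dm              # large sub-sampling
    s2_a = (mass_large * s2_large - mass_small * s2_small) / dm  # analytical
    return s2_ss, s2_ls, s2_a


# QUA-1 values quoted in the Example section (ppb^2 and mg).
s2_ss, s2_ls, s2_a = decompose_variances(9.550, 30.365, 400.99, 100.70)
print(f"{s2_ss:.3f} {s2_ls:.3f} {s2_a:.3f}")
```

Note that the analytical term can come out negative when the observed variances are imprecise; the sketch returns it unchanged rather than clipping it, so the caller can detect that case.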

The only assumption associated with the above procedure is that the analytical error is the same for large and small sub-sample masses (i.e. the $\sigma_{a}^{2}$ in Equations 1 and 2 are equal, as required to obtain Equation 3). This assumption is likely not serious because all of the small and large sub-samples contain the same geological material, and thus must have the same (true) element concentration.

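Essentially, the above calculation is equivalent to solving a set of three equations in three unknowns:

$$\begin{aligned} \sigma_{ss}^{2} + \sigma_{a}^{2} &= \sigma_{SO}^{2} \\ \sigma_{ls}^{2} + \sigma_{a}^{2} &= \sigma_{LO}^{2} \\ M_{L}\,\sigma_{ls}^{2} - M_{S}\,\sigma_{ss}^{2} &= 0 \end{aligned} \tag{11}$$

which can be expressed in matrix notation as:

$$\begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ M_{L} & -M_{S} & 0 \end{bmatrix} \begin{bmatrix} \sigma_{ls}^{2} \\ \sigma_{ss}^{2} \\ \sigma_{a}^{2} \end{bmatrix} = \begin{bmatrix} \sigma_{SO}^{2} \\ \sigma_{LO}^{2} \\ 0 \end{bmatrix} \tag{12}$$

Thus, with appropriate matrix operations, the three unknown variances ($\sigma_{ls}^{2}$, $\sigma_{ss}^{2}$, and $\sigma_{a}^{2}$) can be estimated.

## EXAMPLE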

To illustrate the use of the above strategy to decompose and quantify the errors observed in large and small replicate sub-samples into sub-sampling and analytical error components, an Acadia University lithogeochemical reference material (QUA-1) was repeatedly analysed for total Hg. The internal reference material was prepared from the Quadra Sand, an unconsolidated glacial outwash deposit exposed beneath Pt. Grey in Vancouver, British Columbia, Canada. The sand was pulverized to < 63 μm and homogenized in a dry ball mill before analysis. Replicate samples of two masses (100 mg and 400 mg) were analysed for total mercury using a modification of USEPA method 7473: thermal decomposition and gold-coated sand pre-concentration with atomic absorption spectroscopy (USEPA 1998). Analysis was achieved by heating the sample to 900 °C in the presence of a platinum catalyst, and collecting the evolved Hg^{0} gas on a gold sand trap (silica grains coated in gold) as an amalgam. This Hg^{0} was then liberated by heating the gold sand trap to 450–500 °C and measuring the re-evolved Hg^{0} using atomic absorption at a wavelength of 253.7 nm. All analysis was performed using an automated Nippon MA-2000 total mercury analyser.

Ten sub-samples of *c*. 400 mg and 42 sub-samples of *c*. 100 mg were analysed for total mercury. The sub-sample masses and mercury concentration results are presented in Table 1.

Note that these concentrations and all statistics in this communication are presented with what are clearly more significant digits than one would think normally is justified. This has been done to ensure that: (*i*) round-off errors are not introduced that could interfere with the calculations herein, (*ii*) sufficient digits are available to ensure the calculation of estimates of sub-sampling and analytical errors that are as accurate as possible, and (*iii*) the reader can confidently reproduce these calculations. Based on the analytical errors deduced herein (see below), Hg concentrations and sub-sample masses should probably be reported to the tenths of a ppb and tenths of a milligram, and all means and standard deviations should probably be reported to a hundredth of a ppb and a hundredth of a milligram (because these statistics are more stable than the individual analyses themselves).

Using the concentration data presented in Table 1, the resulting large and small sub-sample mean estimates ($\bar{x}_{L}$ and $\bar{x}_{S}$) are 133.464 ppb and 136.043 ppb, respectively, and the corresponding weighted variance estimates ($s_{LO}^{2}$ and $s_{SO}^{2}$) are 9.550 ppb$^{2}$ ($s_{LO}$ = 3.090 ppb) and 30.365 ppb$^{2}$ ($s_{SO}$ = 5.510 ppb). Given these variances, the large and small sub-sample mean concentrations are not statistically significantly different at a significance level of $\alpha$ = 0.05. The average sub-sample masses for these 10 and 42 replicates are 400.99 mg and 100.70 mg, respectively.

Substituting these variance estimates and sub-sample masses into Equation 8, the sub-sampling error variance for the small samples can be estimated ($s_{ss}^{2}$ = 27.795 ppb$^{2}$; $s_{ss}$ = 5.272 ppb). Similar substitution of these variance estimates and sub-sample masses into Equation 9 provides an estimate of the sub-sampling error variance for the large samples ($s_{ls}^{2}$ = 6.980 ppb$^{2}$; $s_{ls}$ = 2.642 ppb). Finally, substitution of these variance estimates and sub-sample masses into Equation 10 provides an estimate of the analytical error variance ($s_{a}^{2}$ = 2.570 ppb$^{2}$; $s_{a}$ = 1.603 ppb). The observed overall replicate variances and estimated sub-sampling variances are plotted using logarithmic and linear scales on Figures 1 and 2 to illustrate the additive relationship between these sub-sampling variances and the analytical error variance, and the inverse relationship between sub-sampling variance and sub-sample mass (Stanley 2007).

Knowledge of the large and small sub-sampling errors and the analytical error allows the geoscientist to determine where within the analysis the most error is introduced. This is important because Stanley & Smee (2007) demonstrated that the most efficient method for reducing overall error in geochemical analysis (or any measurement, for that matter) involves reducing the largest component error.
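The three estimates reported above can also be recovered by solving the underlying system of three equations in three unknowns directly. The sketch below is illustrative only: `det3` and `solve3` are hypothetical helper names, Cramer's rule is used simply because it keeps the example free of external libraries, and the ordering of the unknowns is an arbitrary choice made here.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve3(matrix, rhs):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(matrix)
    if d == 0:
        raise ValueError("singular system (are the two masses equal?)")
    solution = []
    for col in range(3):
        # Replace one column with the right-hand side and take the ratio.
        replaced = [row[:col] + [rhs[r]] + row[col + 1:]
                    for r, row in enumerate(matrix)]
        solution.append(det3(replaced) / d)
    return solution


M_L, M_S = 400.99, 100.70      # mean sub-sample masses, mg
s2_LO, s2_SO = 9.550, 30.365   # observed replicate variances, ppb^2

# Unknowns ordered as (s2_ls, s2_ss, s2_a).
A = [
    [1.0, 0.0, 1.0],   # s2_ls + s2_a = s2_LO
    [0.0, 1.0, 1.0],   # s2_ss + s2_a = s2_SO
    [M_L, -M_S, 0.0],  # M_L * s2_ls - M_S * s2_ss = 0
]
b = [s2_LO, s2_SO, 0.0]
s2_ls, s2_ss, s2_a = solve3(A, b)
print(f"{s2_ls:.3f} {s2_ss:.3f} {s2_a:.3f}")
```

The determinant of this system is $M_{S} - M_{L}$, so the solution is undefined only when the two sub-sample masses are equal.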

In the above example, for the large samples, sub-sampling error introduced 73% of the overall variance observed in the analyses, and analytical error introduced the other 27%. In contrast, for the small samples, sub-sampling error introduced 92% of the overall variance, and analytical error introduced the other 8%. Thus, in both cases, if more precise mercury concentration measurements are necessary, use of larger sub-samples (larger than 400 mg and 100 mg, respectively) would be the most efficient way to reduce the overall variance in these analyses. Figure 1 or Equation 4 could thus be used to graphically or numerically identify the appropriate (larger) sample mass to achieve the desired sampling variance, and thus the resulting overall variance.
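The percentage contributions quoted above follow from dividing each component variance by the observed total. A minimal sketch, using a hypothetical helper `variance_shares` and the component variances reported in this example:

```python
def variance_shares(s2_subsampling, s2_analytical):
    """Fractions of the total variance due to sub-sampling and to analysis."""
    total = s2_subsampling + s2_analytical
    return s2_subsampling / total, s2_analytical / total


# Component variances (ppb^2) reported for QUA-1 in this example.
large_shares = variance_shares(6.980, 2.570)    # c. 400 mg sub-samples
small_shares = variance_shares(27.795, 2.570)   # c. 100 mg sub-samples

for label, (sub, ana) in (("large", large_shares), ("small", small_shares)):
    print(f"{label}: sub-sampling {sub:.0%}, analytical {ana:.0%}")
```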

In fact, for sub-sample masses of 1089.1 mg, sub-sampling error would be reduced to such a level that it equals the estimated analytical error, and the overall replicate error variance observed in sub-samples of that mass ($s_{EO}^{2}$) would be 5.140 ppb$^{2}$ (= 2 × $s_{a}^{2}$; $s_{EO}$ = 2.267 ppb). Thus, for sub-sample masses above *c*. 1 g, improvements (reductions) in the magnitude of analytical error would be more efficient in reducing the overall replicate variance in these analyses than increasing sub-sample mass (Stanley & Smee 2007).

Lastly, knowledge of the sampling error at a given mass allows calculation of a fundamental sampling parameter for that material ($\Psi = M\sigma^{2}$, which is a constant for any sample mass of that material, and is derived directly from Equation 4; Stanley 2007; an alternative to the sampling constant of Ingamells 1974*a*, *b*; Ingamells *et al*. 1972; Ingamells & Switzer 1973). In the example above, this sampling constant ($\Psi$) is 2.799 g×ppb$^{2}$. This exceeds the corresponding sampling constant for a different Acadia University lithogeochemical reference material (CUL-1; a pyritic and carbonaceous shale from the Cultus Lake Formation, Chilliwack, British Columbia, Canada, that was prepared and analysed ten times by the same procedure, and which exhibits a similar average total Hg concentration of 106.491 ppb; Table 2). The sampling constant ($\Psi$) for reference material CUL-1, calculated by subtracting the analytical error variance observed in reference material QUA-1 ($s_{a}^{2}$) from the observed variance in 100 mg sub-samples, is 1.106 g×ppb$^{2}$. The lower sampling constant for this second reference material is not surprising because its average grain size (clay) and bulk hardness (soft because it is dominated by clay minerals) are much less than the average grain size (sand) and bulk hardness (hard because it is dominated by quartz and feldspar) of the QUA-1 reference material. As a result, it can be expected to be more compositionally homogeneous, both before and after pulverization. Comparison of such sampling parameters can thus provide insight into the relative heterogeneity of geological materials (a larger $\Psi$ indicates that the material is more heterogeneous). Such sampling parameters can be provided by manufacturers of analytical reference materials to rigorously document the magnitude of heterogeneity of their materials in a manner that is independent of sample mass.
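Because the sampling constant is the product of sub-sample mass and sub-sampling variance, it can be used to predict the sub-sampling variance at any other mass, and to find the mass at which sub-sampling and analytical variances are equal. The following sketch uses hypothetical function names and the QUA-1 estimates quoted in the text, with masses expressed in grams.

```python
def sampling_constant(mass_g, s2_subsampling):
    """Sampling constant: mass (g) times sub-sampling variance (ppb^2)."""
    return mass_g * s2_subsampling


def subsampling_variance(psi, mass_g):
    """Sub-sampling variance predicted for a given sub-sample mass."""
    return psi / mass_g


def break_even_mass(psi, s2_analytical):
    """Mass (g) at which sub-sampling variance equals analytical variance."""
    return psi / s2_analytical


psi = sampling_constant(0.10070, 27.795)     # from the c. 100 mg sub-samples
mass_eq_g = break_even_mass(psi, 2.570)      # mass where both variances match
s2_total_at_eq = subsampling_variance(psi, mass_eq_g) + 2.570
print(f"psi = {psi:.3f} g*ppb^2, break-even mass = {mass_eq_g * 1000:.1f} mg")
```

The same constant is obtained, within rounding, from the large sub-samples (0.40099 g × 6.980 ppb$^{2}$), which is a useful consistency check on the decomposition.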

## CORRECTING NEGATIVE VARIANCE ESTIMATES

Note that the above procedure may fail if Equation 10 produces a negative result, as variances cannot be negative. This can occur if: (13) or: (14) from Equations 1 and 2. Thus, where the true magnitude of analytical error ($\sigma_{a}^{2}$) is small, if the observed error in small sub-samples ($s_{SO}^{2}$) is randomly over-estimated, and the observed error in large sub-samples ($s_{LO}^{2}$) is randomly under-estimated (i.e. these variance estimates are imprecise), a negative estimate of analytical error can be obtained, in spite of the inverse relationship between sub-sample mass and sampling error (Equation 4). To rectify this problem, additional measurements of the large and small sub-samples will provide more precise estimates of the observed variances in large and small sub-samples, so that adequate precision in these statistics will be capable of resolving the small analytical error (and produce a positive variance estimate).
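The effect of replicate numbers on the chance of a negative estimate can be explored with a small simulation. This is a sketch under stated assumptions, not part of the original study: the true variances and masses are illustrative values chosen to resemble the example, the data are assumed normally distributed (so each sample variance is a scaled chi-square variate), and the function names are hypothetical.

```python
import random


def sample_variance(rng, true_var, n):
    """Draw a sample variance of n normal values: true_var * chi2(n-1) / (n-1)."""
    dof = n - 1
    # A chi-square variate with dof degrees of freedom is Gamma(dof/2, scale=2).
    return true_var * rng.gammavariate(dof / 2.0, 2.0) / dof


def negative_fraction(n_large, n_small, trials=2000, seed=1):
    """Fraction of simulated experiments giving a negative analytical variance."""
    rng = random.Random(seed)
    m_large, m_small = 400.0, 100.0          # assumed sub-sample masses, mg
    var_a, var_ls = 2.57, 6.98               # assumed true variances, ppb^2
    var_ss = var_ls * m_large / m_small      # inverse mass relationship
    negatives = 0
    for _ in range(trials):
        s2_lo = sample_variance(rng, var_ls + var_a, n_large)
        s2_so = sample_variance(rng, var_ss + var_a, n_small)
        s2_a = (m_large * s2_lo - m_small * s2_so) / (m_large - m_small)
        if s2_a < 0:
            negatives += 1
    return negatives / trials


frac_few = negative_fraction(10, 42)      # replicate numbers as in the example
frac_many = negative_fraction(100, 420)   # ten times as many replicates
print(f"negative estimates: {frac_few:.2f} (few) vs {frac_many:.2f} (many)")
```

Under these assumptions, the simulated proportion of negative estimates falls as the number of replicates increases, which is the behaviour the remedy above relies on.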

## DECOMPOSING DIGESTION AND INSTRUMENTAL ERROR

Note that the above procedure applies to scenarios where errors observed in replicate sub-samples can be sub-divided into sub-sampling and analysis errors. If errors can be sub-divided into sub-sampling, digestion, and instrumental errors, such as for fused-disk X-ray fluorescence, or total or partial aqueous reagent digestion followed by instrumental analysis such as AAS, ICP-AES, and ICP-MS, then a slight adjustment to the procedure is required. First, the error variance observed in any set of replicates analysed by a method that involves sub-sampling, digestion and instrumental measurement components is:

$$\sigma_{O}^{2} = \sigma_{s}^{2} + \sigma_{d}^{2} + \sigma_{i}^{2} \tag{15}$$

where $\sigma_{O}^{2}$, $\sigma_{i}^{2}$, $\sigma_{d}^{2}$, and $\sigma_{s}^{2}$ are the observed, instrumental, digestion and sub-sampling variances, respectively. Because the numerical method described herein involves decomposing two component errors, use of this procedure with a three-component analytical scenario requires merely grouping the instrumental and digestion error variances together into an ‘analytical error’ variance ($\sigma_{a'}^{2}$):

$$\sigma_{a'}^{2} = \sigma_{d}^{2} + \sigma_{i}^{2} \tag{16}$$

Then, proceeding as before using $\sigma_{a'}^{2}$ instead of $\sigma_{a}^{2}$ in Equations 1 through 12 allows decomposition of $\sigma_{a'}^{2}$ and $\sigma_{s}^{2}$. Replicate analysis of the digestion solution (in the case of total or partial aqueous reagent digestion followed by instrumental analysis such as AAS, ICP-AES, and ICP-MS), or of the glass (in the case of fused-disk X-ray fluorescence), allows estimation of the instrumental error ($\sigma_{i}^{2}$), and simple subtraction from the deduced analytical error ($\sigma_{a'}^{2}$) allows calculation of the digestion error ($\sigma_{d}^{2}$) by difference. As a result, identification of the largest error associated with such a multi-stage analysis can be achieved and suitable steps for reduction of such error can be most efficiently enacted (Stanley & Smee 2007).
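The final subtraction step can be sketched as follows. The function names are hypothetical and the numerical values are invented purely to illustrate the arithmetic; they are not results from this study.

```python
def digestion_variance(s2_analytical, s2_instrumental):
    """Digestion variance obtained by difference from the combined analytical variance."""
    s2_digestion = s2_analytical - s2_instrumental
    if s2_digestion < 0:
        raise ValueError("instrumental variance exceeds the combined analytical variance")
    return s2_digestion


def largest_component(components):
    """Name of the largest variance component in a {name: variance} mapping."""
    return max(components, key=components.get)


# Hypothetical variances (ppb^2), chosen only to illustrate the calculation.
s2_sub, s2_a_prime, s2_i = 6.0, 2.5, 1.0
s2_d = digestion_variance(s2_a_prime, s2_i)
worst = largest_component(
    {"sub-sampling": s2_sub, "digestion": s2_d, "instrumental": s2_i}
)
print(s2_d, worst)
```

As with the two-component case, a negative difference signals that the variance estimates are too imprecise to resolve the smaller component, so the sketch raises an error rather than returning it.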

## VALIDATION

The element and analytical procedure used in the above example to illustrate how sub-sampling and analytical errors can be decoupled and independently estimated (total Hg analysis via thermal decomposition and gold-coated sand pre-concentration with atomic absorption spectroscopy) was chosen because both solid and liquid samples can be analysed directly using this method. As a result, solid sub-samples (e.g. pulverized rocks) analysed by this method exhibit both sub-sampling and analytical errors, whereas liquid samples (*e.g*., waters) analysed by this method exhibit only analytical errors (on the presumption that the liquid is totally homogeneous and thus cannot exhibit sub-sampling errors). As a result, an independent estimate of analytical error in this analytical procedure can be determined from analyses of duplicate water samples, and this second estimate can be compared with the calculated estimate of analytical error derived from the above procedure to demonstrate that the deduced estimate of analytical error is correct. Such a comparison was undertaken using an analytical error estimate derived from the analysis of 10 replicate aliquots of a water standard prepared at Acadia University to exhibit the same concentration as the pulverized QUA-1 reference material described above (134.815 ppb; the mass-weighted mean of all replicate analyses of large and small sub-samples of QUA-1; Table 1).

The water analysis results are presented in Table 3, and exhibit an average Hg concentration of 134.823 ppb. The standard deviation of these concentrations is a direct estimate of the analytical error of the technique ($s_{AS}$), and is equal to 2.092 ppb ($s_{AS}^{2}$ = 4.378 ppb$^{2}$). This analytical error estimate is only slightly larger than the estimate derived from the numerical procedure described above (1.603 ppb; $s_{a}^{2}$ = 2.570 ppb$^{2}$). An appreciation of whether this difference is significant can be obtained by undertaking a formal inference test of equality of the estimates of the two variances. This requires: (*i*) knowledge of how an observed variance is distributed for a given number of degrees of freedom (this distribution will apply to the analytical error variance observed in the water sub-samples ($s_{AS}^{2}$), and the observed large and small error variances of the solid sub-samples ($s_{LO}^{2}$ and $s_{SO}^{2}$); presented in Appendix B), and (*ii*) a formula that propagates the error distributions in the observed large and small error variances of the solid sub-samples and the uniformly distributed round-off weighing errors on the sub-sample masses through Equation 10 into a distribution for the deduced analytical error (Appendix C). The estimates of analytical error, and the resulting distributions of these estimates, are presented in Figure 3.

The inference test required to assess whether the two analytical error variance estimates are equal has a test statistic (*J*) of: (18)

Although the distribution of the water-sample variance ($s_{AS}^{2}$) is known analytically (Equation B14), the distribution of the deduced analytical variance ($s_{a}^{2}$; Equation C18) is not what one would expect from normally distributed data (cf. Appendix C) and is not known analytically, so the distribution of the test statistic was determined empirically using 25 000 randomly generated realizations of $J$ obtained via a Monte Carlo method.

Note that two estimates of $\sigma_{a}^{2}$ are available for use in Equation 18, and result in different test statistic magnitudes. One test statistic, $J_{1}$ = 0.23 (where the analytical variance = 5.384 ppb$^{2}$), occurs if $\sigma_{a}^{2}$ is estimated by $E(Y_{1})$, where the distribution of $Y_{1}$ is defined by Equation C18. Alternatively, a second test statistic, $J_{2}$ = −0.42 (where the analytical variance = 2.570 ppb$^{2}$), occurs if $\sigma_{a}^{2}$ is estimated by: (19) from Equation 10. These two estimates of $\sigma_{a}^{2}$ are different because: (20) It is not clear which of these two estimates of $\sigma_{a}^{2}$, or their corresponding test statistic values, should be used in this application. As a result, both are presented in Figure 3, and both have been used in the inference test of equality of variances described above.

Figure 4 presents the two observed test statistics ($J_{1}$ = 0.23 and $J_{2}$ = −0.42) and the empirically derived distribution for these test statistic estimates (in grey). Comparison of these results indicates that the analytical error variance estimate derived using the water samples is not significantly different from either estimate of the analytical error derived from the powder replicate analyses. In both cases, the corresponding test statistics are between the two bounding critical values of −1.47 and 2.47 for $\alpha$ = 0.025 on each tail.

As a result: (*i*) the analytical error deduction approach described herein represents an effective means to decouple sub-sampling and analytical error in a geochemical analysis, and (*ii*) thus allows estimation of both analytical error of the geochemical analysis procedure, and a fundamental sampling parameter that describes the overall heterogeneity of the solid material that was sampled. Furthermore, the assumption that analytical error is approximately the same for different sample masses does not appear to be ill-founded, at least for sub-sample masses differing by less than an order of magnitude.

## CONCLUSIONS

Analytical errors in geochemical analysis are difficult to quantify because the traditional method of estimation (replicate sub-sample analyses) is unable to decompose the observed errors into their sub-sampling and analytical error components. Herein, we present a simple mathematical formulation that, with one assumption (that analytical error is the same regardless of sub-sample size), can be used to obtain estimates of sampling error and analytical error. Results can be used to identify the appropriate strategy to reduce the observed errors, or to identify the most appropriate samples for use as standard reference materials to monitor analytical quality.

## Acknowledgments

This research was supported by a Natural Sciences and Engineering Research Council (NSERC) Discovery Grant # 217290 to the first author, a Canada Research Chair Grant # 203477, Foundation for Innovation Grant # 203477, and NSERC Discovery Grant # 341960 to the second author, and an internal Acadia University research grant to the third author. The total Hg analyses presented herein were performed in the biogeochemical laboratories of the K.C. Irving Environmental Science Centre at Acadia University.

## Appendix

### (A) APPROPRIATE NUMBERS OF LARGE AND SMALL REPLICATES FOR ANALYSIS

The ratio of the number of small sub-samples to the number of large sub-samples used in the error decomposition procedure described herein should be approximately equal to the ratio of the large sub-sample mass to the small sub-sample mass. This will ensure that the resulting large and small error variance estimates calculated in this procedure have approximately the same estimation error (standard error on the variance). The following argument justifies the above postulate.

Variances have standard errors that are directly proportional to the magnitude of the variance and inversely proportional to the square root of the number of replicates used to estimate the variance (Stanley 2003*a*):

$$se(s^{2}) = s^{2}\sqrt{\frac{2}{n-1}} \tag{A1}$$

As a result, it takes more samples to estimate a larger variance with a certain level of absolute confidence (standard error) than it does to estimate a smaller variance to the same level of absolute confidence. Assuming that sub-sampling variance predominates over analytical variance in a set of sub-samples analysed in the above application, larger sub-samples will have smaller variances than smaller sub-samples by a factor equal to the inverse ratio of the sub-sample masses (Stanley 2007). Thus, more of the small sub-sample replicates will have to be determined to achieve the same absolute estimation error on the small mass sub-sampling variance as the absolute estimation error on the large mass sub-sampling variance. The ratio of the number of small sub-sample replicates to large sub-sample replicates should thus be equal to the ratio of the large sub-sample mass to the small sub-sample mass.

For example, the ratio of the average masses for sub-samples in the analysis of analytical error herein is approximately four (400.99 mg / 100.70 mg ≈ 3.98; the large sub-samples are approximately four times more massive than the small sub-samples). As a result, approximately four times as many small sub-samples need to be analysed as large sub-samples to obtain the same estimation error of these variances (standard error on the variances), a statistically desirable condition. In this application, the 400 mg sub-samples were analysed 10 times, and the 100 mg sub-samples were analysed 42 times, approximately reflecting the sub-sample mass ratio. The observed variances for the large and small sub-sample masses are $s_{LO}^{2}$ = 9.5503 ± 4.5020 ppb$^{2}$ and $s_{SO}^{2}$ = 30.3609 ± 6.7065 ppb$^{2}$, respectively. The standard errors on these variances, although not exactly the same, are at the very least closer than they would be if an equal number of sub-samples of different masses were analysed. For example, if the first 10 small sub-samples were used in the above analysis instead of all 42, the standard error on the variance (24.5019 ppb$^{2}$) would be ± 11.5503 ppb$^{2}$, which is very different from the standard error on the variance of the 10 large sub-sample analyses (± 4.5020 ppb$^{2}$).
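The standard errors quoted in this appendix can be checked numerically. The sketch below assumes the usual normal-theory expression for the standard error of a sample variance, $s^{2}\sqrt{2/(n-1)}$, which reproduces the values quoted for the 10-replicate cases; the function names are hypothetical.

```python
import math


def variance_standard_error(s2, n):
    """Normal-theory standard error of a sample variance from n replicates."""
    if n < 2:
        raise ValueError("at least two replicates are required")
    return s2 * math.sqrt(2.0 / (n - 1))


def suggested_small_replicates(n_large, mass_large, mass_small):
    """Number of small replicates implied by the mass-ratio rule of thumb."""
    return round(n_large * mass_large / mass_small)


se_large = variance_standard_error(9.5503, 10)       # 10 large sub-samples
se_small_10 = variance_standard_error(24.5019, 10)   # first 10 small sub-samples
n_small = suggested_small_replicates(10, 400.99, 100.70)
print(f"{se_large:.4f} {se_small_10:.4f} {n_small}")
```

For 10 large replicates and the mean masses used here, the rule of thumb suggests about 40 small replicates, close to the 42 actually analysed.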

### (B) DISTRIBUTION OF THE VARIANCE OF A NORMALLY DISTRIBUTED VARIABLE

Let us assume that the observed Hg concentrations in Acadia University lithogeochemical reference material QUA-1 are normally distributed. This assumption is likely well justified for three reasons. Firstly, analytical error, a component of the observed error, is historically assumed to be normally distributed (e.g. Thompson & Howarth 1973, 1976, 1978; Howarth & Thompson 1976; Thompson 1973, 1982, 1988; Fletcher 1981; Stanley & Sinclair 1987; Stanley 2003*a*, *b*, 2006*a*, *b*, *c*; Garrett & Grunsky 2003), as the multitude of thankfully relatively small (not necessarily normally distributed) errors that are introduced during geochemical analysis are mostly independent. Thus, the combination of these errors is likely normally distributed by the Central Limit Theorem (Spiegel 1975; Rice 2006).

Secondly, although sampling error is probably binomially distributed, the relative sampling errors in these analyses are small (4.068 % and 2.331 % for the small and large sub-samples, respectively). This means that the $n_{g}$ magnitudes, the binomial distribution parameters equal to the number of equant grains in these binomial samples, are large (= 1.3975 × 10$^{10}$ and 0.4480 × 10$^{10}$ for the large and small samples, respectively; Stanley 2007). When an $n_{g}$ magnitude is large (say > 1000), the underlying binomial distribution approximates a normal distribution, so sampling error, the other component of the observed error, can be assumed to also be normally distributed.

Thirdly, because the sub-sampling and analytical errors are independent, the sum of these two errors is also normally distributed, again via the Central Limit Theorem (Spiegel 1975; Rice 2006). Thus, the observed error on the Hg concentrations can be assumed to be normally distributed.

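Now, let us define the quantity $T_{L}$, a function of the $X_{L}$'s (the Hg concentrations of the large sub-samples), which are independent and identically $N(\mu_{L}, \sigma_{L}^{2})$ distributed variates, such that:

$$T_{L} = \sum_{i=1}^{n_{L}} \frac{\left(X_{iL} - \bar{X}_{L}\right)^{2}}{\sigma_{L}^{2}} \tag{B1}$$

This quantity is known to be $\chi_{m_{L}}^{2}$ distributed (Spiegel 1975; Rice 2006), which is equivalent to a gamma distribution with the parameters $(n_{L} - 1)/2$ and 1/2:

$$T_{L} \sim \chi_{m_{L}}^{2} \equiv \Gamma\!\left(\frac{m_{L}}{2}, \frac{1}{2}\right) \tag{B2}$$

where $m_{L} = n_{L} - 1$. If $Y \sim \Gamma(\alpha, \lambda)$, then the probability density function (pdf) of $Y$ is:

$$f_{Y}(y) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}\, y^{\alpha - 1} e^{-\lambda y}, \quad y > 0 \tag{B3}$$

If $\alpha$ is a positive integer, then:

$$\Gamma(\alpha) = (\alpha - 1)! \tag{B4}$$

and if $\alpha$ is a positive half-integer (1/2, 3/2, 5/2, …), and $\kappa$ is the numerator of the fraction $\alpha$, then:

$$\Gamma(\alpha) = \frac{\sqrt{\pi}\,(\kappa - 2)!!}{2^{(\kappa - 1)/2}} \tag{B5}$$

where:

$$(\kappa - 2)!! = (\kappa - 2)(\kappa - 4) \cdots (3)(1) \tag{B6}$$

and the double factorial = 1 for negative ($\kappa$ − 2) values. Thus, the pdf of $T_{L}$ is:

$$f_{T_{L}}(t_{L}) = \frac{(1/2)^{m_{L}/2}}{\Gamma(m_{L}/2)}\, t_{L}^{\,m_{L}/2 - 1} e^{-t_{L}/2} \tag{B7}$$

Similarly, an analogous parameter, $T_{S}$, defined by:

$$T_{S} = \sum_{i=1}^{n_{S}} \frac{\left(X_{iS} - \bar{X}_{S}\right)^{2}}{\sigma_{S}^{2}} \tag{B8}$$

is distributed as $\chi_{m_{S}}^{2}$, where $m_{S} = n_{S} - 1$.

Now let us consider another quantity $T_{1}$, the statistical sample variance of the $X_{iL}$'s:

$$T_{1} = s_{LO}^{2} = \frac{1}{n_{L} - 1} \sum_{i=1}^{n_{L}} \left(X_{iL} - \bar{X}_{L}\right)^{2} = \frac{\sigma_{L}^{2}}{m_{L}}\, T_{L} \tag{B9}$$

To determine the distribution of $T_{1}$, a change of variables from $T_{L}$ to $T_{1}$ can be used. This is achieved using the equation:

$$f_{T_{1}}(t_{1}) = f_{T_{L}}(t_{L}) \left| \frac{dT_{1}}{dT_{L}} \right|^{-1} \tag{B10}$$

where the derivative of $T_{1}$ with respect to $T_{L}$ is given by:

$$\frac{dT_{1}}{dT_{L}} = \frac{\sigma_{L}^{2}}{m_{L}} \tag{B11}$$

Substituting Equations B7 and B11 into Equation B10, we obtain the pdf of the sample variance of the $X_{iL}$'s:

$$f_{T_{1}}(t_{1}) = \frac{\left(m_{L} / 2\sigma_{L}^{2}\right)^{m_{L}/2}}{\Gamma(m_{L}/2)}\, t_{1}^{\,m_{L}/2 - 1} \exp\!\left(-\frac{m_{L}\, t_{1}}{2\sigma_{L}^{2}}\right) \tag{B12}$$

That is, $T_{1}$ follows a gamma distribution with parameters $m_{L}/2$ and $m_{L}/(2\sigma_{L}^{2})$. The distributions of $T_{1}$ and two analogous variables, $T_{2}$ and $T_{as}$, defined as:

$$T_{2} = s_{SO}^{2} = \frac{1}{n_{S} - 1} \sum_{i=1}^{n_{S}} \left(X_{iS} - \bar{X}_{S}\right)^{2} \tag{B13}$$

and:

$$T_{as} = s_{AS}^{2} = \frac{1}{n_{as} - 1} \sum_{i=1}^{n_{as}} \left(X_{i,as} - \bar{X}_{as}\right)^{2} \tag{B14}$$

were used to determine the distributions of the observed Hg variances of the large and small mass pulverized samples ($s_{LO}^{2}$ and $s_{SO}^{2}$) used in the calculation of the analytical error (Equation 10), and the observed Hg analytical variance of the solution samples ($s_{AS}^{2}$) presented in Figure 3.

### (C) DISTRIBUTION OF THE DEDUCED ANALYTICAL VARIANCE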

In order to determine the distribution of the deduced analytical variance ( Equation 10), we need to know the distribution of each of the parameter estimators used to estimate *σ*_{a}^{2}. Thus, let us define *T*_{3} and *T*_{4} as the random variables with uniform (round-off error) distributions for measurement errors on the sub-sample masses: (C1) and: (C2) The random variable can be written in terms of *T*_{1}, *T*_{2}, *T*_{3}, and *T*_{4} as follows: (C3) To determine the distribution of , we first use a linear transformation from to , where: (C4) Since *T*_{i}'s are independent random variables, the joint pdf of is given by: (C5) By re-writing *T*_{i}'s in terms of *Y*_{i}'s: (C6) the joint pdf of *Y*_{i}'s can be obtained by using a change of variables → : (C7) where: (C8) and: (C9) Therefore: (C10) if (*y*_{1}, *y*_{2}, *y*_{3}, *y*_{4}) *ε* *R* and zero otherwise, where *R* is the domain of such that is positive. Thus, the pdf of *Y*_{1} can be obtained by integrating out *Y*_{2}, *Y*_{3}, and *Y*_{4} from : (C11) The limits on *Y*_{2}, *Y*_{3}, and *Y*_{4} can be determined from the prior information on . Since *Y*_{1} = is a variance, *Y*_{1} should be a non-negative number. That is: (C12) Furthermore, since , and and *T*_{4} must satisfy *T*_{3} − *T*_{4} > 0. Similarly, since and *T*_{1} and *T*_{2} should satisfy *T*_{1} − *T*_{2} < 0. As a result: (C13) In addition, *T*_{3} and *T*_{4} are themselves positive quantities (sub-sample masses). Therefore, *T*_{3} > 0, *T*_{4} > 0 and *T*_{3} - *T*_{4} > 0, altogether with Y_{2} < 0, implies that: (C14) In summary, the domain *R* can be written as: (C15) where: (C16) and: (C17) Unfortunately, the triple integral does not have an analytical solution. Thus, the pdf of *Y*_{1} = must be solved numerically using the domain defined by *R*: (C18) This distribution is presented in Figure 3, along with the distribution of , as defined in Appendix B, for comparison.
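Since the distribution of the deduced analytical variance has to be obtained numerically, a Monte Carlo approximation offers a simple cross-check. The sketch below is not the authors' numerical integration. It assumes that the observed variances behave as scaled chi-square (gamma) variates, as in Appendix B, treats the observed QUA-1 variances as if they were the true values, and assumes a round-off half-width of 0.005 mg on the masses. All names are hypothetical.

```python
import random


def draw_sample_variance(rng, true_var, n):
    """Sample variance of n normal values: true_var * chi2(n-1) / (n-1)."""
    dof = n - 1
    return true_var * rng.gammavariate(dof / 2.0, 2.0) / dof


def simulate_deduced_variance(trials=5000, seed=7, half_width=0.005):
    """Monte Carlo draws of the deduced analytical variance (ppb^2)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(trials):
        t1 = draw_sample_variance(rng, 9.550, 10)            # large-mass variance
        t2 = draw_sample_variance(rng, 30.365, 42)           # small-mass variance
        t3 = 400.99 + rng.uniform(-half_width, half_width)   # large mass, mg
        t4 = 100.70 + rng.uniform(-half_width, half_width)   # small mass, mg
        draws.append((t3 * t1 - t4 * t2) / (t3 - t4))
    return draws


draws = simulate_deduced_variance()
positive = [y for y in draws if y > 0]       # a variance must be non-negative
mean_all = sum(draws) / len(draws)
mean_positive = sum(positive) / len(positive)
share_positive = len(positive) / len(draws)
print(f"mean (all) = {mean_all:.2f}, mean (positive only) = {mean_positive:.2f}")
```

Restricting the draws to positive values, as the domain $R$ does, necessarily raises the mean relative to the unrestricted estimate, which is consistent with the expectation of $Y_{1}$ differing from the direct estimate obtained from Equation 10.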

- © 2010 AAG/Geological Society of London