## Abstract

Thompson–Howarth error analysis is based on the assumption that measurement error is normally distributed. As a result, geochemical variables that are not normally distributed, such as those containing rare nuggets, cannot be statistically evaluated using Thompson–Howarth error analysis unless the procedure is modified to use the group root mean square (RMS) standard deviations, which makes it independent of the normality assumption. This modification prevents samples exhibiting a positively skewed error distribution, such as that produced by a ‘nugget effect’, from having their measurement errors underestimated (biased low) by conventional Thompson–Howarth error analysis.

A consequence of the duplicate error analysis of ‘nuggety’ samples is that the maximum feasible relative error (of 141.42%, i.e. √2; one standard deviation divided by the mean) may be observed in some samples. Maximum feasible relative errors for *n* replicates are equal to √*n*. Maximum relative errors may be observed because Poisson probabilities of obtaining zero nuggets in one duplicate and one or several nuggets in another are not negligible, and thus very large grade disparities can be obtained in duplicate samples simply due to natural sampling variability. As a result, an abundance of samples exhibiting this maximum relative error is not necessarily an analytical or sample numbering error, but rather an expected consequence of sampling geological materials exhibiting large nugget effects, and may reflect relative measurement error that is larger than the maximum exhibited by duplicate samples. Consequently, if a large number of duplicate samples exhibit relative errors close to the maximum, it is likely that Thompson–Howarth error analysis of duplicate samples will underestimate the actual relative error in the data. As a result, replicate samples (where *n* >2) that have higher maximum relative error limits should be used to ensure that relative error estimates derived from such a Thompson–Howarth error analysis are not biased low (underestimated).

- Thompson–Howarth
- error analysis
- nugget effect
- duplicates
- normal distribution
- Poisson distribution
- root mean square
- equant grain model

## Introduction

Thompson–Howarth error analysis (Thompson & Howarth 1973, 1976*a*,*b*) has long been a standard technique to assess the precision (and thus quality) of geochemical data. Recent legislation concerning the reporting of mineral exploration results has required mining companies to undertake rigorous error analysis of rock assays on a routine basis in order to obtain accurate and precise estimates of the errors on the data used in mineral deposit reserve calculations. Furthermore, determination of measurement errors in grade control assays from low grade, high tonnage ore deposits is now recognized as extremely important because of the small profit margins associated with the mining of ores with concentrations close to the cut-off grade. As a result, extensive efforts are now being made by mining companies to ensure that bias is avoided in sampling and analysis, and that the magnitude of measurement error is known for virtually all assays from ore deposits. Unfortunately, the conventional application of Thompson–Howarth error analysis to assess the measurement errors associated with geochemical variables exhibiting a nugget effect (e.g. Au, Ag, Pt, diamonds) will produce significant bias because the assumption of normally distributed sampling errors that underlies the Thompson–Howarth technique is not upheld when the commodity of interest occurs in rare grains (nuggets) within an ore.

The standard implementation of Thompson–Howarth error analysis involves plotting the means of duplicate determinations against the absolute values of the differences between those determinations on a scatterplot. For a duplicate pair, the absolute difference between the duplicates is proportional to the standard deviation (*s*), with a constant of proportionality equal to the inverse of the square root of two (see Appendix):

*s* = |*x*₁ − *x*₂|/√2 (Equation 1)

As a result, calculation of the absolute difference, and subsequent multiplication by 1/√2 (=0.70711), is merely a short-cut to the calculation of the standard deviation of the duplicate pair.
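This short-cut can be verified numerically. The following minimal Python sketch (illustrative only; the function names are mine, not part of the published method) compares the short-cut against a direct *n*−1 standard deviation calculation:

```python
import math

def pair_sd(x1, x2):
    """Standard deviation of a duplicate pair via the absolute-difference
    short-cut: s = |x1 - x2| / sqrt(2) (Equation 1)."""
    return abs(x1 - x2) / math.sqrt(2)

def direct_sd(x1, x2):
    """Textbook (n-1)-denominator standard deviation of the same pair."""
    m = (x1 + x2) / 2
    # sum of squared deviations divided by (n - 1) = 1 for a pair
    return math.sqrt((x1 - m) ** 2 + (x2 - m) ** 2)
```

For the pair (3.0, 1.0) both routes give |3 − 1|/√2 = √2, confirming that the short-cut is exact, not an approximation.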

Unfortunately, means and standard deviations calculated using duplicate determinations are rather poor estimates of the underlying true value, and so typically exhibit significant scatter when plotted on a Thompson–Howarth scatterplot. This large scatter makes it difficult to establish a quantitative linear relationship between concentration and error directly from the means and standard deviations of duplicate determinations. Thompson and Howarth's method addresses this problem by sorting the duplicate data in ascending order based on the duplicate means, and then calculating the average mean and median standard deviation (or absolute difference) for consecutive groups of 11 duplicate pairs. This ‘grouping’ approach provides average/median group statistics that exhibit far less scatter than the original individual duplicate statistics, and thus can more easily be used to establish a linear relationship describing error as a function of concentration using regression procedures.

Note that the choice of groups of 11 duplicate pairs is arbitrary, and in fact any number of pairs could be used. The more duplicates placed in a group, the better the estimate of mean and standard deviation will be, and the more stable the regression result will become. However, if too many duplicates per group are used, only a small number of group statistics will be available with which to calculate a regression function, degrading the quality of the regression result. As a result, the number of duplicate pairs per group used in Thompson and Howarth's approach should represent a trade-off between the precision of the data used in the regression (which is improved with larger number of duplicates per group), and the amount of data used in the regression (which is increased with a smaller number of duplicates per group).
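The grouping procedure can be sketched as follows (a hedged illustration; the function name and the decision to drop an incomplete final group are my assumptions, not specified by Thompson and Howarth):

```python
import numpy as np

def th_group_stats(x1, x2, group_size=11):
    """Conventional Thompson-Howarth grouping: sort duplicate pairs by
    pair mean, then for each consecutive group of `group_size` pairs
    return the mean of the pair means and the group median absolute
    difference. An incomplete final group is dropped (an assumption)."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    means = (x1 + x2) / 2
    absdiff = np.abs(x1 - x2)
    order = np.argsort(means)                  # sort ascending by pair mean
    means, absdiff = means[order], absdiff[order]
    n = len(means) // group_size
    gm = [means[i * group_size:(i + 1) * group_size].mean() for i in range(n)]
    gd = [np.median(absdiff[i * group_size:(i + 1) * group_size]) for i in range(n)]
    return np.array(gm), np.array(gd)
```

The group means and group median absolute differences returned here are the points that would then be fed to the regression of error against concentration.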

Thompson and Howarth chose to calculate the group median of duplicate absolute differences because the median is less subject to the extraordinary influence of outlying values than the mean. Because of the functional relationship between the absolute difference and standard deviation, the same is true for standard deviations. Extreme results are likely to occur when duplicate determinations are used to estimate the standard deviations (or absolute differences) because the standard deviations and absolute differences are distributed as functions of the χ^{2} distribution:

*s* = σ√(χ^{2}_{*n*−1}/(*n*−1)) (Equation 2)

or

|*x*₁ − *x*₂| = σ√(2χ^{2}₁)

These distributions are positively skewed (Stanley 1998), and the skewness of a standard deviation (or absolute difference) distribution is a function of the number of degrees of freedom (*n*−1, where *n* is the number of replicates used to estimate the standard deviation; Fig. 1). Because of the positive skewness, anomalously high standard deviation (or absolute difference) estimates will be relatively common, whereas anomalously low standard deviation (or absolute difference) estimates will not occur because standard deviations must be positive. This could unduly influence (bias upwards) the resulting regression slope and intercept describing the relationship between concentration and measurement error.

Unfortunately, although the median of 11 absolute differences is more stable than the corresponding mean (it is not affected by anomalously high standard deviation estimates), use of the median absolute difference in the subsequent regression produces a bias. This is because the median of the distribution of standard deviations (or absolute differences; Equation 2) is significantly different from the mean because the distribution is skewed (in this case, the median is significantly less than the mean because the distribution is positively skewed, to varying degrees; Fig. 1). As a result, a correction for this bias is required. In their original implementation, Thompson and Howarth embedded this median-to-mean absolute difference bias correction (equal to 1/0.67449, or 1.48260, derived from the distribution in Equation 2) and the proportionality constant that converts the absolute differences into the standard deviations (equal to 0.70711; Equation 1) together in a single correction factor (equal to 0.70711 × 1.48260 = 1.04836). As a result, to obtain the average standard deviations of groups of 11 duplicate pairs, the group median absolute differences must be multiplied by 1.04836. Stanley (1998) generalized this correction to accommodate standard deviations estimated with more than just duplicate samples (e.g. triplicates, quadruplicates). The resulting combined correction factor converges on 1/√2 because, as *n* increases (replicates of higher number are considered), the distribution of the standard deviation (Equation 2) becomes more symmetric and the median-to-mean correction factor component thus converges on unity.
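The combined factor can be checked by simulation. The sketch below (seed, population mean and sample size are arbitrary choices of mine) draws normally distributed duplicate pairs with a known population standard deviation and shows that the median absolute difference, multiplied by 1.04836, recovers that standard deviation:

```python
import numpy as np

rng = np.random.default_rng(42)
sigma = 5.0          # known population standard deviation
n_pairs = 200_000

# duplicate pairs drawn from the same normal population
x1 = rng.normal(100.0, sigma, n_pairs)
x2 = rng.normal(100.0, sigma, n_pairs)

# median |difference| times the combined factor recovers sigma:
# 0.70711 converts |d| to s, 1.48260 converts the median to the mean
est = np.median(np.abs(x1 - x2)) * 1.04836
```

With 200 000 pairs the estimate lands within a fraction of a percent of the true value of 5.0, illustrating why the single factor 1.04836 suffices when the errors really are normal.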

## Application to geochemical variables exhibiting a nugget effect

The above implementation of Thompson–Howarth error analysis assumes that the underlying duplicate pairs of determinations are normally distributed. The variably skewed distributions of the standard deviation of duplicates (Equation 2) and replicates (Fig. 1) follow directly from this assumption. This allows proper correction of the bias imposed through use of the median standard deviation (or absolute difference) of groups of 11 into the mean standard deviation. As a result, the correction factor for duplicates (1.04836), and the corresponding correction factors for replicates, derive, at least in part, directly from this original assumption of normality (Stanley 2003).

Unfortunately, errors in geochemical variables are not always normally distributed. Rare elements, such as Au, Ag or Pt, and minerals, such as diamonds and other precious gems, are not likely to be normally distributed because these commodities occur in rare grains (nuggets). In fact, Clifton *et al*. (1969), Ingamells (1981) and Stanley (1998) have suggested that ‘nuggety’ ores are more likely to be Poisson distributed, and modelled the sampling errors for these ores as such using an equant grain model. This model assumes that the nuggets in a sample are all of the same size, shape and composition. The equant grain model chosen has an appropriate number of nuggets of an appropriate size (dictated by the concentration of the commodity of interest in the real ore) such that sampling of the equant grain model will produce the same relative error as that observed in the real ore, with its range of nugget sizes. In this way, the equant grain model is calibrated to the real ore.

Theoretical justification of the use of Poisson distributions to model the sampling errors in ores containing rare nuggets is based on the fact that the largest nuggets in an ore will contain by far the largest mass of the commodity of interest. For example, a nugget that is 100 μm in size will contain 1000 times more of the commodity of interest than a nugget that is 10 μm in size. As a result, these largest nuggets control the grade of the ore. Samples of the ore will either contain these large nuggets, or will miss them. As a result, the presence or absence of these largest nuggets in an ore will also largely control the sampling variance of the ore. In fact, because these largest nuggets contain so much more of the commodity of interest than even moderately sized nuggets, the sampling errors associated with ‘nuggety’ ores can be predicted using an equant grain model where all of the nuggets are the same (large) size.

As pointed out by Xie *et al*. (1990) and Royle (2002), examples of nugget grain size distributions with sufficient resolution to allow a quantitative comparison of the sampling characteristics of a real ore with a corresponding equant grain model are rare in the literature. Nevertheless, three nugget size distribution examples are presented below that allow such a comparison, and illustrate how well the Poisson distribution models the sampling distributions of real nugget-bearing ores that exhibit sampling error distributions with different levels of skewness.

A nugget grain size distribution from the Hollinger–McIntyre Au mine in Timmins, Ontario (Wood *et al*. 1986), is presented in Figure 2, along with the calibrated equant grain model for this ore, calculated according to the method described in Stanley (1998) and Clifton *et al*. (1969). This dataset represents an empirical example of how the sampling characteristics of a real sample, with a Au concentration of 8.9 g/tonne and a range of nugget sizes, can be described by a simple equant grain model with an average of 16.29 spherical nuggets per sample with diameters of 52.86 μm. The nugget size in this equant grain model is close to the size of the largest nugget observed in the real ore (it is 68% of its diameter), illustrating that the sampling characteristics of this ore are dominated by the largest nuggets. Thus, it is likely that the Poisson distribution describing the sampling error of the equant grain model will more than adequately describe the frequency distribution of the real ore.

A random sampling simulation was undertaken to test this hypothesis, using the nugget size distribution of the real ore, in a manner similar to that described by Royle (2002). One thousand samples were ‘collected’. For each, Poisson statistics were used to randomly identify how many nuggets of each nugget size were included in each sample, and the overall Au grades of these ‘simulated’ samples were then calculated (assuming spherical nuggets, pure Au with a density of 19.3 g/ml, and samples equal in size to the original ore sample containing the nugget size distribution presented in Fig. 2). The frequency distribution of the Au concentrations of these simulated samples was then determined, and is plotted in Figure 3. Analogously, simulated samples of the equant grain model were also ‘collected’. Because the nuggets in this model are all of the same size, this was a trivial exercise because the frequency distribution of the number of equant nuggets, and thus the Au grade of the equant grain model samples, could be determined directly using Poisson probabilities. This frequency distribution is also plotted in Figure 3. Comparison of the real sample and equant grain model sample frequency distributions reveals a very close correspondence. A χ^{2} test for goodness of fit indicates that these distributions are indistinguishable at the 5% significance level, as the critical value of 60.48 (for 44 degrees of freedom) is not exceeded by the corresponding test statistic of 57.06. As a result, for this example, not only does the equant grain model exhibit the same average Au grade and relative sampling error (and thus the same standard deviation), but the frequency distribution of the real ore is virtually identical to the corresponding Poisson distribution.
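The equant-grain half of such a simulation can be sketched in a few lines. The sketch below is illustrative only: it uses λ and the mean grade from the Hollinger–McIntyre example, and assigns each equant nugget an equal share of the mean grade rather than computing nugget masses from diameter and density (the nugget-size-distribution half would require the full Fig. 2 data, which is not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(7)

lam = 16.29      # mean equant nuggets per sample (Fig. 2 example)
grade = 8.9      # mean Au grade of the ore, g/tonne
grade_per_nugget = grade / lam    # each equant nugget contributes equally

counts = rng.poisson(lam, 100_000)     # nuggets 'collected' per sample
grades = counts * grade_per_nugget     # simulated sample grades, g/tonne

# relative sampling error of a Poisson variable approaches 1/sqrt(lam)
rel_error = grades.std(ddof=1) / grades.mean()
```

The simulated mean grade converges on 8.9 g/tonne and the relative error on 1/√16.29 ≈ 24.8%, matching the relative error quoted for this example below.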

A second, analogous example comparison of the sampling error characteristics of a real nugget-bearing sample and its corresponding equant grain model is presented in Figures 4 and 5. This geochemical sample (‘R6’) was originally investigated by Xie *et al*. (1990), and exhibits a gold concentration of 2.92 g/tonne. The corresponding equant grain model for this sample contains 9.35 spherical, pure Au nuggets with diameters of 51.68 μm, calculated according to the method described in Stanley (1998) and Clifton *et al*. (1969). Again, the diameter of the equant nugget is a large proportion of the largest grain diameter (74%), and so it is likely that the equant grain model will do a faithful job of reproducing the sampling characteristics of this geochemical sample.

An identical set of 1000 sampling simulations was ‘collected’ from this geochemical sample to compare the resulting sampling distribution with the corresponding Poisson distribution of the associated equant grain model. Again, a very high correspondence between the two distributions exists (Fig. 5). A χ^{2} test for goodness of fit confirms the equivalence of these distributions at the 5% significance level, as the critical value of 44.99 (for 31 degrees of freedom) is greater than the corresponding test statistic of 41.88.

A third example, also derived from Xie *et al*. (1990), is presented in Figures 6 and 7. This geochemical sample (‘R3’) has a very large number of small Au nuggets, making it impossible to simulate its sampling characteristics directly because of numerical computation problems resulting from attempting to calculate Poisson probabilities of obtaining more than 200 nuggets. As a result, the frequency distribution of this sample (Fig. 6) has been halved (simulating the collection of a sample of half the mass) so that a comparison of the sampling characteristics of this material and its associated equant grain model can be undertaken. This geochemical sample has a much lower Au concentration of 44 ppb, and its associated equant grain model contains only 1.78 nuggets of 44.51 μm diameter (also calculated using the equations in Clifton *et al*. (1969) and Stanley (1998)). Again, the diameter of the equant nugget is a large proportion of the largest grain (81%), so again, it is likely that this equant grain model will do a faithful job of reproducing the sampling characteristics of this geochemical sample.

As before, a set of 1000 sampling simulations was ‘collected’ from this geochemical sample to compare the resulting sampling distribution with the corresponding Poisson distribution of the associated equant grain model. Again, a high correspondence between the two distributions exists (Fig. 7). Unfortunately, a χ^{2} test for goodness of fit rejects the hypothesis that the simulated sampling distribution is Poisson distributed at the 5% significance level, as the critical value of 15.51 (for seven degrees of freedom) is less than the corresponding test statistic of 169.90. However, because the number of equant nuggets in this equant grain model is low, the Poisson probability of obtaining no equant nuggets (and thus a Au concentration of ‘zero’) in a sample from a model with a mean of 1.78 nuggets is 16.8%. In contrast, because the real geochemical sample contains so many small nuggets, it is virtually impossible for any simulated sample from the real geological material to have no nuggets, and thus there will be no simulated samples with ‘zero’ concentration. As a result, significant misfit of these two distributions can be expected in the first class interval of the histogram (Fig. 7), and the lower bound of this class interval has been adjusted upwards to address this limitation. Nevertheless, the observed χ^{2} goodness of fit test does fail, largely because this adjustment cannot completely accommodate the misfit at low concentrations.

Clearly, these three examples illustrate that a simple equant grain model does a good job of simulating the sampling characteristics of a far more complicated real ore. These examples consider nugget size distributions with a range of numbers of equant grains (i.e. Poisson means of 16.29, 9.35 and 1.78 nuggets per sample), relative errors (24.78%, 32.70% and 74.87%) and skewnesses (0.248, 0.327 and 0.749). Thus, these three case histories represent a reasonable cross-section of possible Poisson distributions that might result from the collection of samples of ‘nuggety’ geological materials.

As a result, sampling errors derived from material containing rare nuggets can be expected to exhibit at least an approximate Poisson distributional form. Thompson–Howarth error analysis of replicate samples from a number of Au-bearing ores by Stanley & Smee (2007), conducted using the appropriate procedures described in this paper for ‘nuggety’ samples, has revealed that sampling errors associated with the initial collection of a sample from several gold deposits are by far the largest component error contributing to the overall measurement error in the samples. As a result, errors introduced during preparation and analysis are both virtually negligible, and will not control the distributional form of the total measurement error. Consequently, like sampling errors, the overall measurement errors in samples from ‘nuggety’ material are likely to be Poisson distributed. More importantly, samples containing nuggets cannot be assumed to have normally distributed total measurement errors.

Unfortunately, Poisson distributions have significantly different forms from the normal distribution, as they exhibit a range of skewnesses and kurtoses, depending on their mean. Examples of six Poisson distributions depicting some of the variations that can occur are presented in Figure 8. These illustrate that as the mean of a Poisson variable increases, its distribution becomes more symmetric. In fact, because the skewness of a Poisson variable with mean λ is 1/√λ, and the kurtosis of a Poisson variable is 3 + 1/λ (Spiegel 1975), as the mean increases, the skewness and kurtosis of a Poisson variable converge on 0 and 3, respectively. These are exactly the values of the skewness and kurtosis for a normal distribution, illustrating that as the mean of a Poisson variable increases, the distribution of the variable converges on a normal distribution (Spiegel 1975). Unfortunately, this correspondence between a Poisson and normal distribution applies only to large means. Given that many nugget-bearing samples exhibit low grades (and thus contain small numbers of nuggets, e.g. Au, Ag, Pt, diamonds), the Poisson distribution that applies is likely highly skewed and thus distinctly non-normal. As a result, application of conventional Thompson–Howarth error analysis to geochemical variables with underlying Poisson distributions will not generally be valid, because the correction factor used to convert the median of 11 absolute differences into the mean standard deviation is based on the assumption that the sampling errors are normally distributed.
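These moment results can be checked exactly from the Poisson probability mass function. The sketch below (my own illustration; `kmax` truncation is an assumption that is safe for the means considered in this paper) computes skewness and kurtosis by direct summation, evaluating the pmf in log-space to avoid overflow at large counts:

```python
import math

def poisson_shape(lam, kmax=400):
    """Exact skewness and kurtosis of a Poisson(lam) variable, computed
    by summing the pmf. The pmf is evaluated in log-space,
    exp(k*ln(lam) - lam - ln(k!)), to avoid floating-point overflow."""
    pmf = [math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
           for k in range(kmax)]
    mean = sum(k * p for k, p in enumerate(pmf))
    var = sum((k - mean) ** 2 * p for k, p in enumerate(pmf))
    skew = sum((k - mean) ** 3 * p for k, p in enumerate(pmf)) / var ** 1.5
    kurt = sum((k - mean) ** 4 * p for k, p in enumerate(pmf)) / var ** 2
    return skew, kurt
```

For the three case-history means (1.78, 9.35 and 16.29 nuggets per sample) the computed skewness matches 1/√λ and the kurtosis matches 3 + 1/λ, the standard Poisson moment results.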

Consequently, the *medians* of 11 standard deviations (or absolute differences) cannot be used in Thompson–Howarth error analysis when geochemical variables with Poisson errors are under evaluation. Instead, the root mean square (RMS) ‘means’ of the 11 standard deviations (or absolute differences) must be calculated. This is because it is the variances, and not the standard deviations, that are additive, and so the RMS standard deviations are unbiased estimates of measurement error. Unfortunately, although unbiased, the RMS standard deviations (or absolute differences) will be subject to instabilities caused by the presence of outlying standard deviations (or absolute differences) in a way similar to that described above for normally distributed errors. Furthermore, the frequency of occurrence of these outlying standard deviation (or absolute difference) estimates is likely to be higher than that for normally distributed sampling errors because the underlying Poisson error distribution is already positively skewed.
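The difference between the group median and the group RMS can be seen in a small simulation (illustrative only; σ, the seed and the number of groups are arbitrary choices of mine). The squared RMS of group standard deviations is an unbiased estimate of the variance, whereas the group median of duplicate standard deviations sits well below σ:

```python
import numpy as np

rng = np.random.default_rng(1)
sigma = 10.0   # known population standard deviation

# duplicate-pair standard deviations: s = |d| / sqrt(2), where the pair
# difference d ~ N(0, 2*sigma^2); arranged as 5000 groups of 11
s = np.abs(rng.normal(0.0, sigma * np.sqrt(2), (5000, 11))) / np.sqrt(2)

group_rms = np.sqrt((s ** 2).mean(axis=1))   # RMS: averages the variances
group_med = np.median(s, axis=1)             # biased low for skewed s
```

Averaged over the 5000 groups, the squared RMS recovers σ² = 100, while the group medians fall well short of the RMS values, which is the bias the RMS modification is designed to remove.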

The instability created by use of the RMS standard deviation (or absolute difference) is a ‘necessary evil’ because there is no way to correct for the bias imposed through use of the median standard deviation of a Poisson distributed variable. This is because the difference between the mean and median of any distribution is a function of a distribution's skewness (the larger the skewness, the larger the difference). The skewness of a Poisson distribution is a function of its sole parameter (λ, the mean, as described above). Thus, in order to understand how large this skewness is, and thus how large the correction factor must be to convert the median standard deviation into the mean standard deviation, the value of λ (the mean number of nuggets, and the square of the standard deviation on the number of nuggets, for that Poisson distribution) must be known. Unfortunately, because the standard deviation of the Poisson distribution is what we seek to find, the appropriate correction factor cannot be determined because it is a function of the unknown parameter (λ) that is to be estimated. Nevertheless, because the Poisson distribution is positively skewed, the Poisson median will be a smaller proportion of the Poisson mean than the analogous statistics for a distribution that is unskewed, such as the normal distribution. Thus, the magnitude of the unknown median-to-mean correction factor is likely to be larger than that used in a conventional Thompson–Howarth analysis (i.e. >1/0.67449=1.48260).

An example of the rather substantial bias imposed by the use of the median of 11 standard deviations in a Thompson–Howarth analysis of duplicate Au concentrations from a low sulphidation epithermal Au deposit that exhibits a nugget effect is presented in Figure 9. Using the median of 11 standard deviations, a regression slope of *c.* 14% median error was obtained. This value is approximately one-third of the unbiased relative error value obtained when the RMS of 11 standard deviations is used (*c.* 42%). If a conventional Thompson–Howarth error analysis were undertaken, with no accommodation for the Poisson-distributed errors, the median-to-mean correction factor would only increase the relative error estimate derived from the median of 11 standard deviations from *c*. 14% to *c.* 21%. This is approximately half of the true, unbiased relative error determined using the RMS of 11 standard deviations. Thus, the observed bias is consistent with the observation that the unknown median-to-mean correction factor for duplicate samples should be >1.48260, above. Clearly, significant bias in the median standard deviation (or absolute difference) can be expected to exist when the underlying error distribution is not normally distributed, and modification of Thompson–Howarth's error analysis approach is necessary to accommodate variables whose errors are not normally distributed.

In the above case history, substantially outlying mean group standard deviations do not exist, and the correlations between the RMS group standard deviations and the mean group means are actually higher than the median group standard deviations and mean group means. As a result, use of the RMS group standard deviation did not cause regression estimation problems, at least in this case. However, it should be noted that not all replicate data sets are likely to behave in this manner, and that ‘safety in numbers’ may be the only way to ensure that outliers do not cause inaccurate estimates of measurement error using this modification of Thompson and Howarth's error analysis procedure. As a result, to address the instability created through use of the RMS group standard deviation, it is probably wise to ensure that any Thompson–Howarth error analysis involving a nugget-borne commodity of interest be undertaken using a large number of replicate samples. In this way, any outlying RMS for a group of 11 standard deviations (derived using an outlying standard deviation estimate) will not exert undue influence on the overall estimate of error, and an acceptable regression result can be obtained.

## Maximum relative error in replicate analyses

Another consequence of the application of Thompson–Howarth error analysis to samples affected by a nugget effect is that very large differences in individual duplicate pairs may exist. These are caused when one sample contains one (or several) nugget(s), and thus exhibits a high concentration, and the other does not (and thus exhibits a concentration of zero, or close to it, at least relative to the high concentration). Although many geoscientists may conclude that the above scenario is rare, obtaining no nuggets in one duplicate sample and one or more in the other is actually not a rare event at all.

For example, consider the Au nugget size distribution from geochemical sample ‘R3’ (Xie *et al*. 1990; Fig. 6). This sample represents the ‘nuggetiest’ material of the three case histories considered, above, and has the lowest mean number of equant nuggets (1.78), and the highest relative error (74.87%), even though it has the smallest equant nugget spherical diameter (44.51 μm). From Poisson probabilities, samples of the equant grain model for this material will contain no nuggets (and thus a virtual concentration of ‘zero’) a significant percentage of the time (16.8%), and at the same time, samples may contain at least five nuggets 3.5% of the time (corresponding to Au grades of at least 0.25 g/tonne Au). As a result, there is a very real possibility that one duplicate sample may report no nuggets and a second duplicate sample will report more than five nuggets from the same material (the probability of this happening is 16.8% × 3.5%=0.6% of the time). Although this probability is small in relative terms, the very large number of duplicates collected as part of a quality assurance/quality control (QA/QC) effort during reserve estimation and mining grade control programmes means that a non-trivial number of highly disparate concentrations can be obtained in duplicate samples of ‘nuggety’ material. Clearly, the chance of collecting no nuggets from our equant grain model can be significant.
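These probabilities follow directly from the Poisson pmf with λ = 1.78 and can be reproduced in a few lines (a minimal sketch; the function name is mine):

```python
import math

lam = 1.78  # mean equant nuggets per sample (sample 'R3')

def pois_pmf(k, lam=lam):
    """Poisson pmf evaluated in log-space to avoid overflow."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))

p_zero = pois_pmf(0)                              # no nuggets: 'zero' grade
p_ge5 = 1 - sum(pois_pmf(k) for k in range(5))    # five or more nuggets
p_disparate = p_zero * p_ge5   # one duplicate empty, the other nugget-rich
```

The computed values reproduce the figures quoted above: P(0) ≈ 16.8%, P(≥5) ≈ 3.5%, and their product ≈ 0.6%.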

In contrast, in samples of the real geological material, it is very unlikely that no nuggets will be collected, because a large number of nuggets exist in the smallest size fractions (<10 μm). However, because these size fractions are small, and thus the mass of Au within these size fractions represents only a small component of the total amount of Au in this sample (4% for the smallest size fraction), even if an average number of nuggets (77) is collected from the smallest size fraction, these nuggets will not significantly increase the grade of a sample above a ‘zero’ concentration (i.e. the 77 nuggets from the smallest size fraction contribute less than 2 ppb to the sample concentration). As a result, even in real geochemical samples, it is not unlikely that duplicate pairs exhibiting a high concentration in one sample and a very low (negligible) concentration in the other will be observed.

Obviously, when one duplicate sample reports a very low concentration because it lacks significant numbers of large nuggets, and the other duplicate sample contains at least one large nugget, a very large relative error will result. These relative errors have a maximum limit. As a result, on a Thompson–Howarth scatterplot, the locations at which duplicate pair means and standard deviations plot are significantly constrained.

For example, consider a pair of Au measurements consisting of an initial concentration equal to *x* (containing one nugget) and a duplicate concentration equal to 0 (containing no nuggets). For this pair of concentrations, the mean concentration equals *x*/2 and the standard deviation equals *x*/√2 (from Equation 1). As a result, the relative error for this set of duplicates, regardless of the magnitude of the concentration *x*, will be:

(*x*/√2)/(*x*/2) = √2 (or 141.42%).

If the second duplicate concentration is not zero, but slightly larger (e.g. equal to the detection limit of the analytical method, say *ϵ*, either because one or more small nuggets were collected or due to analytical error), then the calculated mean concentration will be slightly larger than *x*/2 [=(*x* + ϵ)/2], but the standard deviation will be slightly smaller than *x*/√2 [=(*x* − ϵ)/√2] (from Equation 1). As a result, the relative error for this alternative scenario will be slightly less than √2 [=√2(*x* − ϵ)/(*x* + ϵ)]. A scenario where the initial and duplicate concentrations equal *x* and 0 (or *vice versa*) thus represents the limiting case that produces a maximum relative error.
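This limiting case, and its generalization to *n* replicates (the √*n* bound noted in the abstract), can be checked directly (a minimal sketch; function names are mine):

```python
import math
import statistics

def relative_error(reps):
    """One standard deviation ((n-1)-denominator) divided by the mean."""
    return statistics.stdev(reps) / statistics.mean(reps)

def max_relative_error(n, x=1.0):
    """Relative error when one of n replicates carries the nugget
    (grade x) and the remaining n-1 miss it entirely (grade 0);
    this equals sqrt(n) regardless of x."""
    return relative_error([x] + [0.0] * (n - 1))
```

For any pair (*x*, 0) the relative error is √2 exactly, and for *n* replicates with a single nugget-bearing member it is √*n*, confirming that higher-order replicates raise the maximum feasible relative error.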

The above observation indicates that there will be an upper limit to where duplicate samples can plot on a Thompson– Howarth scatterplot. This limit is defined by a line with a slope of 1.41421 that passes through the origin, and duplicates where one sample contains one (or more) nugget(s) and the other does not will plot on (or close to) this line. Figure 10 illustrates this effect in duplicate samples from a rather ‘nuggety’ anonymous saddle reef gold deposit. Several samples plot along a line with slope of 1.41421, and no sample plots above this line. The seven duplicate samples shown as triangles in Figure 10 are presented in order of increasing mean in Table 1, along with their concentrations, means, standard deviations and relative errors. For each duplicate pair, the lower concentration has been divided into the higher concentration (=*ϵ*/*x*) to demonstrate how close the concentrations of these duplicate pairs approximate *x* and 0. The *ϵ*/*x* values are all very close to 0 in all cases, illustrating that the concentrations of these duplicate pairs are essentially *x* and 0 (or vice versa), and thus exhibit relative errors close to the theoretical maximum.

Geochemists should be aware that duplicate geochemical analyses of materials that exhibit a nugget effect may commonly reach this maximum relative error limit. As a result, duplicate samples plotted on a Thompson–Howarth scatterplot will be constrained by limiting lines on the scatterplot that pass through the origin (with slopes of 0 and 1.41421). An abundance of duplicate samples plotting along a line with a slope of 1.41421 does not necessarily represent some kind of mathematical curiosity or an analytical blunder (i.e. non-random error) within the data, but rather is a natural and expected consequence of ‘nuggety’ samples. Highly disparate duplicate samples are likely to exist in QA/QC datasets from Au deposits, and they should not be excised from the dataset unless some reason, other than their concentration disparity, can be identified that casts suspicion on the quality of the analyses. Examples of such situations are: (1) when a duplicate sample has the same sample number as another sample, suggesting that the duplicate sample has had its sample number transposed; or (2) when a sample with a very high concentration was analysed immediately before a duplicate sample with an anomalously high concentration, suggesting that cross-contamination from the very high grade sample is a distinct possibility.

However, because the lack of correspondence of the duplicate concentrations may have a perfectly reasonable explanation (i.e. the nugget effect), it cannot be assumed that duplicates with large concentration differences represent blunders. As a result, these disparate duplicates cannot be ignored in a QA/QC programme. In fact, the non-scientific assumption that highly disparate replicate results are blunders violates Chauvenet's Principle (Meyer 1975), a long-standing but under-appreciated maxim of scientific philosophy which states that one must have two independent reasons to omit data from consideration. Furthermore, any outlying observation without such additional cause for omission that exhibits a probability of occurrence of less than 1/(2*m*) should be retained with the data (where *m* is the number of observations), and only be flagged as ‘suspect’. As a result, although a set of duplicate analyses may be very different, discarding such a duplicate pair from a QA/QC programme is not generally scientifically warranted unless extenuating circumstances (other reasons) suggest that there are problems with the analyses.
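The 1/(2*m*) retention rule can be sketched as follows. This is a minimal illustration only (the function name and example grades are hypothetical), and it deliberately assumes a normal error model to compute tail probabilities, which is exactly the assumption that fails for ‘nuggety’ data; accordingly, samples satisfying the rule are only flagged, never deleted:

```python
import math

def chauvenet_flags(data):
    """Flag observations whose two-sided normal tail probability
    falls below 1/(2m); flagged samples are retained as 'suspect'."""
    m = len(data)
    mean = sum(data) / m
    sd = math.sqrt(sum((v - mean) ** 2 for v in data) / (m - 1))
    flags = []
    for v in data:
        z = abs(v - mean) / sd
        # Probability of a deviation at least this large under a normal model.
        p = math.erfc(z / math.sqrt(2))
        flags.append(p < 1.0 / (2 * m))
    return flags

grades = [1.2, 1.1, 1.3, 1.2, 1.25, 9.8]  # hypothetical; last value is disparate
print(chauvenet_flags(grades))
```

Under a nugget-effect model, a value like the 9.8 above may be perfectly legitimate, which is why flagging rather than discarding is the appropriate action.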

It should be noted that maximum, limiting relative errors also exist for triplicates, quadruplicates, quintuplicates, etc.; in fact, they exist for any number of replicates. These limiting values are different for each number of replicates, and can be derived using a similar line of reasoning as that for duplicates. Obviously, as with the duplicates, the maximum relative error will occur when the observed results are either 0 or *x*. For triplicates, two scenarios exist: (0, 0, *x*) and (0, *x*, *x*) [(0, 0, 0) and (*x*, *x*, *x*) have no variance and thus cannot produce a maximum relative error]. The means for these two scenarios are *x*/3 and 2*x*/3, respectively, but the standard deviations are identical, both equal to *x*/√3. Because the standard deviations are equal, the largest relative error must occur in the scenario with the smaller mean (0, 0, *x*) (= *x*/3). The relative errors for these two scenarios are thus √3 (= 1.73205) and √3/2 (= 0.86603), respectively. The square root of three (1.73205) is thus the maximum relative error that can be observed in triplicate data.
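The two triplicate scenarios can be checked directly (a minimal sketch; any positive *x* gives the same ratios because *x* cancels out of standard deviation divided by mean):

```python
import math

def relative_error(values):
    """One sample standard deviation divided by the mean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return sd / mean

x = 1.0  # the magnitude of x cancels out of the ratio
print(relative_error([0.0, 0.0, x]))  # sqrt(3)   = 1.73205...
print(relative_error([0.0, x, x]))    # sqrt(3)/2 = 0.86602...
```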

A similar stratagem can be undertaken to determine the maximum relative error for quadruplicate data, where three possible scenarios exist: (0, 0, 0, *x*), (0, 0, *x*, *x*) and (0, *x*, *x*, *x*). The means of these quadruplicate scenarios are *x*/4, 2*x*/4 and 3*x*/4, respectively. Unfortunately, in this case, the standard deviations are not all equal: they are *x*/2, *x*/√3 and *x*/2, respectively. Nevertheless, the relative errors for these three scenarios are 2, 2/√3 (= 1.15470) and 2/3, respectively, and the scenario with the smallest mean produces the largest relative error, in this case equal to two. Maximum relative errors can be derived in like manner for any number of replicates (not shown). These illustrate that: (1) the minimum mean occurs when only one replicate exhibits a concentration of *x* and all others exhibit a concentration of 0; (2) the maximum standard deviation occurs when the replicates exhibit an equal number of 0 and *x* concentrations; but (3) the maximum relative error occurs when only one replicate exhibits a concentration of *x*. In all cases, the maximum relative error for a set of replicate analyses is equal to the square root of the number of replicates (= √*n*). As a result, although maximum relative errors exist for any number of replicate analyses, the magnitudes of these limiting values differ, and are a function of the number of replicates considered.
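The √*n* result can be confirmed by brute force over every (0, …, 0, *x*, …, *x*) scenario (a sketch; the function names are illustrative):

```python
import math

def relative_error(values):
    """One sample standard deviation divided by the mean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return sd / mean

def max_relative_error(n, x=1.0):
    """Largest relative error over scenarios with k replicates equal to x
    and n - k equal to 0, for k = 1 .. n - 1 (the all-0 and all-x cases
    have no variance and are excluded)."""
    return max(relative_error([0.0] * (n - k) + [x] * k) for k in range(1, n))

for n in range(2, 8):
    print(n, max_relative_error(n))  # equals sqrt(n) in every case
```

The maximum is always attained at k = 1, i.e. when a single replicate carries the nugget, consistent with point (3) above.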

From a practical standpoint, if very large measurement errors are estimated using duplicate samples (such as at the Au deposit whose QA/QC data are presented in Fig. 10), such that a significant number of duplicates exhibit relative errors close to the maximum limit (suggesting that the true relative error is actually larger than this limit), then replicate samples (with larger *n*) should be used in Thompson–Howarth error analysis, so that the maximum possible relative error is substantially greater than the true relative error for that data set. This ensures that the true relative error is not underestimated through use of a statistic constrained to a range that cannot accommodate it.
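The practical point can be illustrated with a small Monte Carlo sketch (the mean nugget count per sample, 0.2, is a hypothetical choice representing very ‘nuggety’ material). A single Poisson count with mean λ has a true relative error of 1/√λ, here about 2.24, yet no duplicate pair can report a relative error above √2:

```python
import math
import random

random.seed(1)

def relative_error(values):
    """One sample standard deviation divided by the mean."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (n - 1))
    return sd / mean

def poisson(lam):
    """Draw a Poisson variate via Knuth's algorithm (fine for small means)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= limit:
            return k
        k += 1

lam = 0.2  # hypothetical mean nugget count per sample;
           # true single-sample relative error = 1/sqrt(0.2) = 2.236...

errors = []
for _ in range(5000):
    pair = [poisson(lam), poisson(lam)]
    if sum(pair) > 0:  # relative error is undefined when both counts are 0
        errors.append(relative_error(pair))

print(max(errors))  # capped at sqrt(2), well below the true relative error
```

Because the duplicate-based estimate saturates at √2 while the underlying error is larger, triplicates (capped at √3) or higher-order replicates would be needed here.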

## Conclusions

Conventional Thompson–Howarth error analysis is invalid when applied to samples exhibiting a nugget effect, and thus should not be employed when evaluating Au, Ag, Pt or diamond assays, which have errors that appear to be Poisson-distributed, or any other variable that has errors that are not normally distributed. Instead, a modification to Thompson–Howarth's original approach is necessary to avoid bias. This modification involves calculating the root mean squares (instead of the medians) of 11 absolute differences directly, and then dividing these results by √2 to convert these values into means of 11 standard deviations. In this way, the assumption of normally distributed errors is avoided, and an unbiased estimate of the relationship between measurement error and concentration can be deduced. Failure to follow this alternative method of Thompson–Howarth replicate analysis, when dealing with data subject to non-normal errors exhibiting positive skewness (e.g. due to the nugget effect), will result in a significant underestimation of measurement error.
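The modified procedure can be sketched as follows. This is a minimal illustration under stated assumptions: duplicate pairs are sorted by pair mean and processed in groups of 11, the per-pair standard deviation is |difference|/√2, and the function name is hypothetical:

```python
import math

def modified_thompson_howarth(pairs, group_size=11):
    """For successive groups of duplicate pairs (sorted by pair mean),
    return (group mean concentration, RMS absolute difference / sqrt(2)).
    Dividing the RMS difference by sqrt(2) converts it into an estimate
    of the mean standard deviation for the group, without assuming
    normally distributed errors."""
    pairs = sorted(pairs, key=lambda p: (p[0] + p[1]) / 2.0)
    results = []
    for i in range(0, len(pairs) - group_size + 1, group_size):
        group = pairs[i:i + group_size]
        mean_c = sum((a + b) / 2.0 for a, b in group) / group_size
        rms_diff = math.sqrt(sum((a - b) ** 2 for a, b in group) / group_size)
        results.append((mean_c, rms_diff / math.sqrt(2)))
    return results
```

As a sanity check on the √2 conversion: for a group in which every pair differs by the same amount *d*, the estimate reduces to *d*/√2, which is exactly the standard deviation of a single such pair.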

A second consequence of the application of Thompson–Howarth error analysis to duplicate samples exhibiting a nugget effect is that an upper limit exists on the relative error observed in any duplicate (or replicate) pair. In very ‘nuggety’ samples, this upper limit has a high likelihood of being observed, creating an abundance of replicate determinations that plot along lines corresponding to these limiting relative errors. Samples that have non-Poisson-distributed errors are unlikely to report concentrations that vary from nil to some high concentration (a scenario which can be produced in ‘nuggety’ duplicates when nuggets are either included or excluded in duplicate samples). Thus, only ‘nuggety’ duplicate samples will typically exhibit the maximum theoretical relative error of √2 (141.42%) on Thompson–Howarth error plots. If a significant number of duplicate samples exhibit relative errors close to the upper theoretical limit, it is possible that these duplicate samples underestimate the true measurement error. As a result, measurement errors calculated with duplicate samples may underestimate the true relative error if it is large, and use of a larger number of replicate determinations (which have higher maximum relative error limits) will be necessary to ensure that an accurate estimate of relative error is obtained.

## Acknowledgements

This paper was supported by a Canadian NSERC Discovery Grant, and a financial stipend and logistical support from CRC-LEME (Perth, Western Australia). Two anonymous mining companies kindly provided duplicate assays to illustrate several of the features of Thompson–Howarth error analysis applied to samples exhibiting a nugget effect, and the encouragement, interest and assistance rendered by their staff are greatly appreciated. This paper also benefited from helpful reviews by Dr Robert Garrett, of the Geological Survey of Canada, and an anonymous reviewer.

- © 2006 AAG/The Geological Society of London