The IMSL_ANOVA1 function analyzes a one-way classification model.

This routine requires an IDL Advanced Math and Stats license. For more information, contact your sales or technical support representative.

The IMSL_ANOVA1 function performs an analysis of variance of responses from a one-way classification design. The model is:

$$y_{ij} = \mu_i + e_{ij}, \qquad i = 1, 2, \ldots, k; \quad j = 1, 2, \ldots, n_i$$

where the observed value $y_{ij}$ is the j-th response in the i-th group, $\mu_i$ denotes the population mean for the i-th group, and the errors $e_{ij}$ are independently and identically distributed normal with mean 0 and variance $\sigma^2$. The IMSL_ANOVA1 function requires the observed responses $y_{ij}$ as input in a single vector y, with the responses in each group occupying contiguous locations. The analysis of variance table is computed along with the group sample means and standard deviations. A discussion of formulas and interpretations for the one-way analysis of variance problem appears in most statistics texts, e.g., Snedecor and Cochran (1967, Chapter 10).

The IMSL_ANOVA1 function also computes simultaneous confidence intervals on all pairwise comparisons of the k means $\mu_1, \mu_2, \ldots, \mu_k$ in the one-way analysis of variance model. Any of several methods can be chosen. A good review of these methods is given by Stoline (1981). The methods are also discussed in many statistics texts, e.g., Kirk (1982, pp. 114–127).

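Let $s^2$ be the estimated variance of a single observation. Let $\nu$ be the degrees of freedom associated with $s^2$. Let $\alpha = 1 - \mathrm{CONFIDENCE}/100$, where CONFIDENCE is the confidence level in percent.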

The methods are summarized as follows:

Tukey Method: The Tukey method gives the narrowest simultaneous confidence intervals for all pairwise differences of means $\mu_i - \mu_j$ in balanced ($n_1 = n_2 = \ldots = n_k = n$) one-way designs. The method is exact and uses the Studentized range distribution. The formula for the difference $\mu_i - \mu_j$ is given by the following:

$$\bar{y}_i - \bar{y}_j \pm q_{1-\alpha;k,\nu} \sqrt{\frac{s^2}{n}}$$

where $\bar{y}_i$ is the sample mean of the i-th group and $q_{1-\alpha;k,\nu}$ is the $(1 - \alpha)100$ percentage point of the Studentized range distribution with parameters k and $\nu$.

Tukey-Kramer Method: The Tukey-Kramer method is an approximate extension of the Tukey method for the unbalanced case. (The method simplifies to the Tukey method for the balanced case.) The method always produces confidence intervals narrower than the Dunn-Sidák and Bonferroni methods. Hayter (1984) proved that the method is conservative, i.e., the method guarantees a confidence level of at least $(1 - \alpha)100$ percent. Hayter's proof gave further support to earlier recommendations for its use (Stoline 1981). (Methods that are currently better are restricted to special cases and only offer improvement in severely unbalanced cases; see, for example, Spurrier and Isham 1985.) The formula for the difference $\mu_i - \mu_j$ is given by the following:

$$\bar{y}_i - \bar{y}_j \pm q_{1-\alpha;k,\nu} \sqrt{\frac{s^2}{2n_i} + \frac{s^2}{2n_j}}$$

Dunn-Sidák Method: The Dunn-Sidák method is a conservative method. The method gives wider intervals than the Tukey-Kramer method. (For large $\nu$ and small $\alpha$ and k, the difference is only slight.) The method is slightly better than the Bonferroni method and is based on an improved Bonferroni (multiplicative) inequality (Miller 1980, pp. 101, 254–255). The method uses the t distribution (see IMSL_TCDF). The formula for the difference $\mu_i - \mu_j$ is given by the following:

$$\bar{y}_i - \bar{y}_j \pm t_{\frac{1}{2} + \frac{1}{2}(1-\alpha)^{1/k^*};\nu} \sqrt{\frac{s^2}{n_i} + \frac{s^2}{n_j}}$$

where $k^* = k(k-1)/2$ is the number of pairwise comparisons and $t_{\phi;\nu}$ is the $100\phi$ percentage point of the t distribution with $\nu$ degrees of freedom.

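Bonferroni Method: The Bonferroni method is a conservative method based on the Bonferroni (additive) inequality (Miller 1980, p. 8). The method uses the t distribution. The formula for the difference $\mu_i - \mu_j$ is given by the following:

$$\bar{y}_i - \bar{y}_j \pm t_{1-\frac{\alpha}{2k^*};\nu} \sqrt{\frac{s^2}{n_i} + \frac{s^2}{n_j}}$$

where $k^* = k(k-1)/2$ is the number of pairwise comparisons.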
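The relationship between the Bonferroni and Dunn-Sidák adjustments can be made concrete with a small calculation. The following is a Python sketch (not IDL, and it does not call IMSL_ANOVA1) of the t-distribution probability level each method would use for k* = k(k – 1)/2 pairwise comparisons. It assumes the usual textbook forms, 1 – α/(2k*) for Bonferroni and 1/2 + (1/2)(1 – α)^(1/k*) for Dunn-Sidák; the function names are hypothetical.

```python
def pairwise_count(k):
    """Number of pairwise comparisons among k group means."""
    return k * (k - 1) // 2


def bonferroni_level(alpha, k):
    # Additive inequality: alpha is split evenly over k* two-sided intervals.
    return 1.0 - alpha / (2.0 * pairwise_count(k))


def dunn_sidak_level(alpha, k):
    # Multiplicative inequality: each interval gets coverage (1 - alpha)**(1/k*).
    return 0.5 + 0.5 * (1.0 - alpha) ** (1.0 / pairwise_count(k))


alpha, k = 0.05, 5   # 95 percent confidence, five groups
print("comparisons:", pairwise_count(k))
print("Bonferroni level:", bonferroni_level(alpha, k))
print("Dunn-Sidak level:", dunn_sidak_level(alpha, k))
```

For five groups at 95 percent confidence, the Dunn-Sidák level comes out slightly below the Bonferroni level, which corresponds to a slightly smaller t percentage point and therefore slightly narrower intervals.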

Scheffé Method: The Scheffé method is an overly conservative method for simultaneous confidence intervals on pairwise differences of means. The method is applicable for simultaneous confidence intervals on all contrasts, i.e., all linear combinations:

$$\sum_{i=1}^{k} c_i \mu_i$$

where the following is true:

$$\sum_{i=1}^{k} c_i = 0$$

This method can be recommended here only if a large number of confidence intervals on contrasts, in addition to the pairwise differences of means, are to be constructed. The method uses the F distribution (see IMSL_FCDF). The formula for the difference $\mu_i - \mu_j$ is given by the following:

$$\bar{y}_i - \bar{y}_j \pm \sqrt{(k-1)\, F_{1-\alpha;k-1,\nu} \left( \frac{s^2}{n_i} + \frac{s^2}{n_j} \right)}$$

where $F_{1-\alpha;k-1,\nu}$ is the $(1 - \alpha)100$ percentage point of the F distribution with k – 1 and $\nu$ degrees of freedom.

One-at-a-Time t Method (Fisher's LSD): The One-at-a-Time t method is appropriate for constructing a single confidence interval. The confidence percentage input is appropriate for one interval at a time. The method has been used widely in conjunction with the overall test of the null hypothesis $\mu_1 = \mu_2 = \ldots = \mu_k$ by the use of the F statistic. Fisher's LSD (least significant difference) test is a two-stage test that proceeds to make pairwise comparisons of means only if the overall F test is significant. Milliken and Johnson (1984, p. 31) recommend LSD comparisons after a significant F only if the number of comparisons is small and the comparisons were planned prior to the analysis. If many unplanned comparisons are made, they recommend Scheffé's method. If the F test is not significant, a few planned comparisons for differences in means can still be performed by using the Tukey, Tukey-Kramer, Dunn-Sidák, or Bonferroni method. Because the F test is not significant, Scheffé's method does not yield any significant differences. The formula for the difference $\mu_i - \mu_j$ is given by the following:

$$\bar{y}_i - \bar{y}_j \pm t_{1-\frac{\alpha}{2};\nu} \sqrt{\frac{s^2}{n_i} + \frac{s^2}{n_j}}$$

Examples


Example 1

This example computes a one-way analysis of variance for data discussed by Searle (1971, Table 5.1, pp. 165–179). The responses are plant weights for six plants of three different types shown in the following table: three normal, two off-types, and one aberrant.

Normal    Off-Type    Aberrant
101       84          32
105       88
94

n = [3,2,1]
y = [101.0, 105.0, 94.0, 84.0, 88.0, 32.0]
PRINT, 'p-value = ', IMSL_ANOVA1(n, y)

IDL prints:

p-value = 0.00276887
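The printed p-value can be cross-checked by hand. The Python sketch below (not IDL, and independent of IMSL_ANOVA1) recomputes the F statistic from the same data and then uses the closed-form upper tail of the F distribution that holds only when the numerator has 2 degrees of freedom, as it does here with three groups. The function names are hypothetical.

```python
def one_way_f(counts, y):
    """Return (F, df_among, df_within) for a flat response vector grouped by counts."""
    groups, start = [], 0
    for c in counts:
        groups.append(y[start:start + c])
        start += c
    grand_mean = sum(y) / len(y)
    means = [sum(g) / len(g) for g in groups]
    ss_among = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_among = len(groups) - 1
    df_within = len(y) - len(groups)
    return (ss_among / df_among) / (ss_within / df_within), df_among, df_within


def f_upper_tail_two_df(f_stat, df_within):
    # P(F > f) for an F distribution with 2 and df_within degrees of freedom.
    # This closed form is NOT valid for other numerator degrees of freedom.
    return (1.0 + 2.0 * f_stat / df_within) ** (-df_within / 2.0)


n = [3, 2, 1]
y = [101.0, 105.0, 94.0, 84.0, 88.0, 32.0]
f_stat, df_among, df_within = one_way_f(n, y)
assert df_among == 2  # the closed form below requires exactly three groups
p_value = f_upper_tail_two_df(f_stat, df_within)
print("F =", f_stat, " p-value =", p_value)
```

The result agrees with the value printed by IDL to within single-precision rounding.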

Example 2: Multiple Comparisons

Simultaneous confidence intervals are generated for the measurements of cold-cranking power for five models of automobile batteries shown in the table below. Nelson (1989, pp. 232–241) provided the data and approach.

Model 1    Model 2    Model 3    Model 4    Model 5
41         42         27         48         28
43         43         26         45         32
42         46         28         51         37
46         38         27         46         25

The Tukey method is chosen for the analysis of pairwise comparisons, with a confidence level of 99 percent. The differences of means and their confidence limits are output. First, a procedure to print out the results is defined.

.RUN
PRO print_results, anova_table, diff_means
  ; Labels for the 15 entries of the analysis of variance table.
  anova_labels = ['df for among groups', $
    'df for within groups', 'total (corrected) df', $
    'ss for among groups', 'ss for within groups', $
    'total (corrected) ss', 'mean square among groups', $
    'mean square within groups', 'F-statistic', $
    'P-value', 'R-squared (in percent)', $
    'adjusted R-squared (in percent)', $
    'est. std of within group error', 'overall mean of y', $
    'coef. of variation (in percent)']
  ; Print the analysis of variance table.
  PRINT, '* *Analysis of Variance * *'
  FOR i = 0, 14 DO PM, anova_labels[i], $
    anova_table[i], FORMAT = '(a40,f20.2)'
  PRINT
  ; Print the differences of means.
  PRINT, '* *Differences of Means * *'
  PRINT, 'groups', 'difference', 'lower limit', 'upper limit'
  PM, diff_means, FORMAT = '(2i3, x, f9.2, 4x, f9.2, 5x, f9.2)'
END

n = [4, 4, 4, 4, 4]
y = [41, 43, 42, 46, 42, 43, 46, 38, 27, 26, 28, 27, $
  48, 45, 51, 46, 28, 32, 37, 25]
; Call IMSL_ANOVA1.
p_value = IMSL_ANOVA1(n, y, Confidence = 99.0, $
  Anova_Table = anova_table, Tukey = diff_means)
; Output the results.
print_results, anova_table, diff_means

IDL prints:

* *Analysis of Variance * *
  df for among groups                    4.00
  df for within groups                  15.00
  total (corrected) df                  19.00
  ss for among groups                 1242.20
  ss for within groups                 150.75
  total (corrected) ss                1392.95
  mean square among groups             310.55
  mean square within groups             10.05
  F-statistic                           30.90
  P-value                                0.00
  R-squared (in percent)                89.18
  adjusted R-squared (in percent)       86.29
  est. std of within group error         3.17
  overall mean of y                     38.05
  coef. of variation (in percent)        8.33
* *Differences of Means * *
groups   difference   lower limit   upper limit
  1  2      0.75      -8.05       9.55
  1  3     16.00       7.20      24.80
  1  4     -4.50     -13.30       4.30
  1  5     12.50       3.70      21.30
  2  3     15.25       6.45      24.05
  2  4     -5.25     -14.05       3.55
  2  5     11.75       2.95      20.55
  3  4    -20.50     -29.30     -11.70
  3  5     -3.50     -12.30       5.30
  4  5     17.00       8.20      25.80
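Because the design is balanced, every interval in the table above has the same half-width. As a rough consistency check, the Python sketch below (not IDL) divides that half-width by the standard error term sqrt(s²/n) of the standard Tukey interval to recover the Studentized range percentage point implied by the output. All inputs are copied from the printed results; the variable names are illustrative.

```python
import math

# Values copied from the printed output of Example 2.
mean_square_within = 10.05   # s^2, the error mean square
n_per_group = 4              # balanced design: four batteries per model
lower, upper = -8.05, 9.55   # confidence limits for groups 1 and 2

half_width = (upper - lower) / 2.0
standard_error = math.sqrt(mean_square_within / n_per_group)
implied_q = half_width / standard_error

print("half-width:", round(half_width, 2))
print("implied q:", round(implied_q, 2))
```

The implied value is close to the figure of roughly 5.56 that Studentized range tables commonly list for the 1 percent point with 5 means and 15 degrees of freedom; the small gap is within what rounding of the printed limits to two decimals would explain.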

Syntax


Result = IMSL_ANOVA1(N, Y[, ANOVA_TABLE=variable] [, BONFERRONI=variable] [, CONFIDENCE=value] [, /DOUBLE] [, DUNN_SIDAK=variable] [, GROUP_COUNTS=variable] [, GROUP_MEANS=variable] [, GROUP_STD_DEV=variable] [, ONE_AT_A_TIME=variable] [, SCHEFFE=variable] [, TUKEY=variable])

Return Value


The p-value for the F-statistic.

Arguments


N

One-dimensional array containing the number of responses for each group.

Y

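One-dimensional array of length N[0] + N[1] + ... + N[N_ELEMENTS(N) – 1] containing the responses for each group. The responses of each group must occupy contiguous locations, with the groups in the same order as in N.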
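The relationship between the two arguments can be illustrated outside IDL. The Python sketch below splits a flat response vector into groups using the counts and computes per-group means and sample standard deviations, using the data of Example 1. The function name is hypothetical, and the handling of a group with a single response (None here) is an assumption for illustration, not a statement about what IMSL_ANOVA1 returns in GROUP_STD_DEV.

```python
import statistics


def split_groups(counts, y):
    """Split a flat response vector into per-group lists using the group counts."""
    if sum(counts) != len(y):
        raise ValueError("len(y) must equal the sum of the group counts")
    groups, start = [], 0
    for c in counts:
        groups.append(y[start:start + c])
        start += c
    return groups


n = [3, 2, 1]
y = [101.0, 105.0, 94.0, 84.0, 88.0, 32.0]
groups = split_groups(n, y)
group_means = [statistics.mean(g) for g in groups]
# Sample standard deviation (n - 1 denominator). It is undefined for a single
# response, so None is used as a placeholder (an assumption of this sketch).
group_std = [statistics.stdev(g) if len(g) > 1 else None for g in groups]

print(groups)
print(group_means)
print(group_std)
```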

Keywords


ANOVA_TABLE (optional)

Named variable into which the analysis of variance table is stored. The analysis of variance statistics are as follows:

  • 0: Degrees of freedom for the model
  • 1: Degrees of freedom for error
  • 2: Total (corrected) degrees of freedom
  • 3: Sum of squares for the model
  • 4: Sum of squares for error
  • 5: Total (corrected) sum of squares
  • 6: Model mean square
  • 7: Error mean square
  • 8: Overall F-statistic
  • 9: p-value
  • 10: R² (in percent)
  • 11: Adjusted R² (in percent)
  • 12: Estimate of the standard deviation of the within-group error
  • 13: Overall mean of y
  • 14: Coefficient of variation (in percent)
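To show how these entries relate to one another, the following Python sketch (not IDL) recomputes all of them except the p-value for the battery data of Example 2. It assumes the conventional definitions of R², adjusted R², and the coefficient of variation, which happen to reproduce the printed values; it is not a description of the IMSL implementation, and the function name is hypothetical.

```python
import math


def anova_table_entries(counts, y):
    """Return entries 0-8 and 10-14 of a one-way ANOVA table, keyed by index."""
    groups, start = [], 0
    for c in counts:
        groups.append(y[start:start + c])
        start += c
    n_total, k = len(y), len(groups)
    grand_mean = sum(y) / n_total
    means = [sum(g) / len(g) for g in groups]
    ss_model = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_error = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_model, df_error, df_total = k - 1, n_total - k, n_total - 1
    ss_total = ss_model + ss_error
    ms_model, ms_error = ss_model / df_model, ss_error / df_error
    std_error = math.sqrt(ms_error)
    return {
        0: df_model, 1: df_error, 2: df_total,
        3: ss_model, 4: ss_error, 5: ss_total,
        6: ms_model, 7: ms_error, 8: ms_model / ms_error,
        # Entry 9 (p-value) needs the F distribution and is omitted here.
        10: 100.0 * ss_model / ss_total,
        11: 100.0 * (1.0 - ms_error / (ss_total / df_total)),
        12: std_error, 13: grand_mean,
        14: 100.0 * std_error / grand_mean,
    }


n = [4, 4, 4, 4, 4]
y = [41, 43, 42, 46, 42, 43, 46, 38, 27, 26, 28, 27,
     48, 45, 51, 46, 28, 32, 37, 25]
table = anova_table_entries(n, y)
for index in sorted(table):
    print(index, round(table[index], 2))
```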

BONFERRONI (optional)

Named variable into which the array containing the statistics relating to the differences of means, computed with the Bonferroni method, is stored. On return, the named variable contains an array of size (ngroups × (ngroups – 1)/2) by 5, where ngroups = N_ELEMENTS(n). Each row corresponds to one pair of groups and contains the following columns:

  • 0: Group number for the i-th mean
  • 1: Group number for the j-th mean
  • 2: Difference of means (i-th mean) - (j-th mean)
  • 3: Lower confidence limit for the difference
  • 4: Upper confidence limit for the difference

The IMSL_ANOVA1 function computes confidence intervals on all pairwise differences of means using one of six methods: Tukey, Tukey-Kramer, Dunn-Sidák, Bonferroni, Scheffé, or Fisher's LSD (One-at-a-Time). If Tukey is specified, Tukey confidence intervals are calculated if the group sizes are equal; otherwise, the Tukey-Kramer confidence intervals are calculated.
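The row layout of the difference-of-means arrays can be sketched as follows. This Python fragment (not IDL) enumerates the group pairs in the order seen in the printed output of Example 2 and fills in the first three columns; the confidence limits in the last two columns are left out because they require distribution percentage points. The function name is hypothetical, and the 1-based group numbers follow the example output.

```python
def pairwise_differences(means):
    """Rows of (i, j, mean_i - mean_j) for all pairs i < j, with 1-based group numbers."""
    rows = []
    k = len(means)
    for i in range(k):
        for j in range(i + 1, k):
            rows.append((i + 1, j + 1, means[i] - means[j]))
    return rows


# Group means of the battery data in Example 2.
means = [43.0, 42.25, 27.0, 47.5, 30.5]
rows = pairwise_differences(means)
for i, j, diff in rows:
    print(i, j, round(diff, 2))
```

With five groups this yields 5 × 4/2 = 10 rows, and the differences match the "difference" column of the Example 2 output.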

CONFIDENCE (optional)

Confidence level, in percent, for the simultaneous interval estimation. If TUKEY is specified, CONFIDENCE must be in the range [90.0, 99.0]; otherwise, CONFIDENCE must be in the range [0.0, 100.0). Default: 95.0

DOUBLE (optional)

If present and nonzero, then double precision is used.

DUNN_SIDAK (optional)

Named variable into which the array containing the statistics relating to the differences of means, computed with the Dunn-Sidák method, is stored. On return, the named variable contains an array of size (ngroups × (ngroups – 1)/2) by 5, where ngroups = N_ELEMENTS(n). Each row corresponds to one pair of groups and contains the following columns:

  • 0: Group number for the i-th mean
  • 1: Group number for the j-th mean
  • 2: Difference of means (i-th mean) - (j-th mean)
  • 3: Lower confidence limit for the difference
  • 4: Upper confidence limit for the difference

GROUP_COUNTS (optional)

Named variable into which the array containing the number of nonmissing observations for the groups is stored.

GROUP_MEANS (optional)

Named variable into which the array containing the group means is stored.

GROUP_STD_DEV (optional)

Named variable into which the array containing the group standard deviations is stored.

ONE_AT_A_TIME (optional)

Named variable into which the array containing the statistics relating to the differences of means, computed with the One-at-a-Time t method (Fisher's LSD), is stored. On return, the named variable contains an array of size (ngroups × (ngroups – 1)/2) by 5, where ngroups = N_ELEMENTS(n). Each row corresponds to one pair of groups and contains the following columns:

  • 0: Group number for the i-th mean
  • 1: Group number for the j-th mean
  • 2: Difference of means (i-th mean) - (j-th mean)
  • 3: Lower confidence limit for the difference
  • 4: Upper confidence limit for the difference

SCHEFFE (optional)

Named variable into which the array containing the statistics relating to the differences of means, computed with the Scheffé method, is stored. On return, the named variable contains an array of size (ngroups × (ngroups – 1)/2) by 5, where ngroups = N_ELEMENTS(n). Each row corresponds to one pair of groups and contains the following columns:

  • 0: Group number for the i-th mean
  • 1: Group number for the j-th mean
  • 2: Difference of means (i-th mean) - (j-th mean)
  • 3: Lower confidence limit for the difference
  • 4: Upper confidence limit for the difference

TUKEY (optional)

Named variable into which the array containing the statistics relating to the differences of means, computed with the Tukey or Tukey-Kramer method, is stored. On return, the named variable contains an array of size (ngroups × (ngroups – 1)/2) by 5, where ngroups = N_ELEMENTS(n). Each row corresponds to one pair of groups and contains the following columns:

  • 0: Group number for the i-th mean
  • 1: Group number for the j-th mean
  • 2: Difference of means (i-th mean) - (j-th mean)
  • 3: Lower confidence limit for the difference
  • 4: Upper confidence limit for the difference

If the group sizes are equal, Tukey confidence intervals are calculated; otherwise, Tukey-Kramer confidence intervals are calculated.

Version History


6.4

Introduced