This site is no longer maintained and has been left for archival purposes

Text and links may be out of date


A multiple range test for comparing means in an analysis of variance

This procedure is broadly similar to that for calculating the LSD but it gives us more confidence in comparing the means within a table.

In calculating LSD, we first found sd² (= 2 x residual mean square / n), and from this we found sd (the standard deviation of the difference between any two means) and multiplied it by a t value (for the degrees of freedom of the residual mean square). For a multiple range test we use essentially the same procedure, but instead of a t value we use a Q value obtained from a table of "The Studentized Range".

We list our means in order of magnitude, from highest to lowest, then we test for a significant difference between the highest and lowest: the difference must be greater than Qxsd. If this is significant, we test the highest against the second lowest mean, and continue in this way until all means have been tested against one another.

However, the Q value changes each time. For the first test (highest against lowest mean) we look up the Q value for the number of treatments (i.e. for the number of means in our table of results). For the next test (second highest against lowest) we use the Q value for the number of means minus 1 (because we are excluding the highest mean now), and so on. The degrees of freedom do not change: they are always the df of the residual (error) mean square. Although each step in this procedure is simple, you need to be organised for testing each mean against all others - see the suggested procedure for comparing all means, further down this page, for the best way to do this.

Having done an analysis like this, most people construct a table as follows, using letters to show which treatments differ from others. They would say that treatments that are not followed by the same letter differ significantly from one another (P = 0.05).

For example, in the fictitious table below, the means for pH 4 and 5 do not differ from one another but differ from the means for all other pH values. The mean for pH 6 differs from the means at all other pH values. The means for pH 7 and 8 do not differ from one another but differ from all other means, and the mean for pH 9 differs from all others.

Treatment   Mean
pH 4        36 a
pH 5        35 a
pH 6        30 b
pH 7        20 c
pH 8        17 c
pH 9        10 d

We will now apply the Multiple Range Test to the data on bacterial biomass that we analysed by ANOVA.

Summary table of data:

              Bacterium A   Bacterium B   Bacterium C
Replicate 1   12            20            40
Replicate 2   15            19            35
Replicate 3   9             23            42
n             3             3             3
Mean          12            20.7          39

Analysis of variance table:

Source of variance   Sum of squares   Degrees of freedom   Mean square
Between treatments   1140.2           2                    570.1
Residual             52.7             6                    8.78
Total                1192.9           8
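The figures in the analysis of variance table can be reproduced from the raw replicate values. The following is a minimal sketch in Python, not part of the original page; the function name `one_way_anova` and the dictionary layout are illustrative assumptions.

```python
# One-way ANOVA "by hand" for the bacterial biomass data in the summary table.
# The function and variable names are illustrative, not from the original page.
data = {
    "Bacterium A": [12, 15, 9],
    "Bacterium B": [20, 19, 23],
    "Bacterium C": [40, 35, 42],
}


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom and mean squares."""
    all_values = [x for values in groups.values() for x in values]
    # Correction term: (sum of all values) squared, divided by N.
    correction = sum(all_values) ** 2 / len(all_values)

    ss_total = sum(x ** 2 for x in all_values) - correction
    ss_between = sum(sum(v) ** 2 / len(v) for v in groups.values()) - correction
    ss_residual = ss_total - ss_between

    df_between = len(groups) - 1
    df_total = len(all_values) - 1
    df_residual = df_total - df_between

    return {
        "ss_between": ss_between,
        "ss_residual": ss_residual,
        "ss_total": ss_total,
        "df_between": df_between,
        "df_residual": df_residual,
        "df_total": df_total,
        "ms_between": ss_between / df_between,
        "ms_residual": ss_residual / df_residual,
    }


anova = one_way_anova(data)
for name, value in anova.items():
    print(f"{name}: {value:.2f}")
```

The residual mean square printed here is the quantity that feeds into sd² in the next step.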

sd² = 2 x residual mean square / n = 17.56 / 3 = 5.85.

sd = 2.42 (obtained as the square root of sd²).
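As a quick check of this arithmetic, here is a small Python sketch (the helper name `sd_of_difference` is an assumption made for illustration) that takes the residual mean square and the number of replicates and returns both the variance and the standard deviation of the difference between two means.

```python
import math


def sd_of_difference(residual_ms, n):
    """Return (sd squared, sd) for the difference between two treatment means."""
    sd_squared = 2 * residual_ms / n
    return sd_squared, math.sqrt(sd_squared)


# Residual mean square 8.78 and n = 3 replicates, as in the worked example.
sd_squared, sd = sd_of_difference(8.78, 3)
print(f"sd squared = {sd_squared:.2f}, sd = {sd:.2f}")
# prints: sd squared = 5.85, sd = 2.42
```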

First: Rank the treatments from highest to lowest mean. Then compare the highest (39) with the lowest (12) mean:

With 3 treatments, and 6 degrees of freedom for the residual mean square, we have a Q value of 4.34. So, Q(sd) = 4.34 x 2.42 = 10.50. The difference between the highest and lowest means (39 - 12 = 27) is greater than this, so the biomass of Bacterium C (the highest) differs significantly from that of Bacterium A (the lowest).

Second: compare the second highest (20.7) with the lowest (12) mean:

With 2 treatments (because we are excluding the highest mean now), and 6 degrees of freedom for the residual mean square, we have a Q value of 3.46. So, Q(sd) = 3.46 x 2.42 = 8.37. The difference between means is greater than this (but only just), so the biomass of Bacterium B differs significantly from Bacterium A.

Third: continue in this way, down the table of ranked means, until you get a non-significant result. [In our case, we have reached the end because there are only 3 treatments]

Fourth: [This is not necessary in our case because we ran out of means!] Compare the second lowest mean with the highest, then continue with the second lowest and second highest, etc.

Comparison of the Multiple Range Test with the Least Significant Difference

It is interesting to compare the findings of these two types of test. Using the LSD method, we found that the least significant difference between any two means would need to be t(sd) = 2.45 x 2.42 = 5.92.

Using the Multiple Range Test, we had to meet stricter criteria: the highest and lowest means had to differ by 10.50 (4.34 x 2.42), and the second-highest and lowest means had to differ by 8.37 (3.46 x 2.42). If we had had more means to compare (e.g. the third-highest and lowest) then the critical value would have been reduced again. The Multiple Range Test is much more discriminating than the LSD - it greatly reduces the chance of error when making multiple comparisons between treatments.

Suggested procedure for comparing all means

Suppose that we did an ANOVA on 5 treatments, with 4 replicates per treatment, and found means of 36, 42, 74, 10, 80. We calculated sd as 2.50. With 5 treatments and 4 replicates we have 15 degrees of freedom for the residual (error) mean square.

We will need the Q values for comparing 5, then 4, then 3, then 2 treatments (always with 15 df). It is sensible to make a small table:

No. of treatments   5        4       3       2
Q =                 4.37     4.08    3.67    3.01
Qxsd =              10.925   10.20   9.175   7.525
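The small table above is just each Q value multiplied by sd. A brief Python sketch of that step follows; the Q values are copied from the table (they still have to be looked up in a Studentized Range table), and the variable names are illustrative.

```python
# Q values for 5, 4, 3 and 2 treatments, copied from the table in the text.
q_values = {5: 4.37, 4: 4.08, 3: 3.67, 2: 3.01}
sd = 2.50  # standard deviation of the difference between two means

# Critical difference (Q x sd) for each number of treatments being spanned.
critical = {k: q * sd for k, q in q_values.items()}

for k in sorted(critical, reverse=True):
    print(f"{k} treatments: Q = {q_values[k]:.2f}, Q x sd = {critical[k]:.3f}")
```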

Now we can rank our means from highest to lowest and see if they differ by Qxsd. Again make a table (see below). Start in box 1, comparing the means 80 and 10. These differ by more than 10.925 (Qxsd for 5 treatments) so we insert * in box 1. Repeat this for box 2 (comparing means 80 and 36, using Q for 4 treatments). Again insert *. We would also get a significant difference in box 3 (means 80 and 42) but not in box 4, so we insert ns (not significant).

Now start in the next column (boxes 5, 6 and 7) then the third column (boxes 8, 9) and the fourth column.

74                                                  box 4 ns
42                                       box 7 *    box 3 *
36                            box 9 ns   box 6 *    box 2 *
10                 box 10 *   box 8 *    box 5 *    box 1 *
        10         36         42         74         80         Mean
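The box-by-box comparisons can also be sketched in Python. This is an illustrative sketch rather than a definitive implementation: the function name `comparison_grid` is an assumption, the critical differences are the Qxsd values from the small table, and every pair is tested (as in the table of boxes) rather than stopping at the first non-significant result.

```python
means = [36, 42, 74, 10, 80]
# Qxsd for 5, 4, 3 and 2 treatments, taken from the small table in the text.
critical = {5: 10.925, 4: 10.20, 3: 9.175, 2: 7.525}


def comparison_grid(means, critical):
    """Compare every pair of ranked means, in the same order as boxes 1-10."""
    ranked = sorted(means, reverse=True)
    grid = {}
    for i, high in enumerate(ranked):
        # Work from the lowest mean upwards, as in each column of the table.
        for j in range(len(ranked) - 1, i, -1):
            low = ranked[j]
            span = j - i + 1  # number of ranked means covered by this comparison
            grid[(high, low)] = "*" if high - low > critical[span] else "ns"
    return grid


grid = comparison_grid(means, critical)
for box, ((high, low), verdict) in enumerate(grid.items(), start=1):
    print(f"box {box}: {high} vs {low} -> {verdict}")
```

The key point is that the critical difference is chosen by the span of ranked means covered by each comparison, not by the total number of treatments.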

Finally, we put letters against the means to show significant differences. We state that: means followed by the same letter do not differ significantly from one another (p = 0.05). Do this in a series of steps as shown below, to reach the letters in the final column. These are the letters that we retain from the early steps if they are not contradicted by a later step. [Of course, having done this, we can present our means in any order, with their letters; we do not need to keep them in ranked order.]

Means   Step 1             Step 2             Step 3            Final
        (from boxes 1-4)   (from boxes 5-7)   (boxes 8,9,10)
80      a                                                       a
74      a                  c                                    a
42      b                  d                  e                 b
36      b                  d                  e                 b
10      b                  d                  f                 d
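One way to arrive at the final grouping automatically is sketched below in Python. It is an assumption-laden illustration, not the step-by-step method of the table: it assumes all means are distinct, the function name `assign_letters` is hypothetical, and it labels groups consecutively (a, b, c), so the lowest mean receives "c" where the table shows "d". The letters themselves are arbitrary; only the grouping matters.

```python
means = [36, 42, 74, 10, 80]
# Qxsd for 5, 4, 3 and 2 treatments, taken from the small table in the text.
critical = {5: 10.925, 4: 10.20, 3: 9.175, 2: 7.525}


def assign_letters(means, critical):
    """Give the same letter to ranked means that do not differ significantly."""
    ranked = sorted(means, reverse=True)
    n = len(ranked)
    groups = []  # (start, end) index ranges of non-significant runs
    for i in range(n):
        j = i
        # Extend the run downwards while the range i..j+1 is not significant.
        while j + 1 < n and ranked[i] - ranked[j + 1] <= critical[j - i + 2]:
            j += 1
        # Keep the run only if it reaches beyond the previous run kept.
        if not groups or j > groups[-1][1]:
            groups.append((i, j))

    letters = {m: "" for m in ranked}
    for label, (start, end) in zip("abcdefghijklmnopqrstuvwxyz", groups):
        for m in ranked[start:end + 1]:
            letters[m] += label
    return letters


letters = assign_letters(means, critical)
for mean, label in letters.items():
    print(mean, label)
```

A mean that belongs to two overlapping groups would receive two letters (for example "ab"), which is the usual convention for such tables.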

Transformation of data

During ANOVA we do an Fmax test to check for homogeneity of variance, i.e. to check that it is safe to pool all the treatment variances - an essential condition for performing an Analysis of Variance.
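As a rough illustration, Fmax is taken here to be the ratio of the largest to the smallest treatment variance (an assumption, since the formula is not restated on this page). The sketch below uses the bacterial biomass data from the summary table purely as an example; the resulting ratio still has to be compared with a tabulated Fmax value, which is not reproduced here.

```python
from statistics import variance

# Replicate values from the summary table, used only as an illustration.
data = {
    "Bacterium A": [12, 15, 9],
    "Bacterium B": [20, 19, 23],
    "Bacterium C": [40, 35, 42],
}

# Sample variance (n - 1 in the denominator) of each treatment.
variances = {name: variance(values) for name, values in data.items()}

# Assumed definition: largest treatment variance / smallest treatment variance.
f_max = max(variances.values()) / min(variances.values())

for name, var in variances.items():
    print(f"{name}: variance = {var:.2f}")
print(f"Fmax = {f_max:.2f}")
```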

What should we do if the Fmax test shows a major discrepancy in the variances, thereby invalidating ANOVA?

The answer is to use some mathematical transformation of the original data, then perform ANOVA with the transformed data. There are several types of transformation, each most appropriate for particular circumstances.

1. When our data consist of small, whole-numbered counts the variance is often proportional to the mean. This is overcome by converting each value (X) to √X and analysing the √X data. If the counts are low and contain zeros then use √(X + 0.5).

2. More generally, it is appropriate to use log10 X, or log10 (X+1) if there are zero values.

3. Percentages and proportions (after multiplying by 100) can be converted to arcsin values.
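The three transformations in the list above can be sketched in Python as follows. Two assumptions are made and should be checked against your own source: the first transformation is taken to be the square root (which matches the transformed means in the table further down, e.g. 20 and 4.47), and the "arcsin value" is taken to be the common angular transformation, the arcsine of the square root of the proportion, expressed in degrees. The function names are illustrative.

```python
import math


def sqrt_transform(x, add_half=False):
    """Square root of a count; use add_half=True for low counts with zeros."""
    return math.sqrt(x + 0.5) if add_half else math.sqrt(x)


def log_transform(x, add_one=False):
    """log10 of a value; use add_one=True when the data contain zeros."""
    return math.log10(x + 1) if add_one else math.log10(x)


def arcsin_transform(percentage):
    """Assumed angular transformation: arcsin(sqrt(proportion)), in degrees."""
    return math.degrees(math.asin(math.sqrt(percentage / 100)))


counts = [20, 10, 15]
print([round(sqrt_transform(c), 2) for c in counts])
# prints: [4.47, 3.16, 3.87]
print(round(log_transform(0, add_one=True), 2))
print(round(arcsin_transform(50), 1))
```

Whichever transformation is chosen, it is applied to every individual value before the ANOVA is run, not to the treatment means afterwards.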

In all these cases the transformed data are analysed in exactly the same way as in a normal ANOVA, and we can use LSD or a multiple range test, as we did before, to test for significant differences between the treatment means. BUT remember that these tests tell us the difference between the transformed values, and it is not valid to de-transform an LSD and show it as a significant difference between the ‘true’ means. This problem does not arise with a multiple range test, where we use letters to show significant differences.

The way to overcome this is to present the data in a table as follows, showing both the true means and the transformed means, and the LSD that applies to the transformed means:

Treatment   Mean (with log10 (X+1), arcsin or √X in parentheses)
1           20 (4.47)
2           10 (3.16)
3           15 (3.87)
5% LSD      (0.37)


