This site is no longer maintained and has been left for archival purposes

Text and links may be out of date

TWO-WAY ANOVA

Analysis of variance (ANOVA) for factorial combinations of treatments

Elsewhere on this site we have dealt with ANOVA for simple comparisons of treatments. We can also use ANOVA for combinations of treatments, where two factors (e.g. pH and temperature) are applied in every possible combination. These are called factorial designs, and we can analyse them even if we do not have replicates.

This type of analysis is called TWO-WAY ANOVA.

Suppose that we have grown one bacterium in broth culture at 3 different pH levels and 4 different temperatures, in every combination. We have 12 flasks in all, but no replicates. Growth was measured by optical density (O.D.).

Construct a table as follows (O.D. is given in fictitious whole numbers here for convenience).

Temp (°C)   pH 5.5   pH 6.5   pH 7.5
25          10       19       40
30          15       25       45
35          20       30       55
40          15       22       40

Then calculate the following (see the worked example and the output from Microsoft "Excel").

(a) Σx, Σx², (Σx)²/n and the mean (x̄) for each column in the table.

(b) Σx, Σx², (Σx)²/n and the mean (x̄) for each row.

(c) Find the grand total by adding all Σx for columns (it should be the same for rows). Square this grand total and then divide by uv, where u is the number of data entries in each row, and v is the number of data entries in each column. Call this value D; in our example it is (336)² / 12 = 9408.

(d) Find the sum of Σx² values for columns; call this A. It will be the same for Σx² of rows. In our example it is 11570.

(e) Find the sum of (Σx)²/n values for columns; call this B. In our example it is 11304.

(f) Find the sum of (Σx)²/n values for rows; call this C. In our example it is 9646 (each row value having been rounded to a whole number; the unrounded sum is 9646.67).

(g) Set out a table of analysis of variance as follows:

Source of variance   Sum of squares (S of S)   Degrees of freedom (df)*   Mean square (= S of S / df)
Between columns      B - D (1896)              u - 1 (= 2)                948
Between rows         C - D (238)               v - 1 (= 3)                79.3
Residual             *** (28)                  (u-1)(v-1) (= 6)           4.67
Total                A - D (2162)              (uv)-1 (= 11)              196.5

* Where u is the number of data entries in each row, and v is the number of data entries in each column; note that the total df is always one fewer than the total number of entries in the table of data.

*** Obtained by subtracting the between-columns and between-rows sums of squares from total sum of squares.
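Steps (a) to (g) can also be checked with a short script. The following is a minimal sketch in Python (it is not part of the original worked example, and the variable names are chosen here purely for illustration). It computes A, B, C and D and the sums of squares without rounding any intermediate value, so the between-rows and residual figures come out as 238.67 and 27.33 rather than the rounded 238 and 28 shown in the table.

```python
# Optical density readings from the table above:
# rows = temperatures (25, 30, 35, 40 °C), columns = pH 5.5, 6.5, 7.5.
data = [
    [10, 19, 40],
    [15, 25, 45],
    [20, 30, 55],
    [15, 22, 40],
]

u = len(data[0])  # number of data entries in each row (3)
v = len(data)     # number of data entries in each column (4)

row_sums = [sum(row) for row in data]
col_sums = [sum(col) for col in zip(*data)]
grand_total = sum(row_sums)

D = grand_total ** 2 / (u * v)               # step (c)
A = sum(x * x for row in data for x in row)  # step (d): sum of all squared values
B = sum(s * s / v for s in col_sums)         # step (e): each column has n = v entries
C = sum(s * s / u for s in row_sums)         # step (f): each row has n = u entries

# Step (g): sums of squares and mean squares.
ss_columns = B - D
ss_rows = C - D
ss_total = A - D
ss_residual = ss_total - ss_columns - ss_rows

ms_columns = ss_columns / (u - 1)
ms_rows = ss_rows / (v - 1)
ms_residual = ss_residual / ((u - 1) * (v - 1))

print(f"A = {A}, B = {B:.2f}, C = {C:.2f}, D = {D:.2f}")
print(f"Between columns: SS = {ss_columns:.2f}, MS = {ms_columns:.2f}")
print(f"Between rows:    SS = {ss_rows:.2f}, MS = {ms_rows:.2f}")
print(f"Residual:        SS = {ss_residual:.2f}, MS = {ms_residual:.2f}")
print(f"Total:           SS = {ss_total:.2f}")
```

Keeping the unrounded values of C and of the residual is a choice made for this sketch; rounding each row value first, as in the hand calculation, gives the figures in the table.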

Now do a variance ratio test to obtain F values:

(1) For between columns (pH): F = Between columns mean square / Residual mean square

= 948 / 4.67 = 203

(2) For between rows (temperature): F = Between rows mean square / Residual mean square

= 79.3 / 4.67 = 17.0

In each case, consult a table of F (p = 0.05 or p = 0.01 or p = 0.001), where u is the between-treatments df (columns or rows, as appropriate) and v is the residual df. (In the F table, u and v denote degrees of freedom, not the numbers of data entries as in the analysis of variance table above.) If the calculated F value exceeds the tabulated value then the treatment effect (temperature or pH) is significant at that probability level.

In our example, for the effect of pH (u is 2 df, v is 6 df) the critical F value at p = 0.05 is 5.14. Our calculated F value (203) is far larger than this; in fact, the effect of pH is significant at p = 0.001. For the effect of temperature (u is 3 df, v is 6 df) the critical F value at p = 0.05 is 4.76. Our calculated F value (17.0) exceeds this, and we find that the effect of temperature is significant at p = 0.01.
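The variance ratio test can be sketched in a few lines of Python. This is an illustration only: the helper name `variance_ratio` is made up for this example, the sums of squares are the unrounded values (1896, 238.67 and 27.33), and the critical values are simply the p = 0.05 figures quoted in the text rather than values looked up by the code.

```python
def variance_ratio(ss_treatment, df_treatment, ss_residual, df_residual):
    """Return F = treatment mean square / residual mean square."""
    return (ss_treatment / df_treatment) / (ss_residual / df_residual)


# Unrounded sums of squares for the worked example.
ss_columns = 1896.0       # pH, 2 df
ss_rows = 716 / 3         # temperature, 3 df (= 238.67)
ss_residual = 82 / 3      # residual, 6 df (= 27.33)

f_ph = variance_ratio(ss_columns, 2, ss_residual, 6)
f_temperature = variance_ratio(ss_rows, 3, ss_residual, 6)

# Critical F values at p = 0.05, copied from the text above.
critical_f = {"pH": 5.14, "temperature": 4.76}

for name, f_value in (("pH", f_ph), ("temperature", f_temperature)):
    verdict = "significant" if f_value > critical_f[name] else "not significant"
    print(f"{name}: F = {f_value:.2f}, critical F = {critical_f[name]} -> {verdict} at p = 0.05")
```

With unrounded inputs the ratios are about 208.1 for pH and 17.46 for temperature, slightly higher than the hand-calculated 203 and 17.0, which were based on rounded sums of squares.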

Worked example:

                pH 5.5   pH 6.5   pH 7.5   Σx, Rows            n (= u)   x̄       Σx²     (Σx)²/n
25°C            10       19       40       69                  3         23      2061    1587
30°C            15       25       45       85                  3         28.33   2875    2408
35°C            20       30       55       105                 3         35      4325    3675
40°C            15       22       40       77                  3         25.67   2309    1976
Σx, Columns     60       96       180      336 (grand total)                             Total C: 9646
n (= v)         4        4        4
x̄               15       24       45
Σx²             950      2370     8250     Total A: 11570
(Σx)²/n         900      2304     8100     Total B: 11304
 

Below, we see a print-out of this analysis from "Excel".

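We select Anova: Two-Factor Without Replication from the analysis tools package. Note that the Anova table gives Source of Variation separately for Rows, Columns and Error (= Residual). The values differ slightly from the hand calculation above (for example, F for columns is 208.1 rather than 203) because the hand calculation rounded intermediate values to whole numbers.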

 

         pH 5.5   pH 6.5   pH 7.5
25°C     10       19       40
30°C     15       25       45
35°C     20       30       55
40°C     15       22       40

Anova: Two-Factor Without Replication

SUMMARY    Count   Sum   Average    Variance
Row 1      3       69    23         237
Row 2      3       85    28.33333   233.3333
Row 3      3       105   35         325
Row 4      3       77    25.66667   166.3333

Column 1   4       60    15         16.66667
Column 2   4       96    24         22
Column 3   4       180   45         50

ANOVA
Source of Variation   SS         df   MS         F          P-value    F crit
Rows                  238.6667   3    79.55556   17.46341   0.00228    4.757055
Columns               1896       2    948        208.0976   2.87E-06   5.143249
Error                 27.33333   6    4.555556
Total                 2162       11

Of interest, another piece of information is revealed by this analysis: the effects of temperature do not interact with the effects of pH. In other words, a change of temperature does not change the response to pH, and vice versa. We can deduce this because the residual (error) mean square (MS) is small compared with the mean squares for temperature (rows) or pH (columns). [A low residual mean square tells us that most variation in the data is accounted for by the separate effects of temperature and pH.]
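One way to see what "no interaction" means is to predict each flask from its temperature (row) mean and its pH (column) mean alone, and look at what is left over. The following is a minimal sketch in Python, assuming a purely additive model in which the expected value of a cell is row mean + column mean - grand mean. This formulation is an illustration added here and does not come from the Excel print-out. The sum of the squared left-overs is the residual sum of squares.

```python
# First data set: rows = 25, 30, 35, 40 °C; columns = pH 5.5, 6.5, 7.5.
data = [
    [10, 19, 40],
    [15, 25, 45],
    [20, 30, 55],
    [15, 22, 40],
]

n_rows, n_cols = len(data), len(data[0])
grand_mean = sum(sum(row) for row in data) / (n_rows * n_cols)
row_means = [sum(row) / n_cols for row in data]
col_means = [sum(col) / n_rows for col in zip(*data)]

# Expected value of each cell if temperature and pH act independently.
expected = [[rm + cm - grand_mean for cm in col_means] for rm in row_means]

# What the separate effects fail to explain.
residuals = [
    [obs - exp for obs, exp in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(data, expected)
]
ss_residual = sum(r * r for row in residuals for r in row)

for obs_row, exp_row in zip(data, expected):
    print(obs_row, [round(e, 1) for e in exp_row])
print(f"Residual sum of squares = {ss_residual:.2f}")
```

For these data the residual sum of squares is about 27.33, the same as the Error line of the print-out: the observed values lie close to the additive predictions.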

But suppose that our data were as follows:

Temp (°C)   pH 5.5   pH 6.5   pH 7.5
25          10       19       40
30          15       25       30
35          20       30       25
40          25       22       10

Here an increase of temperature increases growth at low pH but decreases growth at high pH. If we analysed these data we would probably find no significant effect of temperature or pH, because these factors interact to influence growth. The residual mean square would be very large. This type of result is not uncommon - for example, patients' age might affect their susceptibility to levels of stress. Inspection of our data strongly suggests that there is interaction. To analyse it, we would need to repeat the experiment with two replicates, then use a slightly more complex analysis of variance to test for (1) separate temperature effects, (2) separate pH effects, and (3) significant effects of interaction.
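The claim about these data can be checked by running the same no-replication calculation on them. The sketch below is in Python; `anova_no_replication` is a hypothetical helper written for this illustration, and the critical values mentioned afterwards are the p = 0.05 figures quoted earlier for the same degrees of freedom.

```python
def anova_no_replication(data):
    """Two-way ANOVA without replication; rows and columns are the two factors."""
    u = len(data[0])  # number of data entries in each row
    v = len(data)     # number of data entries in each column
    row_sums = [sum(row) for row in data]
    col_sums = [sum(col) for col in zip(*data)]
    D = sum(row_sums) ** 2 / (u * v)
    A = sum(x * x for row in data for x in row)
    B = sum(s * s / v for s in col_sums)
    C = sum(s * s / u for s in row_sums)
    ms_columns = (B - D) / (u - 1)
    ms_rows = (C - D) / (v - 1)
    # Residual S of S = total - between columns - between rows.
    ms_residual = (A - B - C + D) / ((u - 1) * (v - 1))
    return {
        "ms_rows": ms_rows,
        "ms_columns": ms_columns,
        "ms_residual": ms_residual,
        "F_rows": ms_rows / ms_residual,
        "F_columns": ms_columns / ms_residual,
    }


# Second data set: rows = 25, 30, 35, 40 °C; columns = pH 5.5, 6.5, 7.5.
interacting = [
    [10, 19, 40],
    [15, 25, 30],
    [20, 30, 25],
    [25, 22, 10],
]

result = anova_no_replication(interacting)
for key, value in result.items():
    print(f"{key}: {value:.3f}")
```

The residual mean square comes out at about 100, and both F values are below 1, far short of the critical values of 4.76 (temperature) and 5.14 (pH).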

As an example, below is shown a print-out from "Excel" of the following table, where I have assumed that we did the experiment above with replication.

Temp (°C)   Replicate   pH 5.5   pH 6.5   pH 7.5
25          rep 1       9        18       36
            rep 2       11       20       44
30          rep 1       13       23       27
            rep 2       17       27       33
35          rep 1       18       27       23
            rep 2       22       33       27
40          rep 1       22       20       7
            rep 2       28       24       13

The procedure in "Excel" is as follows.

1. Enter the replicates as separate rows.

2. From the analysis tools menu, choose Anova: Two-Factor with Replication.

3. Insert all the cells of the table in Input range (Anova assumes that column A and row 1 are used for headings).

4. Enter "2" (in our case) where asked for "Rows per sample".

In the table displayed on the screen (see below) the analysis shows the means for each temperature and each pH. It also tells us the following (see the bottom rows of the table).

(i) There is no significant difference between temperatures overall ("Excel" has called the temperature "Sample") because the calculated F value (3.148) is less than the critical F value (3.49).

(ii) There is a very highly significant (p = 0.0009) effect of pH ("Columns") overall.

(iii) There is a very highly significant interaction (p = 0.0000397) between temperature and pH. In other words, the response to pH depends on the temperature, or vice versa. This might have been the purpose of doing the experiment: to see how the organism behaves when subjected to combinations of factors.

Temp    pH 5.5   pH 6.5   pH 7.5
25°C    9        18       36
25°C    11       20       44
30°C    13       23       27
30°C    17       27       33
35°C    18       27       23
35°C    22       33       27
40°C    22       20       7
40°C    28       24       13

Anova: Two-Factor With Replication

SUMMARY     pH 5.5     pH 6.5   pH 7.5     Total

25°C
Count       2          2        2          6
Sum         20         38       80         138
Average     10         19       40         23
Variance    2          2        32         196.8

30°C
Count       2          2        2          6
Sum         30         50       60         140
Average     15         25       30         23.33333
Variance    8          8        18         53.46667

35°C
Count       2          2        2          6
Sum         40         60       50         150
Average     20         30       25         25
Variance    8          18       8          26.8

40°C
Count       2          2        2          6
Sum         50         44       20         114
Average     25         22       10         19
Variance    18         8        18         59.2

Total
Count       8          8        8
Sum         140        192      210
Average     17.5       24       26.25
Variance    40.85714   24       144.7857

ANOVA
Source        SS         df   MS         F          P-value    F crit
Sample        116.5      3    38.83333   3.148649   0.064794   3.4903
Columns       330.3333   2    165.1667   13.39189   0.000877   3.88529
Interaction   1203       6    200.5      16.25676   3.97E-05   2.996117
Within        148        12   12.33333
Total         1797.833   23
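The sums of squares in this print-out can be reproduced by hand or with a short script. The following Python sketch is one way to do it, assuming equal numbers of replicates in every cell; the variable names are chosen for this illustration and are not Excel's. The interaction sum of squares is what remains of the between-cells variation after the separate temperature ("Sample") and pH ("Columns") effects are removed.

```python
# Replicated data: cells[temperature][pH] = [rep 1, rep 2].
cells = [
    [[9, 11], [18, 20], [36, 44]],   # 25 °C
    [[13, 17], [23, 27], [27, 33]],  # 30 °C
    [[18, 22], [27, 33], [23, 27]],  # 35 °C
    [[22, 28], [20, 24], [7, 13]],   # 40 °C
]

a = len(cells)        # number of temperatures ("Sample")
b = len(cells[0])     # number of pH levels ("Columns")
r = len(cells[0][0])  # replicates per cell ("Rows per sample")

all_values = [x for row in cells for cell in row for x in cell]
correction = sum(all_values) ** 2 / (a * b * r)

ss_total = sum(x * x for x in all_values) - correction
ss_sample = sum(sum(sum(cell) for cell in row) ** 2 for row in cells) / (b * r) - correction
ss_columns = sum(sum(sum(row[j]) for row in cells) ** 2 for j in range(b)) / (a * r) - correction
ss_cells = sum(sum(cell) ** 2 for row in cells for cell in row) / r - correction
ss_interaction = ss_cells - ss_sample - ss_columns
ss_within = ss_total - ss_cells

# Each F value is a mean square divided by the within-cells mean square.
ms_within = ss_within / (a * b * (r - 1))
f_sample = (ss_sample / (a - 1)) / ms_within
f_columns = (ss_columns / (b - 1)) / ms_within
f_interaction = (ss_interaction / ((a - 1) * (b - 1))) / ms_within

print(f"Sample:      SS = {ss_sample:.4f}, F = {f_sample:.4f}")
print(f"Columns:     SS = {ss_columns:.4f}, F = {f_columns:.4f}")
print(f"Interaction: SS = {ss_interaction:.4f}, F = {f_interaction:.4f}")
print(f"Within:      SS = {ss_within:.4f}")
print(f"Total:       SS = {ss_total:.4f}")
```

The P-value and F crit columns are not reproduced here, because they need the F distribution; they can be read from a table of F or from a statistics package.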

       

Note: Because there is so much interaction, it is difficult to analyse the separate effects of temperature and pH. We should repeat the analysis, using separate parts of the data: for example, ANOVA for all the pH treatments at 25°C, then at 30°C, then 35°C and 40°C. Alternatively, we could assemble all the means (there are 12) in ranked order and do a multiple range test to find significant differences.
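As a rough illustration of the first suggestion, the sketch below runs a separate one-way ANOVA of the three pH treatments at each temperature, using the two replicates in each cell. It is written in Python; `one_way_anova` is a hypothetical helper written for this example, and treating each temperature on its own (with only 3 residual df) is an assumption made here, not a procedure specified in the text.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of treatment groups."""
    values = [x for group in groups for x in group]
    correction = sum(values) ** 2 / len(values)
    ss_total = sum(x * x for x in values) - correction
    ss_between = sum(sum(group) ** 2 / len(group) for group in groups) - correction
    ss_within = ss_total - ss_between
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Replicates for pH 5.5, 6.5 and 7.5 at each temperature (°C).
replicates = {
    "25": [[9, 11], [18, 20], [36, 44]],
    "30": [[13, 17], [23, 27], [27, 33]],
    "35": [[18, 22], [27, 33], [23, 27]],
    "40": [[22, 28], [20, 24], [7, 13]],
}

results = {temp: one_way_anova(groups) for temp, groups in replicates.items()}
for temp, (f_value, df_between, df_within) in results.items():
    print(f"{temp} °C: F = {f_value:.2f} with {df_between} and {df_within} df")
```

Each F value would then be compared with the tabulated F for 2 and 3 degrees of freedom at the chosen probability level.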

