10. Factorial Experiments

A factor is a discrete variable used to classify experimental units. For example, “Gender” might be a factor with two levels, “male” and “female”, and “Diet” might be a factor with three levels, “low”, “medium” and “high” protein. The levels within each factor can be discrete, such as “Drug A” and “Drug B”, or they may be quantitative, such as 0, 10, 20 and 30 mg/kg.

A factorial design is one involving two or more factors in a single experiment. Such designs are classified by the number of levels of each factor and the number of factors. So a 2x2 factorial will have two factors each at two levels, and a 2x2x2 (or 2^3) factorial will have three factors each at two levels.
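The naming convention can be made concrete by listing every combination of factor levels. The following is a minimal sketch only; the function name and the example factors ("Sex" and "Treatment") are illustrative assumptions, not part of any particular study.

```python
from itertools import product


def treatment_combinations(factors):
    """Return every combination of factor levels as a list of dicts.

    `factors` maps a factor name to its list of levels.
    """
    names = list(factors)
    level_lists = [factors[name] for name in names]
    return [dict(zip(names, levels)) for levels in product(*level_lists)]


# Hypothetical 2x2 design: two factors, each at two levels.
two_by_two = {
    "Sex": ["male", "female"],
    "Treatment": ["control", "treated"],
}

combos = treatment_combinations(two_by_two)
for combo in combos:
    print(combo)
print(len(combos), "treatment groups")
```

The number of treatment groups is the product of the numbers of levels, so adding a third two-level factor to this sketch would give 2x2x2 = 8 groups.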

Typically, there are many factors such as gender, genotype, diet, housing conditions, experimental protocols, social interactions and age which can influence the outcome of an experiment. These often need to be investigated in order to determine the generality of a response. It may be important to know whether a response is seen only in, say, females but not males. One way to find out would be to do separate experiments in each sex. This “OVAT” or “One Variable At a Time” approach is, however, very wasteful of scientific resources. A much better alternative is to include both sexes, or more than one strain etc., in a single “factorial” experiment. Such designs can include several factors without using excessive numbers of experimental subjects.

Factorial designs are efficient and provide extra information (the interactions between the factors), which cannot be obtained when using single factor designs.

Split plot designs are considered at the end of this section. They are like a cross between a factorial and a randomised block design. They were derived from agricultural research in which it was sometimes impossible to irrigate, say, a small plot without affecting the adjacent plots. So irrigated plots were large and covered several smaller plots comparing, say, planting distance.

According to RA Fisher:

“If the investigator confines his attention to any single factor we may infer either that he is the unfortunate victim of a doctrinaire theory as to how experimentation should proceed, or that the time, material or equipment at his disposal is too limited to allow him to give attention to more than one aspect of his problem.....

.... Indeed in a wide class of cases (by using factorial designs) an experimental investigation, at the same time as it is made more comprehensive, may also be made more efficient if by more efficient we mean that more knowledge and a higher degree of precision are obtainable by the same number of observations.”

(Fisher RA. 1960. The design of experiments. New York: Hafner Publishing Company, Inc. 248 p.)

Unfortunately, although such designs are widely used, they are often incorrectly analysed. A survey found the following:

Number of studies      513
Factorial designs      153 (30%)
Correctly analysed     78 (50%) 

(Nieuwenhuis et al., 2011, Nature Neurosci. 14:1105)


Assuming that the animal is the experimental unit, the experiment on the right has two factors, the treatment (Control versus Treated, represented by the two columns) and the colour (White versus Green). The colour might represent the two sexes, or two strains, or two diets, or any other factor of possible interest.

The aim is usually to see whether two factors are independent.

This is a 2x2 factorial because there are two factors each at two levels. Using the Resource Equation method of sample size determination there are 16 animals and 4 groups, so E=16-4=12, which, though small, is satisfactory.
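The Resource Equation arithmetic used above can be written as a small helper. This is only a sketch of the simple form quoted in the text (E = number of experimental units minus number of treatment groups); the function name is an assumption made for illustration.

```python
def resource_equation(total_units, treatment_groups):
    """Return E = total experimental units - number of treatment groups."""
    if treatment_groups < 1:
        raise ValueError("there must be at least one treatment group")
    if total_units < treatment_groups:
        raise ValueError("need at least one experimental unit per group")
    return total_units - treatment_groups


# The 2x2 factorial described above: 16 animals in 4 groups.
e_2x2 = resource_equation(total_units=16, treatment_groups=4)
print("E =", e_2x2)
```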

Factorial designs are powerful because differences among the levels of each factor are determined by averaging across all other factors. Thus, if the columns in the figure on the right represent “Treated” and “Control”, the means are estimated by averaging across the two colours, which might represent males and females. This assumes that the males and females respond in the same way to the treatment, an assumption that is tested in the statistical analysis using a two-way analysis of variance with interaction.

If the two sexes do not respond in the same way then this is known as an “interaction” and the differences will need to be looked at separately for each sex. However, this would be useful information which could not be obtained by doing separate experiments on each sex.
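In a 2x2 design the interaction can be summarised as a "difference of differences": the treatment effect in one sex minus the treatment effect in the other. The sketch below illustrates the arithmetic only. The cell means are invented for illustration and do not come from any real experiment, and the function name is an assumption.

```python
def interaction_contrast(cell_means, group_a, group_b,
                         control="control", treated="treated"):
    """Return (treatment effect in group_a) - (treatment effect in group_b).

    `cell_means` maps (group, treatment) to the mean response of that cell.
    A value near zero suggests the two groups respond similarly.
    """
    effect_a = cell_means[(group_a, treated)] - cell_means[(group_a, control)]
    effect_b = cell_means[(group_b, treated)] - cell_means[(group_b, control)]
    return effect_a - effect_b


# HYPOTHETICAL cell means, chosen only to show the calculation.
means = {
    ("male", "control"): 10.0,
    ("male", "treated"): 8.0,
    ("female", "control"): 10.0,
    ("female", "treated"): 9.5,
}

contrast = interaction_contrast(means, "male", "female")
print("difference of differences:", contrast)
```

With these made-up numbers the treatment lowers the response by 2.0 in males but only 0.5 in females, so the contrast is -1.5. Whether such a difference is larger than expected by chance still has to be judged by the analysis of variance.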

A 3x3 factorial design (2 factors each at 3 levels) is shown below. This might be, for example, a “Drug treatment” with levels Control, Low and High dose (columns) and “Diet” with three levels of a food additive, represented by the three colours.








A 3x3x2 factorial is shown on the right.

Here the three factors are “Dose” at three levels, “Diet” at three levels and “Strain” (stripes versus solid) at two levels. So it is a 3x3x2 factorial design.

In this case there are 36 experimental units (animals) and 18 treatment groups so using the Resource Equation method of determining sample size, E=36-18 =18. As E is between 10 and 20 it is probably an appropriate number of experimental units.

Note that with factorial designs the concept of “group size” needs to be reconsidered. In this case each treatment and diet mean will be based on 12 subjects, averaged across the other factors. Strain means will be based on 18 animals, averaged across both diets and treatments. So although there are subgroups consisting of just two animals, the means are based on much larger numbers.
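The group sizes quoted above can be checked by building the 3x3x2 layout and counting animals per level of each factor. This is a sketch only; the level labels are placeholders chosen for illustration.

```python
from collections import Counter
from itertools import product

# Placeholder level names for the three factors in the 3x3x2 design.
doses = ["control", "low", "high"]
diets = ["diet 1", "diet 2", "diet 3"]
strains = ["striped", "solid"]
animals_per_subgroup = 2

# One (dose, diet, strain) tuple per animal.
animals = [
    combo
    for combo in product(doses, diets, strains)
    for _ in range(animals_per_subgroup)
]

dose_counts = Counter(dose for dose, _, _ in animals)
diet_counts = Counter(diet for _, diet, _ in animals)
strain_counts = Counter(strain for _, _, strain in animals)

print("total animals:", len(animals))
print("per dose level:", dict(dose_counts))
print("per diet level:", dict(diet_counts))
print("per strain:", dict(strain_counts))
```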

A real example.



In this study mice of two strains (BALB/c and C57BL) were dosed with a vehicle or with chloramphenicol at 2000 mg/kg. This is a 2 (strains) x 2 (dose levels) factorial design. We want to know:

  • does treatment have an effect on RBC counts?
  • do strains differ in RBC counts?
  • do strains differ in their response to chloramphenicol (the interaction)?

The treatment appears to reduce red blood cell (RBC) counts. There is no overlap between treated and control individuals. Also, C57BL seems to have lower counts than BALB/c. Whether or not there is an interaction can best be seen graphically.

This plot shows that BALB/c (triangles) mice have higher red blood cell counts than C57BL (circles), both in the controls and in the treated group, and that the reduction due to chloramphenicol was the same in both strains. So there is no interaction between strain and chloramphenicol in this case. This should, of course, be confirmed by a two-way analysis of variance with interaction, as described in section 11.


In contrast, here are the results with two different strains (C3H and outbred CD-1). Chloramphenicol seems to reduce red blood cell counts and CD-1 seems to have higher counts than C3H. However, plotting the means (below) also shows that there is an interaction.


Strain C3H (triangles) responded to chloramphenicol by a reduction in red blood cell counts, but in CD-1 (circles) there was no response.

The data should be analysed by a two-way ANOVA with interaction to see whether the interaction effect is statistically significant, as shown in section 11.
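For readers who want to see what the two-way ANOVA with interaction computes, here is a minimal pure-Python sketch for a balanced design (equal numbers in every cell). It is not the analysis from section 11 and it does not calculate p-values, which would need an F distribution from a statistics package. The data are invented for illustration only and are not the chloramphenicol results; the function name and labels are assumptions.

```python
from statistics import mean


def two_way_anova(data):
    """Balanced two-way ANOVA with interaction.

    `data` maps (level_a, level_b) to a list of observations, with the
    same number of observations (at least 2) in every cell.
    Returns {source: (sum of squares, degrees of freedom, F ratio)};
    the "Error" entry has None in place of an F ratio.
    """
    a_levels = sorted({a for a, _ in data})
    b_levels = sorted({b for _, b in data})
    n = len(next(iter(data.values())))  # replicates per cell
    if n < 2 or any(len(obs) != n for obs in data.values()):
        raise ValueError("needs a balanced design with >= 2 per cell")

    grand = mean(x for obs in data.values() for x in obs)
    cell = {key: mean(obs) for key, obs in data.items()}
    a_mean = {a: mean(cell[(a, b)] for b in b_levels) for a in a_levels}
    b_mean = {b: mean(cell[(a, b)] for a in a_levels) for b in b_levels}

    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    ss_ab = n * sum(
        (cell[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels for b in b_levels
    )
    ss_err = sum((x - cell[key]) ** 2 for key, obs in data.items() for x in obs)

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_ab = df_a * df_b
    df_err = len(a_levels) * len(b_levels) * (n - 1)
    ms_err = ss_err / df_err

    return {
        "A": (ss_a, df_a, (ss_a / df_a) / ms_err),
        "B": (ss_b, df_b, (ss_b / df_b) / ms_err),
        "AxB": (ss_ab, df_ab, (ss_ab / df_ab) / ms_err),
        "Error": (ss_err, df_err, None),
    }


# HYPOTHETICAL data: factor A = strain, factor B = treatment.
example = {
    ("strain 1", "control"): [9.0, 10.0, 11.0, 10.0],
    ("strain 1", "treated"): [7.0, 8.0, 9.0, 8.0],
    ("strain 2", "control"): [9.0, 10.0, 11.0, 10.0],
    ("strain 2", "treated"): [9.0, 10.0, 11.0, 10.0],
}

table = two_way_anova(example)
for source, (ss, df, f_ratio) in table.items():
    print(source, "SS =", round(ss, 3), "df =", df, "F =", f_ratio)
```

In this made-up example only "strain 1" responds to treatment, so the AxB (interaction) line carries a sum of squares as large as the main effects. The F ratio for the interaction is the quantity that would be compared with the F distribution to judge statistical significance.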



Implications of strain x treatment interactions

Strain by treatment interactions are almost universal. This means that results based on a single strain (or outbred stock) cannot necessarily be generalised.

It is often highly desirable to replicate over several strains using a factorial experimental design, particularly in toxicity testing where the aim is to “prove a negative”. A good example is the response to bisphenol A (BPA), which is an endocrine disruptor in most strains and stocks of mice and rats, causing a range of developmental and other defects when administered at doses below the “safe” human exposure level. However, in none of 13 studies were any effects observed when the CD:SD stock of rats was used (Richter CA, Birnbaum LS, Farabollini F et al. In vivo effects of bisphenol A in laboratory rodent studies. Reprod Toxicol 2007;24:199-224).

Split plot designs

These are randomised block designs with a factorial treatment structure in which a main effect is confounded with blocks. They are worth knowing about because in some situations they may make efficient use of resources.

Suppose the aim is to compare two or more treatments using a randomised block design. For example, the experiment on the right has two animals in a cage, each receiving a different treatment. They could be two genotypes. Or it could be a within-animal experiment where the same animal is given two treatments sequentially or as a topical application to the skin on the left or right side.

Suppose it was decided that 12 cages would be sufficient (by the resource equation E=24-2=22, which is acceptable). Outcome measurements would be done on the animals, one cage at a time. This could be analysed as a randomised block design.

But suppose half the cages had males and half females? In that case an estimate of sex differences for the outcome of interest could be obtained by averaging across the two animals within a cage, and a sex x treatment interaction could also be estimated. This would indicate whether the two sexes responded similarly to the treatments.

Given the need to use both sexes and the need to group house rodents, this might be quite a useful design in some cases.
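A layout of this kind could be generated along the following lines. This is a sketch under stated assumptions: the function name, the treatment labels "A" and "B" and the use of a fixed random seed are all illustrative choices, not a prescribed procedure.

```python
import random


def split_plot_layout(n_cages, treatments, sexes, seed=None):
    """Return one record per animal for a cage-based split-plot layout.

    Sex is applied to whole cages (equal numbers of cages per sex,
    in random order); treatments are randomised to animals within
    each cage, one animal per treatment.
    """
    if n_cages % len(sexes) != 0:
        raise ValueError("number of cages must divide equally among sexes")
    rng = random.Random(seed)

    # Whole-cage factor: equal numbers of cages per sex, shuffled.
    cage_sexes = [sexes[i % len(sexes)] for i in range(n_cages)]
    rng.shuffle(cage_sexes)

    layout = []
    for cage, sex in enumerate(cage_sexes, start=1):
        order = list(treatments)
        rng.shuffle(order)  # within-cage factor: random order in each cage
        for animal, treatment in enumerate(order, start=1):
            layout.append(
                {"cage": cage, "sex": sex, "animal": animal, "treatment": treatment}
            )
    return layout


# 12 cages of two animals, as in the example above; labels are placeholders.
layout = split_plot_layout(12, ["A", "B"], ["male", "female"], seed=1)
for record in layout[:4]:
    print(record)
```

Each cage acts as a block for the treatment comparison, while the sex comparison is made between cages, which is why the two comparisons have different experimental units.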

A “split-plot” design has two different experimental units in one experiment: in this case, the animal (for comparing the treatments) and the cage (for comparing the sexes). Technically, it is a factorial design with a main effect confounded with blocks. The statistical analysis will be discussed in the statistical analysis pages.

