You can use multiple **contrast** statements in a **proc glm** call to conduct tests of simple main
effects. This is particularly useful when exploring the interaction of
three categorical variables in ANOVA. If you are not familiar with three-way interactions in ANOVA, please see our
general FAQ on
understanding three-way interactions in ANOVA. In short, a three-way
interaction means that
there is a two-way interaction that varies across levels of a third variable. Say, for
example, that a b*c interaction differs across various levels of factor **a**.

One way of analyzing the three-way interaction is through the use of tests of simple main-effects, i.e., tests of the effect of one variable (or set of variables) at each level of another variable.

We will use a small artificial dataset called threeway that has a statistically significant three-way interaction
to illustrate the process. In our example data set, variables **a**, **b** and
**c**
are categorical. The techniques shown on this page can be generalized to
situations in which one or more variables are continuous, but the more
continuous variables that are involved in the interaction, the more complicated
things get.

The results (shown below) indicate that the b*c interaction is statistically
significant at a=1 but not at a=2. Because of this, the third and fourth
**contrast** statements are needed; these show the effect of **c** at a=1 at each
level of **b**.

After we look at the results, we will look at the coding used.

    proc glm data = threeway;
      class a b c;
      model y = a b c a*b a*c b*c a*b*c;
      contrast 'b*c at a=1'
               b*c   1 0 -1 -1 0 1
               a*b*c 1 0 -1 -1 0 1 0 0 0 0 0 0,
               b*c   0 1 -1 0 -1 1
               a*b*c 0 1 -1 0 -1 1 0 0 0 0 0 0;
      contrast 'b*c at a=2'
               b*c   1 0 -1 -1 0 1
               a*b*c 0 0 0 0 0 0 1 0 -1 -1 0 1,
               b*c   0 1 -1 0 -1 1
               a*b*c 0 0 0 0 0 0 0 1 -1 0 -1 1;
      contrast 'c at a=1 & b=1'
               c     1 0 -1
               a*c   1 0 -1 0 0 0
               b*c   1 0 -1 0 0 0
               a*b*c 1 0 -1 0 0 0 0 0 0 0 0 0,
               c     0 1 -1
               a*c   0 1 -1 0 0 0
               b*c   0 1 -1 0 0 0
               a*b*c 0 1 -1 0 0 0 0 0 0 0 0 0;
      contrast 'c at a=1 & b=2'
               c     1 0 -1
               a*c   1 0 -1 0 0 0
               b*c   0 0 0 1 0 -1
               a*b*c 0 0 0 1 0 -1 0 0 0 0 0 0,
               c     0 1 -1
               a*c   0 1 -1 0 0 0
               b*c   0 0 0 0 1 -1
               a*b*c 0 0 0 0 1 -1 0 0 0 0 0 0;
    run;
    quit;

    The GLM Procedure

    Class Level Information

    Class    Levels    Values
    A             2    1 2
    B             2    1 2
    C             3    1 2 3

    Number of Observations Read    24
    Number of Observations Used    24

    Dependent Variable: Y

                                   Sum of
    Source             DF         Squares     Mean Square    F Value    Pr > F
    Model              11     497.8333333      45.2575758      33.94    <.0001
    Error              12      16.0000000       1.3333333
    Corrected Total    23     513.8333333

    R-Square    Coeff Var    Root MSE      Y Mean
    0.968861     7.655473    1.154701    15.08333

    Source             DF       Type I SS     Mean Square    F Value    Pr > F
    A                   1     150.0000000     150.0000000     112.50    <.0001
    B                   1       0.6666667       0.6666667       0.50    0.4930
    C                   2     127.5833333      63.7916667      47.84    <.0001
    A*B                 1     160.1666667     160.1666667     120.12    <.0001
    A*C                 2      18.2500000       9.1250000       6.84    0.0104
    B*C                 2      22.5833333      11.2916667       8.47    0.0051
    A*B*C               2      18.5833333       9.2916667       6.97    0.0098

    Source             DF     Type III SS     Mean Square    F Value    Pr > F
    A                   1     150.0000000     150.0000000     112.50    <.0001
    B                   1       0.6666667       0.6666667       0.50    0.4930
    C                   2     127.5833333      63.7916667      47.84    <.0001
    A*B                 1     160.1666667     160.1666667     120.12    <.0001
    A*C                 2      18.2500000       9.1250000       6.84    0.0104
    B*C                 2      22.5833333      11.2916667       8.47    0.0051
    A*B*C               2      18.5833333       9.2916667       6.97    0.0098

    Contrast           DF     Contrast SS     Mean Square    F Value    Pr > F
    b*c at a=1          2     40.66666667     20.33333333      15.25    0.0005
    b*c at a=2          2      0.50000000      0.25000000       0.19    0.8314
    c at a=1 & b=1      2     64.00000000     32.00000000      24.00    <.0001
    c at a=1 & b=2      2      1.33333333      0.66666667       0.50    0.6186

In the first **contrast** statement, we are interested in the b*c
interaction at a=1. The b*c interaction has 2 degrees of freedom
((2-1)*(3-1) = 2), so the contrast needs two rows, which we separate with a
comma. Also, because we have included the two-way interaction, we
need to include the three-way interaction. In the second **contrast**
statement, we are looking at the b*c interaction at a=2. Realistically,
we wouldn’t know to include the third and fourth **contrast** statements until we
had run the first two and seen the results. To save space, we have
included these two **contrast** statements, which investigate **c** at a=1 at each
level of **b**, in the same run.

Let’s look a little closer at the coding of the variables on the **contrast**
statements. First, we need to
remember that the variable **a** has two levels, **b** has two levels, and
**c** has three
levels. The coding (which is effect coding) is for each cell produced
by the crossing of the categorical predictor variables. This is perhaps
best understood as the "differences of differences" approach. (For more
information, please see
*Multiple Regression: Testing and Interpreting Interactions* by Leona S. Aiken and
Stephen G. West.)
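Because each coefficient must line up with the right cell, it can help to have SAS print the coefficient vector it actually used. The sketch below adds the **e** option to the first **contrast** statement; it assumes the same threeway dataset and model as above, and that the cells are ordered by the **class** statement with the levels of the last variable changing fastest (which is what the coefficients on this page rely on).

```sas
proc glm data = threeway;
  class a b c;
  model y = a b c a*b a*c b*c a*b*c;
  /* The E option asks proc glm to display the full L vector, so each
     coefficient can be checked against the parameter it multiplies */
  contrast 'b*c at a=1'
           b*c   1 0 -1 -1 0 1
           a*b*c 1 0 -1 -1 0 1 0 0 0 0 0 0,
           b*c   0 1 -1 0 -1 1
           a*b*c 0 1 -1 0 -1 1 0 0 0 0 0 0 / e;
run;
quit;
```

If a coefficient shows up next to a cell you did not intend, the order of the variables on the **class** statement or the sort order of the levels is the first thing to check.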

## The first contrast statement

Let’s take the first line of the first **contrast** statement as an
example. We have the b*c interaction at a=1, and we are comparing c1 to
c3. In other words, c3 is our reference group. Picking c3 as our
reference group is somewhat arbitrary; we could have used c1 or c2. The
"differences of differences" approach means that we are going to take the
difference of c1 and c3 at b=1, and the difference of c1 and c3 at b=2, and then
take the difference of those two differences. In the table below, we have
six cells (because 2 levels of **b** times 3 levels of **c** equals
6). We have called the cells m_{subscript}, so that we can do
some symbolic math.

| a=1 | c1 | c2 | c3 |
| --- | --- | --- | --- |
| b=1 | m_{11} | m_{12} | m_{13} |
| b=2 | m_{21} | m_{22} | m_{23} |

(m_{11} – m_{13}) – (m_{21} – m_{23})

(1 0 -1) – (1 0 -1) = 1 0 -1 -1 0 1

Notice that 1 0 -1 -1 0 1 are the first six entries in the first line of the first
**contrast** statement.

Now let’s look at the second part, the a*b*c interaction. The first six numbers are for a=1, and the second six are for a=2. Because we are only looking at a=1 in this analysis, all of the values for a=2 are 0. The values for a=1 are the same as those for the b*c interaction.
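To see why the a*b*c coefficients have to accompany the b*c coefficients, it may help to write each cell mean in terms of the model parameters. The sketch below assumes the usual overparameterized ANOVA model with one parameter per level or cell. The symbols $\mu_{ijk}$, $\alpha$, $\beta$, $\gamma$ and the subscripts $i$, $j$, $k$ (for **a**, **b**, **c**) are notation introduced here for illustration; with **a** fixed at 1, $\mu_{1jk}$ plays the role of m_{jk} in the table above.

$$
\begin{aligned}
\mu_{ijk} &= \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} \\[4pt]
(\mu_{111} - \mu_{113}) - (\mu_{121} - \mu_{123})
 &= \left[(\beta\gamma)_{11} - (\beta\gamma)_{13} - (\beta\gamma)_{21} + (\beta\gamma)_{23}\right] \\
 &\quad + \left[(\alpha\beta\gamma)_{111} - (\alpha\beta\gamma)_{113} - (\alpha\beta\gamma)_{121} + (\alpha\beta\gamma)_{123}\right]
\end{aligned}
$$

All main-effect, a*b and a*c terms cancel. What remains is the pattern 1 0 -1 -1 0 1 on the six b*c parameters and the same pattern on the a=1 block of the twelve a*b*c parameters, with 0s on the a=2 block.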

Here is another way of thinking about the first line of the first **contrast**
statement, taking its coefficients three at a time:

    contrast 'b*c at a=1' b*c 1 0 -1 -1 0 1 a*b*c 1 0 -1 -1 0 1 0 0 0 0 0 0,

b*c, values 1-3 (1 0 -1): b=1, comparing c1 with c3

b*c, values 4-6 (-1 0 1): b=2, comparing c1 with c3

a*b*c, values 1-3 (1 0 -1): a=1 and b=1, comparing c1 with c3

a*b*c, values 4-6 (-1 0 1): a=1 and b=2, comparing c1 with c3

a*b*c, values 7-9 (0 0 0): a=2 and b=1, these are all 0s because we are looking only at a=1

a*b*c, values 10-12 (0 0 0): a=2 and b=2, these are all 0s because we are looking only at a=1
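A single row of a **contrast** statement can also be given to an **estimate** statement, which reports the value of the difference of differences itself along with its standard error and t-test. The following is a minimal sketch, assuming the same threeway dataset and model; the label text is arbitrary.

```sas
proc glm data = threeway;
  class a b c;
  model y = a b c a*b a*c b*c a*b*c;
  /* Same coefficients as the first row of the first contrast statement:
     (c1 - c3 at b=1) minus (c1 - c3 at b=2), within a=1 */
  estimate 'c1 vs c3, b=1 vs b=2, at a=1'
           b*c   1 0 -1 -1 0 1
           a*b*c 1 0 -1 -1 0 1 0 0 0 0 0 0;
run;
quit;
```

Unlike **contrast**, **estimate** takes only one row at a time, so it yields a 1 degree of freedom test rather than the joint 2 degree of freedom test of the b*c interaction at a=1.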

The second line of the first **contrast** statement is very similar to the
first, except that it is for c2 versus c3. So, we have

(m_{12} – m_{13}) – (m_{22} – m_{23})

(0 1 -1) – (0 1 -1) = 0 1 -1 0 -1 1

    contrast 'b*c at a=1' b*c 1 0 -1 -1 0 1 a*b*c 1 0 -1 -1 0 1 0 0 0 0 0 0,
                          b*c 0 1 -1 0 -1 1 a*b*c 0 1 -1 0 -1 1 0 0 0 0 0 0;

For the second line, again taking the coefficients three at a time:

b*c, values 1-3 (0 1 -1): b=1, comparing c2 with c3

b*c, values 4-6 (0 -1 1): b=2, comparing c2 with c3

a*b*c, values 1-3 (0 1 -1): a=1 and b=1, comparing c2 with c3

a*b*c, values 4-6 (0 -1 1): a=1 and b=2, comparing c2 with c3

a*b*c, values 7-9 (0 0 0): a=2 and b=1, these are all 0s because we are looking only at a=1

a*b*c, values 10-12 (0 0 0): a=2 and b=2, these are all 0s because we are looking only at a=1

## The second **contrast** statement

The second **contrast** statement looks at the b*c interaction at a=2.
It is the same as the first, except in the
part for the a*b*c interaction. Here, the first six 0s are for a=1, which
we are not considering in this **contrast** statement. The same coding
used in the first **contrast** statement is simply shifted to the a=2 part of
the code.

## The third **contrast** statement

    contrast 'c at a=1 & b=1' c 1 0 -1 a*c 1 0 -1 0 0 0 b*c 1 0 -1 0 0 0 a*b*c 1 0 -1 0 0 0 0 0 0 0 0 0,
                              c 0 1 -1 a*c 0 1 -1 0 0 0 b*c 0 1 -1 0 0 0 a*b*c 0 1 -1 0 0 0 0 0 0 0 0 0;

By now, the coding for **c**, the first part of the **contrast**
statement, should be familiar. In this first line, we are comparing c1
with c3.

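| a=1 | c1 | c2 | c3 |
| --- | --- | --- | --- |
| b=1 | m_{11} | m_{12} | m_{13} |
| b=2 | m_{21} | m_{22} | m_{23} |

Because this is a simple effect of **c** at b=1 rather than an interaction, only the b=1 row of the table enters the comparison, and the b=2 cells receive 0s:

m_{11} – m_{13}

(1 0 -1) (0 0 0) = 1 0 -1 0 0 0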

Taking the coefficients of the first line three at a time:

c (1 0 -1): comparing c1 with c3

a*c, values 1-3 (1 0 -1): a=1, comparing c1 with c3

a*c, values 4-6 (0 0 0): a=2, these are all 0 because we are looking at a=1

b*c, values 1-3 (1 0 -1): b=1, comparing c1 with c3

b*c, values 4-6 (0 0 0): b=2, these are all 0 because we are looking at b=1

a*b*c, values 1-3 (1 0 -1): a=1, b=1, comparing c1 with c3

a*b*c, values 4-6 (0 0 0): a=1, b=2, these are all 0 because we are looking at b=1

a*b*c, values 7-9 (0 0 0): a=2, b=1, these are all 0 because we are looking at a=1

a*b*c, values 10-12 (0 0 0): a=2, b=2, these are all 0 because we are looking at a=1 and b=1
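The pattern of coefficients across all four effects can be checked by writing the comparison in terms of the model parameters. This is a sketch that assumes the usual overparameterized ANOVA model; $\mu_{ijk}$ (the mean of the cell with a=$i$, b=$j$, c=$k$) and the Greek parameter names are notation introduced here, not part of the original example.

$$
\begin{aligned}
\mu_{ijk} &= \mu + \alpha_i + \beta_j + \gamma_k + (\alpha\beta)_{ij} + (\alpha\gamma)_{ik} + (\beta\gamma)_{jk} + (\alpha\beta\gamma)_{ijk} \\[4pt]
\mu_{111} - \mu_{113}
 &= (\gamma_1 - \gamma_3)
  + \left[(\alpha\gamma)_{11} - (\alpha\gamma)_{13}\right]
  + \left[(\beta\gamma)_{11} - (\beta\gamma)_{13}\right]
  + \left[(\alpha\beta\gamma)_{111} - (\alpha\beta\gamma)_{113}\right]
\end{aligned}
$$

Every term that does not involve **c** cancels, so the comparison of c1 with c3 at a=1 and b=1 needs the pattern 1 0 -1 on **c**, on the a=1 block of a*c, on the b=1 block of b*c, and on the a=1, b=1 block of a*b*c, with 0s everywhere else.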

The second line of the third **contrast** statement is very similar to the first
line, except that it compares c2 to c3.

    contrast 'c at a=1 & b=1' c 1 0 -1 a*c 1 0 -1 0 0 0 b*c 1 0 -1 0 0 0 a*b*c 1 0 -1 0 0 0 0 0 0 0 0 0,
                              c 0 1 -1 a*c 0 1 -1 0 0 0 b*c 0 1 -1 0 0 0 a*b*c 0 1 -1 0 0 0 0 0 0 0 0 0;

For the second line, taking the coefficients three at a time:

c (0 1 -1): comparing c2 with c3

a*c, values 1-3 (0 1 -1): a=1, comparing c2 with c3

a*c, values 4-6 (0 0 0): a=2, these are all 0 because we are looking at a=1

b*c, values 1-3 (0 1 -1): b=1, comparing c2 with c3

b*c, values 4-6 (0 0 0): b=2, these are all 0 because we are looking at b=1

a*b*c, values 1-3 (0 1 -1): a=1, b=1, comparing c2 with c3

a*b*c, values 4-6 (0 0 0): a=1, b=2, these are all 0 because we are looking at b=1

a*b*c, values 7-9 (0 0 0): a=2, b=1, these are all 0 because we are looking at a=1

a*b*c, values 10-12 (0 0 0): a=2, b=2, these are all 0 because we are looking at a=1 and b=1

## The fourth **contrast** statement

The fourth **contrast** statement is the same as the third, except we are
now looking at b=2. Hence, we have 0s for the b=1 part of the code and the
comparisons of the different levels of **c** in the b=2 part of the code.

    contrast 'c at a=1 & b=2' c 1 0 -1 a*c 1 0 -1 0 0 0 b*c 0 0 0 1 0 -1 a*b*c 0 0 0 1 0 -1 0 0 0 0 0 0,
                              c 0 1 -1 a*c 0 1 -1 0 0 0 b*c 0 0 0 0 1 -1 a*b*c 0 0 0 0 1 -1 0 0 0 0 0 0;
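As a cross-check on hand-coded coefficients, the simple effects of **c** within each combination of **a** and **b** can also be requested through the **slice** option of the **lsmeans** statement. This is a sketch of an alternative route rather than the method used on this page; it assumes the same threeway dataset and model, and we have not shown its output here.

```sas
proc glm data = threeway;
  class a b c;
  model y = a b c a*b a*c b*c a*b*c;
  /* Slicing the a*b*c cell means by a*b requests a test of the
     differences among the levels of c within each a, b combination */
  lsmeans a*b*c / slice = a*b;
run;
quit;
```

The slices for a=1, b=1 and a=1, b=2 can then be compared with the third and fourth **contrast** statements. Note that slicing by **a** alone would test all six b by c cell means within each level of **a**, which is not the same as the b*c interaction tested by the first two **contrast** statements.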

## Correcting for multiple tests

We should note that although a p-value is given for each of the four F-tests, it is not corrected for the multiple tests. There are at least four different methods of determining the critical value of tests of simple main-effects. There is a method related to Dunn’s multiple comparisons, a method attributed to Marascuilo and Levin, a method called the simultaneous test procedure (very conservative and related to the Scheffé post-hoc test) and a per family error rate method. We will demonstrate the per family error rate method, but you should look up the other methods in a good ANOVA book, such as Kirk (1995), to decide which approach is best for your situation.

Let’s take the first two tests, comparing b*c at a=1 and at a=2 as an
example. The values for the F-tests were 15.25 and .188, respectively.
We divide our alpha level, 0.05, by 2 because we are doing two tests of simple
main-effects, so our new value of alpha is .025. The **finv** function
requires us to provide 1 – alpha, so we have 1 – .025 = .975.

    data _null_;
      x = finv(.975, 2, 12);
      put "The critical value per family error rate is " x;
    run;

    The critical value per family error rate is 5.0958671658

As you can see, the critical value is approximately 5.1. The F-value at a=1 (15.25) exceeds it, while the F-value at a=2 (.188) does not. This indicates that the b*c interaction is statistically significant at a=1 but not at a=2.
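The same comparison can be written so that the number of tests and the degrees of freedom are explicit and each F-value is checked against the critical value. The data step below is a minimal sketch: the variable names are ours, the F-values are typed in from the output and text above, and **probf** is used to obtain the upper-tail p-value for each test.

```sas
data _null_;
  alpha  = 0.05;   /* family-wise alpha */
  ntests = 2;      /* number of simple-effect tests in the family */
  ndf    = 2;      /* numerator df of each b*c contrast */
  ddf    = 12;     /* error df from the ANOVA table */

  /* critical F at the per-test alpha (alpha divided by number of tests) */
  crit = finv(1 - alpha / ntests, ndf, ddf);

  /* F-values for b*c at a=1 and b*c at a=2, copied from the results */
  array fvals{2} _temporary_ (15.25 0.188);

  do i = 1 to dim(fvals);
    f = fvals{i};
    p = 1 - probf(f, ndf, ddf);     /* upper-tail p-value of this F */
    significant = (f > crit);       /* 1 if F exceeds the critical value */
    put f= crit= p= significant=;
  end;
run;
```

Changing **ntests** to 4 would give the critical value for treating all four **contrast** statements as one family, which is a more conservative choice.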

## References

Kirk, Roger E. (1995) *Experimental Design: Procedures for the Behavioral Sciences,
Third Edition*. Monterey, California: Brooks/Cole Publishing.

Aiken, Leona S., and West, Stephen G. (1996) *Multiple Regression:
Testing and Interpreting Interactions*. Thousand Oaks, California:
Sage Publishing.