In a completely randomized design, there is only one primary factor under consideration in the experiment. The test subjects are assigned to treatment levels of the primary factor at random. ANOVA for One-way Layout in our lecture.
Item1 | Item2 | Item3 |
---|---|---|
22 | 52 | 16 |
42 | 33 | 24 |
44 | 8 | 19 |
52 | 47 | 18 |
45 | 43 | 34 |
37 | 32 | 39 |
# Concatenate the data rows into a single vector r.
r <- c(22, 52, 16, 42, 33, 24, 44, 8, 19, 52, 47, 18, 45, 43, 34, 37, 32, 39)
# Assign new variables for the treatment levels and number of observations
f <- c("Item1", "Item2", "Item3")
k <- 3
n <- 6
# Create a vector of treatment factors that corresponds to each element of r
# with the gl function
tm <- gl(k, 1, n * k, factor(f))
tm
## [1] Item1 Item2 Item3 Item1 Item2 Item3 Item1 Item2 Item3 Item1 Item2
## [12] Item3 Item1 Item2 Item3 Item1 Item2 Item3
## Levels: Item1 Item2 Item3
# Make another temporary dataframe, 'df', with data.frame function, renaming the
# response and treatment factors as 'Sales' and 'Item' respectively.
df <- data.frame(Sales = r, Item = tm)
head(df)
## Sales Item
## 1 22 Item1
## 2 52 Item2
## 3 16 Item3
## 4 42 Item1
## 5 33 Item2
## 6 24 Item3
# apply the function aov to a formula that discribes the response 'Sales' by the
# treatment factor 'Item'.
av <- aov(Sales ~ Item, data = df)
# print out the ANOVA table with the summary function.
summary(av)
## Df Sum Sq Mean Sq F value Pr(>F)
## Item 2 745.4 372.7 2.541 0.112
## Residuals 15 2200.2 146.7
In a randomized block design, there is only one primary factor under consideration in the experiment. Similar test subjects are grouped into blocks. Each block is tested against all treatment levels of the primary factor at random order. This is intended to eliminate possible influence by other extraneous factors. ANOVA for Randomized Block Design in our lecture.
Item1 | Item2 | Item3 |
---|---|---|
31 | 27 | 24 |
31 | 28 | 31 |
45 | 29 | 46 |
21 | 18 | 48 |
42 | 36 | 46 |
32 | 17 | 40 |
# Concatenate the data rows into a single vector r2.
r2 <- c(31, 27, 24, 31, 28, 31, 45, 29, 46, 21, 18, 48, 42, 36, 46, 32, 17, 40)
# Assign new variables for the treatment levels and number of control blocks
f <- c("Item1", "Item2", "Item3")
k <- 3
n <- 6
# Create a vector of treatment factors that corresponds to the each element in
# r2 with the gl function
tm <- gl(k, 1, n * k, factor(f))
tm
## [1] Item1 Item2 Item3 Item1 Item2 Item3 Item1 Item2 Item3 Item1 Item2
## [12] Item3 Item1 Item2 Item3 Item1 Item2 Item3
## Levels: Item1 Item2 Item3
# Similarly, create a vector of blocking factors for each element in the response data r2.
blk <- gl(n, k, k*n)
blk
## [1] 1 1 1 2 2 2 3 3 3 4 4 4 5 5 5 6 6 6
## Levels: 1 2 3 4 5 6
# Make another temporary dataframe, 'df', with data.frame function, renaming the
# response, treatment factors and block control as 'Sales', 'Item' and 'Block'
# respectively.
df <- data.frame(Sales = r2, Item = tm, Block = blk)
head(df)
## Sales Item Block
## 1 31 Item1 1
## 2 27 Item2 1
## 3 24 Item3 1
## 4 31 Item1 2
## 5 28 Item2 2
## 6 31 Item3 2
# Apply the function aov to a formula that describes the response 'Sales' by
# both treatment factor 'Item' and the block control 'Block'. Print out the
# ANOVA table with summary function.
av2 <- aov(Sales ~ Item + Block, data = df)
summary(av2)
## Df Sum Sq Mean Sq F value Pr(>F)
## Item 2 538.8 269.39 4.959 0.0319 *
## Block 5 559.8 111.96 2.061 0.1547
## Residuals 10 543.2 54.32
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In a Two-way Layout Design, there are more than one factors under consideration in the experiment. The test subjects are assigned to treatment levels of every factor combinations at random. In this tutorial, we will consider only a special case, where all groups occur and where there is the same number of observations in each group.
Item1 | Item2 | Item3 |
---|---|---|
25 | 39 | 36 |
36 | 42 | 24 |
31 | 39 | 28 |
26 | 35 | 29 |
51 | 43 | 42 |
47 | 39 | 36 |
47 | 53 | 32 |
52 | 46 | 33 |
# Concatenate the data rows into a single vector r3.
r3 <- c(25, 39, 36, 36, 42, 24, 31, 39, 28, 26, 35, 29, 51, 43, 42, 47, 39, 36, 47, 53, 32, 52, 46, 33)
# Assign new variables for the treatment levels and number of observations.
f1 <- c("Item1", "Item2", "Item3")
f2 <- c("East", "West")
k1 <- length(f1)
k2 <- length(f2)
n <- 4
# Create a vector that corresponds to the first treatment level of the response
# data r3 in step 2 element-by-element with the gl function.
tm1 <- gl(k1, 1, n * k1 * k2, factor(f1))
tm1
## [1] Item1 Item2 Item3 Item1 Item2 Item3 Item1 Item2 Item3 Item1 Item2
## [12] Item3 Item1 Item2 Item3 Item1 Item2 Item3 Item1 Item2 Item3 Item1
## [23] Item2 Item3
## Levels: Item1 Item2 Item3
# Similarly, create a vector that corresponds to the second treatment of the
# response data r3 in step 3.
tm2 <- gl(k2, n * k1, n * k1 * k2, factor(f2))
tm2
## [1] East East East East East East East East East East East East West West
## [15] West West West West West West West West West West
## Levels: East West
# Apply the function aov to a formula that describes the response r3 by the two
# treatment factors tm1 and tm2 with interaction. Print out the ANOVA table with
# summary function.
av3 <- aov(r3 ~ tm1 * tm2)
summary(av3)
## Df Sum Sq Mean Sq F value Pr(>F)
## tm1 2 385.1 192.5 9.554 0.00149 **
## tm2 1 715.0 715.0 35.481 1.23e-05 ***
## tm1:tm2 2 234.1 117.0 5.808 0.01132 *
## Residuals 18 362.8 20.2
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Using ‘coagulation’ dataset in ‘faraway’ packages, answer the following questions
library(faraway)
data(coagulation)
# a
ANOVA <- aov(coag~diet,data = coagulation)
summary(ANOVA)
## Df Sum Sq Mean Sq F value Pr(>F)
## diet 3 228 76.0 13.57 4.66e-05 ***
## Residuals 20 112 5.6
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# 1-b
df.fl <- summary(ANOVA)[[1]][1,1]
df.R <- summary(ANOVA)[[1]][2,1]
MSE <- summary(ANOVA)[[1]][2,3]
Alpha <- 0.05
t.value <- qt(1 - Alpha/2,df.R,lower.tail = TRUE)
factors <- c('A','B','C','D')
rowID <- c('B-A','C-A','D-A','C-B','D-B','D-C')
colID <- c('diff','lwr','upr')
ttable <- matrix(nrow = 6,ncol = 3,dimnames = list(rowID,colID))
temp = 0
for (i in 1:3)
{
for (j in i:4)
{
if (i != j)
{
temp <- temp + 1
ttable[temp,1] = mean(coagulation$coag[which(coagulation$diet == toString(factors[j]))]) -
mean(coagulation$coag[which(coagulation$diet == toString(factors[i]))])
ttable[temp,2] = ttable[temp,1] -
t.value*sqrt(MSE*(1/length(which(coagulation$diet == toString(factors[j]))) +
1/length(which(coagulation$diet == toString(factors[i])))))
ttable[temp,3] = ttable[temp,1] +
t.value*sqrt(MSE*(1/length(which(coagulation$diet == toString(factors[j]))) +
1/length(which(coagulation$diet == toString(factors[i])))))
}
}
}
# 1-c
TukeyHSD(ANOVA, conf.level = 0.95)
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = coag ~ diet, data = coagulation)
##
## $diet
## diff lwr upr p adj
## B-A 5 0.7245544 9.275446 0.0183283
## C-A 7 2.7245544 11.275446 0.0009577
## D-A 0 -4.0560438 4.056044 1.0000000
## C-B 2 -1.8240748 5.824075 0.4766005
## D-B -5 -8.5770944 -1.422906 0.0044114
## D-C -7 -10.5770944 -3.422906 0.0001268
# 1-d
CI.table <- matrix(nrow = 6,ncol = 1,dimnames = list(rowID,c('Confidence interval')))
CI.table.re <- matrix(nrow = 6,ncol = 1,dimnames = list(rowID,c('Confidence interval')))
for (i in 1:6)
{
CI.table[i,1] <- TukeyHSD(ANOVA,conf.level = 0.95)$diet[i,3] -
TukeyHSD(ANOVA,conf.level = 0.95)$diet[i,2]
CI.table.re[i,1] <- 0.75*CI.table[i,1]
}
q.value <- qtukey(1 - Alpha,df.fl,df.R)
samsize.table <- matrix(nrow = 6,ncol = 1 ,dimnames = list(rowID,c('Sample size')))
for (i in 1:6)
{
samsize.table[i,1] <- (64*sqrt(MSE)*q.value^2)/(9*CI.table.re[i,1]^2)
}
Using ‘rats’ dataset in ‘faraway’ packages, answer the following questions
# a
table(rats$poison, rats$treat)
##
## A B C D
## I 4 4 4 4
## II 4 4 4 4
## III 4 4 4 4
ratAnova = aov(rats$time~rats$poison*rats$treat, rats)
summary(ratAnova)
## Df Sum Sq Mean Sq F value Pr(>F)
## rats$poison 2 1.0330 0.5165 23.222 3.33e-07 ***
## rats$treat 3 0.9212 0.3071 13.806 3.78e-06 ***
## rats$poison:rats$treat 6 0.2501 0.0417 1.874 0.112
## Residuals 36 0.8007 0.0222
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# b
TukeyHSD(ratAnova, which = "rats$poison")
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = rats$time ~ rats$poison * rats$treat, data = rats)
##
## $`rats$poison`
## diff lwr upr p adj
## II-I -0.073125 -0.2020091 0.05575913 0.3583151
## III-I -0.341250 -0.4701341 -0.21236587 0.0000005
## III-II -0.268125 -0.3970091 -0.13924087 0.0000339
TukeyHSD(ratAnova, which = "rats$treat")
## Tukey multiple comparisons of means
## 95% family-wise confidence level
##
## Fit: aov(formula = rats$time ~ rats$poison * rats$treat, data = rats)
##
## $`rats$treat`
## diff lwr upr p adj
## B-A 0.36250000 0.19852116 0.52647884 0.0000047
## C-A 0.07833333 -0.08564550 0.24231217 0.5772283
## D-A 0.22000000 0.05602116 0.38397884 0.0048556
## C-B -0.28416667 -0.44814550 -0.12018783 0.0002333
## D-B -0.14250000 -0.30647884 0.02147884 0.1077087
## D-C 0.14166667 -0.02231217 0.30564550 0.1107678
# c
interaction.plot(rats$poison,rats$treat,rats$time, col = c("green", "red", "yellow", "cyan"))
interaction.plot(rats$treat,rats$poison,rats$time)