
5.1) A Simple Linear Model


For statistical modelling, R can be extremely simple. Within the framework of general linear models, a single function handles ANOVA, ANCOVA and linear regression, and it has a very short name: lm(), for "linear model". Its first argument is a formula, a class of object that names a response variable on the left-hand side of a tilde (~) and one or more predictors on the right-hand side.
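To make the idea of a formula concrete, here is a minimal sketch showing that a formula is an ordinary R object that can be stored and inspected before any model is fitted. The names y, x, x1 and x2 are placeholders chosen for illustration; they are not part of the example data used in this section.

```r
# A formula is just an object; creating it does not evaluate y or x,
# so these placeholder variables do not need to exist yet.
f <- y ~ x
class(f)       # "formula"
all.vars(f)    # "y" "x": response first, then the predictor

# Additional predictors are joined on the right-hand side with +
f2 <- y ~ x1 + x2
all.vars(f2)   # "y" "x1" "x2"
```

Because the formula only records variable names, the same formula can be reused with different data sets later on.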

Let’s try the example given in the lm() help file.

ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)

The model itself is fitted in the last line, and it has the simplest possible formula: basically Y ~ X. The preceding lines build the data: weight (the response) is created by concatenating the ctl and trt vectors, while group (the predictor) is generated with gl(), which here produces a factor with 2 levels, "Ctl" and "Trt", each repeated 10 times for a total length of 20.
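As a quick check on what the code has built, the following sketch repeats the help-file example and then inspects the factor and the fitted model. It assumes R's default treatment contrasts, under which the intercept is the mean of the first level ("Ctl") and the groupTrt coefficient is the difference between the "Trt" and "Ctl" means.

```r
# Data and model, repeated so that this block runs on its own
ctl <- c(4.17,5.58,5.18,6.11,4.50,4.61,5.17,4.53,5.33,5.14)
trt <- c(4.81,4.17,4.41,3.59,5.87,3.83,6.03,4.89,4.32,4.69)
group <- gl(2, 10, 20, labels = c("Ctl","Trt"))
weight <- c(ctl, trt)
lm.D9 <- lm(weight ~ group)

# Inspect the predictor: two levels, ten observations each
levels(group)
table(group)

# Inspect the fit (assuming default treatment contrasts):
# (Intercept) is mean(ctl); groupTrt is mean(trt) - mean(ctl)
coef(lm.D9)
c(mean(ctl), mean(trt) - mean(ctl))

# Standard summaries of the fitted model
anova(lm.D9)     # ANOVA table for the group effect
summary(lm.D9)   # coefficients, standard errors, R-squared
```

Comparing the output of coef() with the group means computed by hand is a useful way to see that, with a single two-level factor, the linear model is simply comparing two means.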