5.4) Fitting A Linear Model (cont'd)


The plot() function applied to an lm object produces four diagnostic plots. Three of them carry a red smoothed trend line; the Q-Q plot has a straight reference line instead. Taken together, these provide a good assessment of the main numerical assumptions of general linear models.
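As a minimal sketch of how these plots are produced, the snippet below fits a simple model and draws all four diagnostics in one figure. The built-in `cars` dataset and the model `dist ~ speed` are assumptions chosen purely for illustration; substitute your own data and formula.

```r
# Illustrative model only: 'cars' ships with base R (columns: speed, dist).
fit <- lm(dist ~ speed, data = cars)

# Show the four diagnostic plots in a 2 x 2 grid, then restore the
# previous graphics settings so later plots are unaffected.
old_par <- par(mfrow = c(2, 2))
plot(fit)
par(old_par)
```

Without the `par(mfrow = ...)` call, an interactive R session shows the plots one at a time and asks you to press Return between them.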

  1. First, there’s a plot of residuals against predicted values. This can indicate non-linearity in the relationship between the response and the predictors; if the red line is continuously curved, it should be possible to find a transformation of one or more predictors to linearise this.
  2. The second graph shows a normal Q-Q plot, to indicate how well the standardised residuals follow a Gaussian distribution. If the points stray systematically from the straight line in this plot, the response variable may need to be transformed.
  3. Then there’s a plot of the square root of the absolute standardised residuals against the predicted values (the Scale-Location plot), to help assess homoskedasticity. If the red line is roughly horizontal and the points are spread evenly around it, there’s no problem here; otherwise, the response may need transforming.
  4. The final plot shows standardised residuals against leverage, a measure of how far each observation lies from the average value of the predictor variables. If curving dotted lines labeled "Cook’s distance" appear in the figure, check which data points lie outside of these and whether you trust them to have the high influence they have.
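The quantities behind the four plots can also be extracted directly, which helps when you want to identify specific observations rather than read them off a figure. This is a hedged sketch using standard functions from the stats package; the `cars` dataset, the model, and the variable names are illustrative assumptions, not part of the original text.

```r
# Illustrative model only: 'cars' ships with base R (columns: speed, dist).
fit <- lm(dist ~ speed, data = cars)

fitted_vals <- fitted(fit)            # x-axis of plots 1 and 3
raw_resid   <- residuals(fit)         # y-axis of plot 1
std_resid   <- rstandard(fit)         # standardised residuals (plots 2, 3, 4)
scale_loc   <- sqrt(abs(std_resid))   # y-axis of plot 3 (Scale-Location)
leverage    <- hatvalues(fit)         # x-axis of plot 4
cooks_d     <- cooks.distance(fit)    # influence measure behind the contours

# List the three observations with the largest Cook's distance,
# alongside their leverage and standardised residual.
top <- order(cooks_d, decreasing = TRUE)[1:3]
print(data.frame(row       = top,
                 cooks_d   = cooks_d[top],
                 leverage  = leverage[top],
                 std_resid = std_resid[top]))
```

Ranking by Cook's distance avoids committing to any particular cutoff; whether a flagged point is a problem remains a judgment about the data, as described in item 4.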