Pre-requisites


This section contains background information to help prepare you for learning R. Just skip to any of the sections below that are relevant to you.

If you’ve used another statistical analysis package…

If you’re familiar with running another statistics package using menus, then doing stats in R will feel quite different.

  • You won’t normally interact with your data through a table on the screen. Rather, you’ll save your spreadsheet as a text file to read into R, and normally refer to your data using column names. You’ll develop new ways of imagining your data!
  • You must know what kind of analysis you want to do. There are no problem-oriented menus or prompts.
  • You must know what outputs you require. Many of R’s statistical functions save a large amount of output and wait for you to extract what you want to know.
  • You must know how to check the validity of the statistical methods you use. Most R statistical functions make it easy to extract the common diagnostics, but you still need to know what to ask for.
  • There’s no “Undo” command! For this reason, we run everything from a script file, so that a mistake can be corrected by editing the script and running it again. Likewise, we can’t simply click on a graph or other output to edit it; instead, we change the code and re-run it.
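
As a rough illustration of the points above, the sketch below reads a text file, refers to columns by name, fits a linear model, and then extracts specific outputs and a common diagnostic. The data and the column names height and weight are made up for this example; they are written to a temporary file first so that the code runs on its own.

```r
# Hypothetical data, written to a temporary text file so the example is self-contained
tmp <- tempfile(fileext = ".csv")
writeLines(c("height,weight",
             "150,52", "160,58", "170,69", "180,77", "190,90"), tmp)

# Read the text file into a data frame
dat <- read.csv(tmp)

# Refer to the data by column name rather than by pointing at a table
mean(dat$height)

# Fit a linear model; the result is stored as an object and nothing is printed
fit <- lm(weight ~ height, data = dat)

# Extract only the outputs you need
coef(fit)                # intercept and slope
summary(fit)$r.squared   # proportion of variance explained

# Ask for diagnostics explicitly, e.g. the residuals
resid(fit)
```

Because all of this lives in a script, changing the analysis means editing a line and running the script again.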

If that all sounds overwhelming, fear not! This guide takes you through the basic steps to perform linear model analyses, and it also shows how to find help on any statistical method you may ever need.
If you happen to have used the language S or the program S-PLUS, then R should feel strangely familiar, because it was developed from S and remains very similar.

There are interface programs you can use to help organize your R sessions and check your code (keeping track of brackets, etc.); they also provide extra GUI features such as additional menus and windows. Many of these are free to download from the Web; R Commander, RStudio and Tinn-R are recommended (available from http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/, http://www.rstudio.com/ide/ and http://www.sciviews.org/Tinn-R/ respectively). But these notes assume you’re just using R, pure and simple.

If you know another programming language…

R is a high-level language, where lots of functions are readily available in the base distribution, and you don’t normally have to make decisions about memory usage.
Here are some other key features:

  • An interpreted language: R code doesn’t need compiling.
  • A language and environment: The GUI for R provides a basic colour-coded console and additional windows for scripts, for graphical outputs and (if necessary) a very basic data editor. The term "environment" is intended to characterize R as "a planned and coherent system, rather than an incremental accretion of specific and inflexible tools, as is frequently the case with other data analysis software".
  • Object oriented: Everything referred to in the R language can be considered as an object with a class that determines its properties. Some functions have different methods for different classes. New classes can easily be created by the user.
  • Capabilities: Vector and matrix computations are built in. Data volumes are limited only by the memory available to R, and add-on packages (e.g. “ff”) provide ways to work with datasets that are too large to hold in memory.
  • Notable syntax: The preferred symbol for assigning a value is an arrow “<-”, not an equals sign. Comments begin with the symbol #.
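
A minimal sketch of some of these features follows. It uses R’s simplest class system (known as S3); the class name reading and the object names are invented for illustration.

```r
# Assignment uses the arrow; everything after # is a comment
x <- c(2.5, 3.1, 4.8)

# Every object has a class
class(x)          # "numeric"
class("hello")    # "character"

# Create a new class simply by setting the class attribute
r <- structure(list(value = 21.4, unit = "degC"), class = "reading")

# Define a print method; print() dispatches to it for objects of class "reading"
print.reading <- function(x, ...) {
  cat("Reading:", x$value, x$unit, "\n")
  invisible(x)
}
print(r)

# Matrix computations are built in
m <- matrix(1:4, nrow = 2)   # filled column by column
m %*% m                      # matrix product
t(m)                         # transpose
```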

From the R homepage: “R… is designed around a true computer language, and it allows users to add additional functionality by defining new functions. Much of the system is itself written in the R language, which makes it easy for users to follow the algorithmic choices made. For computationally-intensive tasks, C, C++ and Fortran code can be linked and called at run time. Advanced users can write C code to manipulate R objects directly.”
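
To make the first two points of that quotation concrete, here is a small sketch. The function name std_error is made up for this example and is not part of R.

```r
# Define a new function: the standard error of the mean
std_error <- function(v) {
  sd(v) / sqrt(length(v))
}
std_error(c(4, 8, 15, 16, 23, 42))

# Typing a function's name without brackets prints its source code,
# so you can read how a function written in R does its work
sd
```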

If you’re using Linux or another UNIX-like operating system…

The rest of this guide assumes you are using Windows, macOS or a similar operating system. However, as a GNU project, R is completely at home on UNIX-like systems. There are a few differences in functionality for data input and output, and some of the functions given to read or write files, in Session 2 below, may need adjusting.
The official R manuals are actually oriented towards UNIX users, so if you need any help just go to http://cran.r-project.org/doc/manuals/r-release/R-intro.html (main help) or http://cran.r-project.org/doc/manuals/r-release/R-data.html (data import).
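
One place where adjustments are likely is in how file locations are written. This is an assumption about what you may run into, not a list of the actual changes needed in Session 2. The sketch below shows a few base R functions that help keep file-handling code portable; the folder and file names are hypothetical.

```r
# Build a path from its parts instead of typing the separators by hand
path <- file.path("data", "measurements.csv")
path

# "~" is expanded to your home directory
path.expand("~")

# Check which kind of system R is running on
.Platform$OS.type   # "unix" or "windows"
```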