Welcome to R and RStudio

Leighton Pritchard

University of Strathclyde

2025-11-04

1. Summary and Setup

Software and files

  • University PC: use RStudio
  • Own laptop: use RStudio

Important

Figure 1: R software download
Figure 2: RStudio software download

Learning Objectives

  • Fundamentals of R and RStudio
  • Fundamentals of programming (in R)
  • Data management with the tidyverse
  • Publication-quality data visualisation with ggplot2
  • Reporting with RMarkdown

2. Introduction to R

What is R?

R is

  • a programming language
  • the software that interprets/runs programs written in the R language

Why use R?

  • free (though commercial support can be bought)
  • widely used
    • sciences, humanities, engineering, statistics, etc.
  • has many excellent specialised packages for data analysis and visualisation
  • international, friendly user community

What is RStudio?

RStudio is an integrated development environment (IDE)

Figure 3: macOS IDE
Figure 4: Windows IDE

Interactive Demo

Please start RStudio on your own machine

Variables

Variables are like named boxes

  • An item (object) of data goes in the box (which is called Name)
  • When we refer to the box (variable) by its name, we usually mean what’s in the box
Figure 5: A variable has a name and contains a value
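
As a minimal sketch of the named-box idea (the variable names here are illustrative, not taken from the demo), assignment with <- puts a value in the box, and using the name afterwards gives back whatever is in it:

```r
# Put the value 172 in a box called height_cm
height_cm <- 172

# Referring to the box by name gives us what is inside it
height_cm / 100

# The result of a calculation can go in a new box; height_cm is unchanged
height_m <- height_cm / 100
height_cm
height_m
```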

Interactive Demo

Naming Variables

Variable names are documentation

Best practices

  • descriptive, but not too long
  • letters, numbers, underscores, and periods ([a-zA-Z0-9_.])
  • cannot contain whitespace, and cannot start with a number or an underscore (x2 is allowed, 2x is not)
  • case sensitive (Weight is not the same as weight)
  • do not reuse names of built-in functions
  • Consistent style:
    • lower_snake, UPPER_SNAKE, lowerCamelCase, UpperCamelCase
Figure 6: Variable names should reflect variable content
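
The naming rules can be checked in the console. This is a rough sketch: the candidate names are made up, and comparing against make.names() is one convenient way to test validity rather than an official check.

```r
# Names are case sensitive: these are two different variables
Weight <- 70.2
weight <- 65.8
identical(Weight, weight)   # FALSE

# make.names() rewrites a string into a syntactically valid name, so a
# string that comes back unchanged was already a valid name
candidates <- c("x2", "2x", "body mass", "body_mass")
is_valid <- make.names(candidates) == candidates
names(is_valid) <- candidates
is_valid
```

Here x2 and body_mass come back unchanged, while 2x and "body mass" are rewritten because they break the rules above.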

Functions

What are functions?

  • Functions (log(), sin() etc.) ≈ “canned script”
    • automate complicated tasks
    • make code more readable and reusable
  • Some functions are built-in (in base packages, e.g. sqrt(), lm(), plot())
  • Groups of related functions are distributed as packages, which are loaded with library()

Note

  • Functions usually take arguments (input)
  • Functions often return values (output)
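
A short sketch of arguments going in and values coming out, using only functions that ship with R (the variable name root is illustrative):

```r
# Arguments go in, a value comes out
sqrt(16)                      # 4
log(100, base = 10)           # 2
round(3.14159, digits = 2)    # 3.14

# The returned value can be stored in a variable and reused
root <- sqrt(16)
root + 1                      # 5

# library() loads an installed package. The stats package ships with R
# and is normally loaded already, so this call is safe to run anywhere.
library(stats)
median(c(1, 5, 9))            # 5
```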

Getting Help in R

INTERACTIVE DEMO

args(fname)            # arguments for fname
?fname                 # help page for fname
help(fname)            # help page for fname
??fname                # any mention of fname
help.search("text")    # any mention of "text"
vignette("topic")      # long-form guide on a topic (usually a package)
vignette()             # show all available vignettes

Challenge (1min)

What is the value of each variable after running the following R statements?

mass <- 47.5
age <- 122
mass <- mass * 2.3
age <- age - 20

  • mass = 47.5, age = 102
  • mass = 109.25, age = 102
  • mass = 47.5, age = 122
  • mass = 109.25, age = 122

3. Project Management in R

How Projects Tend To Grow

Figure 7: How projects often proceed…

Good Practice

THERE IS NO ONE TRUE WAY (only principles)

  • Use a single working directory per project/analysis
    • easier to move, share, and find files
    • use relative paths to locate files
  • Treat raw data as read-only
    • keep in a separate subfolder (data?)
  • Clean raw data programmatically so that it is ready for analysis
    • keep cleaned/modified data in separate folder (clean_data?)
  • Consider output generated by analysis to be disposable
    • can be regenerated by running analysis/code
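
One way these principles might look in practice is sketched below. The folder names data and clean_data follow the suggestions above; the project name, the results folder, the file names, and the measurements are all made up for illustration. A temporary folder is used so the sketch is safe to run anywhere.

```r
# Hypothetical project root. A temporary folder keeps this sketch safe to run.
project_dir <- file.path(tempdir(), "my_project")
for (subdir in c("data", "clean_data", "results")) {
  dir.create(file.path(project_dir, subdir), recursive = TRUE, showWarnings = FALSE)
}

# Stand-in for opening the project in RStudio, which sets the working
# directory to the project root so that relative paths work
old_wd <- setwd(project_dir)

# Made-up raw data, written once and then only ever read
raw_path <- file.path("data", "measurements.csv")
write.csv(data.frame(id = 1:3, mass = c(47.5, NA, 52.1)), raw_path, row.names = FALSE)

# Clean programmatically: read the raw file, drop incomplete rows,
# and save the result in a separate folder
raw <- read.csv(raw_path)
clean <- raw[complete.cases(raw), ]
write.csv(clean, file.path("clean_data", "measurements_clean.csv"), row.names = FALSE)

# Show the files now in the project, relative to its root
list.files(recursive = TRUE)

setwd(old_wd)  # restore the previous working directory
```

Because the cleaned file is produced by code, it can be deleted and regenerated at any time from the untouched raw file.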

Example Directory Structure

Figure 8: An example working directory structure

Guidelines

  • The project determines the structure; structure is a means to an end
  • Add a README.txt file

Project Management in RStudio

RStudio can help you manage your projects

  • R Project concept - files and subdirectory structure
  • integration with version control (e.g. git)
  • switching between multiple projects within RStudio
  • stores project history

INTERACTIVE DEMO

Let’s create a project in RStudio