  • Minitab can represent both 1D scatterplots and boxplots.
  • Minitab can generate useful summary and diagnostic data for linear regression without using a separate package.
  • Minitab graphical output has relatively few options for customisation.

1 Setup

To work through these examples, you will need the following on your own computer:

  1. The datasets (see this link for a description)
  2. Minitab (you have a licence for this via the university)

Because of licensing restrictions, you must be either on campus or connected to the university VPN to use Minitab.

1.1 Boxplot and 1D scatterplot

Our goal is to show how the measured guinea pig tooth growth varies with each combination of supplement and supplement dosage. There are several ways to approach this, but here we treat both supplement and dosage as categorical variables (factors). We’d like to see the distribution of measured tooth lengths conditioned on these explanatory variables.

What we’re looking for is a visual representation of the variation in the dataset, for each combination of supplement and dosage. A 1D scatterplot is a good way to visualise the raw data, and a boxplot/box-and-whisker plot is a good way to represent summary statistics.

Click on Open \(\rightarrow\) Open Worksheet, select the toothgrowth.csv data file, and click Open. This will open up the ToothGrowth dataset in Minitab.

Figure 1.1: Open the ToothGrowth dataset

Figure 1.2: Open the ToothGrowth dataset

1.1.1 Boxplot

Click on Graphs \(\rightarrow\) Boxplot \(\rightarrow\) Single Y Variable \(\rightarrow\) With Groups.

Figure 1.3: Select graph type

Figure 1.4: Select graph type

Then choose the appropriate \(x\)- (group) and \(y\)-variables. We want to see the \(y\)-variable (dependent variable) len, grouped first by supplement (supp, with levels VC and OJ) and then by dosage (dose), so we select those columns accordingly.

Figure 1.5: Select boxplot groups

This gives us a plot similar to the one we obtained with Excel, showing the same trends with dosage for both supplements, and the same outlier information.

Figure 1.6: Create the boxplot

  • The Minitab plot labels the \(x\)- and \(y\)-axes clearly and correctly.
  • The Minitab plot does not add distracting colour.
  • By default, the Minitab plot does not show the mean for each group.
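
To make concrete what a boxplot summarises for each group, here is a minimal Python sketch. It is not Minitab's implementation: the function name `boxplot_summary`, the example measurements, the "inclusive" quartile convention, and the 1.5 × IQR outlier rule are all assumptions made for illustration, and packages differ in how they compute quartiles, so the numbers need not match Minitab's exactly.

```python
from statistics import quantiles


def boxplot_summary(values):
    """Return the quantities a conventional boxplot draws for one group."""
    data = sorted(values)
    # Quartile conventions differ between packages; "inclusive" is an assumption here.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    # Assumed outlier rule: points more than 1.5 * IQR beyond the box are flagged.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        "whisker_low": inside[0],    # smallest value that is not an outlier
        "whisker_high": inside[-1],  # largest value that is not an outlier
        "outliers": [v for v in data if v < lower_fence or v > upper_fence],
    }


# Hypothetical measurements keyed by (supp, dose); NOT the ToothGrowth data.
groups = {
    ("OJ", 0.5): [4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0, 30.0],
    ("VC", 0.5): [3.0, 4.0, 5.0, 6.0, 7.0],
}
summaries = {key: boxplot_summary(vals) for key, vals in groups.items()}
```

Each entry in `summaries` corresponds to one box: the box spans `q1` to `q3`, the whiskers reach the most extreme values inside the fences, and anything in `outliers` would be drawn as a separate point.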

There are relatively few options for improving the graph's appearance, but we can retitle the plot by double-clicking on it, and there are options for adding a little more explanatory data.

Figure 1.7: Annotate and retitle the boxplot

Figure 1.8: Annotate and retitle the boxplot

1.1.2 1D Scatterplot

Boxplots are good representations of a data summary, showing quartiles and interquartile ranges (IQRs), but a 1D scatterplot is (for small to moderate data sizes) preferable for showing the actual distribution of raw data.

Click on Graphs \(\rightarrow\) Individual Value Plot... \(\rightarrow\) Single Y Variable \(\rightarrow\) With Groups.

Figure 1.9: Select graph type

Figure 1.10: Select graph type

Then choose the appropriate \(x\)- (group) and \(y\)-variables. We want to see the \(y\)-variable (dependent variable) len, grouped first by supplement (supp, with levels VC and OJ) and then by dosage (dose), so we select those columns accordingly.

Figure 1.11: Select scatterplot groups

This gives us a 1D scatterplot showing the same trends with dosage for both supplements, but no outlier information.

Figure 1.12: Create the 1D scatterplot

We can change the title of this graph as we saw before for the boxplot, by double-clicking on it, and typing in a new title.
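
For readers curious how a 1D scatterplot is laid out, the following Python sketch shows one common approach: give every group an integer \(x\) position and add a small random horizontal offset (jitter) so that identical values do not sit on top of each other. The function name `strip_plot_points`, the jitter width, and the example rows are hypothetical, and this is not a description of how Minitab positions its points.

```python
import random


def strip_plot_points(rows, jitter=0.15, seed=0):
    """Turn (group, value) rows into (x, y, group) points for a 1D scatterplot."""
    rows = list(rows)
    rng = random.Random(seed)  # fixed seed so the layout is reproducible
    # One integer x position per group, assigned in sorted group order.
    centre = {g: i for i, g in enumerate(sorted({g for g, _ in rows}))}
    # A small horizontal offset keeps repeated values from overlapping exactly.
    return [(centre[g] + rng.uniform(-jitter, jitter), y, g) for g, y in rows]


# Hypothetical rows of ((supp, dose), len); NOT the ToothGrowth data.
rows = [
    (("OJ", 0.5), 8.2),
    (("OJ", 0.5), 8.2),
    (("VC", 0.5), 5.1),
    (("VC", 1.0), 16.5),
]
points = strip_plot_points(rows)
```

The resulting `points` could be passed to any plotting library; the \(y\) values are the raw data, unchanged, which is what distinguishes this plot from the boxplot's summary.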

1.2 Linear regression

The Prestige dataset described in the introduction notebook represents a set of occupations, one per row (each row is an observation), with variables describing properties of each occupation, such as the percentage of women, the “prestige” of the occupation, and the average number of years in education of a person in that occupation.

Using Minitab, we will model the relationship between prestige and years in education as a straight line. We’d like to overlay this line on the data, together with some statistical information about goodness of fit and the inferred parameters of the model (gradient and intercept).
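
As a rough illustration of what such a fit computes, here is a minimal ordinary least-squares sketch in Python. The function name `simple_regression` and the example numbers are hypothetical (they are not taken from the Prestige dataset), and Minitab reports considerably more than this.

```python
from math import sqrt


def simple_regression(x, y):
    """Ordinary least-squares fit of y = intercept + slope * x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx                    # gradient of the fitted line
    intercept = y_bar - slope * x_bar    # the line passes through (x_bar, y_bar)
    # Residual and total sums of squares give the goodness-of-fit measures.
    sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - y_bar) ** 2 for yi in y)
    s = sqrt(sse / (n - 2))              # residual standard error
    return {
        "slope": slope,
        "intercept": intercept,
        "r_squared": 1 - sse / sst,
        "residual_se": s,
        "slope_se": s / sqrt(sxx),       # standard error of the gradient
    }


# Hypothetical (education, prestige) pairs; NOT the Prestige data.
education = [8.0, 10.0, 12.0, 14.0, 16.0]
prestige = [30.0, 38.0, 51.0, 58.0, 73.0]
fit = simple_regression(education, prestige)
```

Here `fit["slope"]` and `fit["intercept"]` are the inferred parameters, and `fit["r_squared"]` is one measure of goodness of fit.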

Click on Open \(\rightarrow\) Open Worksheet, select the prestige.csv data file, and click Open. This will open up the Prestige dataset in Minitab.

Figure 1.13: Open the Prestige dataset

Figure 1.14: Open the Prestige dataset

Click on Regression \(\rightarrow\) Simple Regression... \(\rightarrow\) Single Y Variable \(\rightarrow\) With Groups, and choose the appropriate variables. We want to regress the response variable prestige against the predictor education.

Figure 1.15: Select regression type

Figure 1.16: Select regression type

We also want the extra statistical information about confidence intervals and diagnostic plots that we saw in Excel. To get it, select the appropriate options in the Options and Graphs tabs, then click OK.

Figure 1.17: Select regression type

Figure 1.18: Select regression type

Minitab, unlike Excel, makes it easy to do multiple regression. This enables simultaneous regression of a response variable onto multiple potential explanatory variables, taking into account the combined influence of the explanatory variables. The topic is beyond the scope of this workshop, but this dataset is well-suited to that kind of analysis.

Clicking OK presents an analysis of variance and other useful information about the regression result. Scrolling down in the top panel will show the regression plot and diagnostic plots. There are few visual elements that can be modified.

Figure 1.19: Select boxplot groups

Figure 1.20: Select boxplot groups

Minitab and Excel present different summaries and diagnostic information.

  • Excel will present confidence intervals for fitted parameters, if asked; Minitab does not (however, these can be calculated from the information provided).
  • Minitab provides both a confidence interval for the fit (95% CI) and a prediction interval for new values (95% PI) on the main regression graph.
  • Minitab provides more diagnostic plots.
  • Minitab indicates potentially problematic datapoints; Excel does not.
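
For reference, the intervals mentioned above follow the standard formulas for simple linear regression, sketched here. The notation is introduced for illustration and is not taken from Minitab's output: \(n\) is the number of observations, \(\hat{\beta}_j\) is a fitted parameter with standard error \(\mathrm{SE}(\hat{\beta}_j)\), \(s\) is the residual standard error, \(S_{xx} = \sum_i (x_i - \bar{x})^2\), and \(t_{n-2,\,0.975}\) is the 97.5% quantile of the \(t\) distribution with \(n-2\) degrees of freedom. This assumes the regression output reports each parameter estimate alongside its standard error.

```latex
% 95% confidence interval for a fitted parameter (gradient or intercept)
\[ \hat{\beta}_j \pm t_{n-2,\,0.975}\,\mathrm{SE}(\hat{\beta}_j) \]

% 95% confidence interval for the fitted line at x_0 (the "95% CI" band)
\[ \hat{y}_0 \pm t_{n-2,\,0.975}\, s \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \]

% 95% interval for a new observation at x_0 (the "95% PI" band)
\[ \hat{y}_0 \pm t_{n-2,\,0.975}\, s \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{S_{xx}}} \]
```

The extra \(1\) under the square root in the last expression accounts for the scatter of an individual new value about the line, which is why the PI band is always wider than the CI band.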