Minitab can represent both 1D scatterplots and boxplots.
Minitab can generate useful summary and diagnostic data for linear regression without using a separate package.
Minitab graphical output has relatively few options for customisation.
To work through these examples, you will need the following on your own computer:
Minitab (you have a licence for this via the university)
You will need to be either on campus, or connected to the university VPN, to use Minitab due to licensing restrictions.
Our goal is to show how the measured guinea pig tooth growth varies by combination of supplement and supplement dosage. We could approach this in any of several ways, but here we want to treat each supplement as a category or factor, and each dosage as a category or factor. We’d like to see the distribution of measured tooth lengths conditioned on these explanatory variables.
What we’re looking for is a visual representation of the variation in the dataset, for each combination of supplement and dosage. A 1D scatterplot is a good way to visualise the raw data, and a boxplot/box-and-whisker plot is a good way to represent summary statistics.
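Outside Minitab, the idea of conditioning tooth length on each combination of supplement and dosage can be sketched in a few lines of Python. This is a minimal illustration, not part of the Minitab workflow: it assumes the CSV file has columns named len, supp and dose (as in the ToothGrowth dataset), and the rows below are made-up toy values, not real measurements.

```python
import csv
import io
from collections import defaultdict


def group_lengths(csv_file):
    """Collect tooth lengths for each (supp, dose) combination.

    Assumes columns named len, supp and dose, as in the ToothGrowth data.
    """
    groups = defaultdict(list)
    for row in csv.DictReader(csv_file):
        # Treat both supplement and dose as categorical labels (factors),
        # so the dose is kept as text rather than converted to a number.
        key = (row["supp"], row["dose"])
        groups[key].append(float(row["len"]))
    return dict(groups)


# Made-up toy rows, used only to show the shape of the result.
toy_csv = io.StringIO(
    "len,supp,dose\n"
    "5.0,VC,0.5\n"
    "7.0,VC,0.5\n"
    "15.0,OJ,1\n"
    "20.0,OJ,2\n"
    "22.0,OJ,2\n"
)
toy_groups = group_lengths(toy_csv)
for (supp, dose), lengths in sorted(toy_groups.items()):
    print(supp, dose, lengths)
```

Each key of the resulting dictionary corresponds to one box (or one column of points) in the plots that follow. To try it on the real file, an open file handle for toothgrowth.csv could be passed in place of the toy data.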
Click on Open \(\rightarrow\) Open Worksheet, select the toothgrowth.csv data file, and click Open. This will open up the ToothGrowth dataset in Minitab.
Click on Graphs \(\rightarrow\) Boxplot \(\rightarrow\) Single Y Variable \(\rightarrow\) With Groups.
Then choose the appropriate \(x\)- (group) and \(y\)-variables. We want to see the \(y\)-variable (dependent variable) len, conditioned on the groups: VC and OJ (supp) split by their dosage dose, and we select accordingly.
This gives us a plot similar to the one we obtained with Excel, showing the same trends with dosage for both supplements, and the same outlier information.
The Minitab plot labels the \(x\)- and \(y\)-axes clearly and correctly.
The Minitab plot does not add distracting colour.
The Minitab plot does not show the mean for each group.
There are relatively few options to improve the graph visually, but we can retitle the plot by double-clicking on it, and there are options for adding a little more explanatory data.
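To make concrete what each box summarises, the following Python sketch computes the usual boxplot quantities for a single group. It rests on assumptions: quartile conventions differ between packages, so the values may not match Minitab's output exactly, and the 1.5 × IQR outlier rule is the common convention rather than something confirmed for Minitab here. The input values are made up for illustration.

```python
import statistics


def box_summary(values):
    """Return the quantities a box-and-whisker plot typically displays."""
    data = sorted(values)
    # Quartiles using Python's default method; other packages may use
    # slightly different interpolation rules.
    q1, median, q3 = statistics.quantiles(data, n=4)
    iqr = q3 - q1
    # Common convention: points beyond 1.5 * IQR from the box are outliers.
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "iqr": iqr,
        # Whiskers extend to the most extreme points inside the fences.
        "whisker_low": min(inside),
        "whisker_high": max(inside),
        "outliers": outliers,
        # The mean is not part of the box itself, but is easy to add.
        "mean": statistics.mean(data),
    }


# Made-up toy values with one obviously extreme point.
toy_lengths = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 30.0]
summary = box_summary(toy_lengths)
print(summary)
```

The mean is included in the returned summary because the plot itself does not display it.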
Boxplots are good representations of a data summary, showing quartiles and interquartile ranges (IQRs), but a 1D scatterplot is (for small to moderate data sizes) preferable for showing the actual distribution of raw data.
Click on Graphs \(\rightarrow\) Individual Value Plot... \(\rightarrow\) Single Y Variable \(\rightarrow\) With Groups.
Then choose the appropriate \(x\)- (group) and \(y\)-variables. We want to see the \(y\)-variable (dependent variable) len, conditioned on the groups: VC and OJ (supp) split by their dosage dose, and we select accordingly.
This gives us a 1D scatterplot showing the same trends with dosage for both supplements, but no outlier information.
We can change the title of this graph as we saw before for the boxplot, by double-clicking on it, and typing in a new title.
The Prestige dataset described in the introduction notebook represents a set of occupations - one per row (observations) - with variables describing properties of each occupation, such as percentage of women, the “prestige” of the occupation, and the average number of years in education of a person in that occupation.
Using Minitab we will model the relationship between prestige and years in education, using a linear relationship. We’d like to overlay a line describing the relationship, with some statistical information about goodness of fit and the inferred parameters of the model (gradient and intercept).
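As a rough picture of what fitting a line involves, here is a minimal ordinary least-squares sketch in Python. It is an illustration of the standard method, not a description of Minitab's internals, and the education and prestige values are made-up toy numbers rather than rows from the Prestige dataset.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = intercept + gradient * x.

    Returns (gradient, intercept, r_squared).
    """
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Sums of squares and cross-products about the means.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    gradient = sxy / sxx
    intercept = y_bar - gradient * x_bar
    # R-squared: the share of variation in y explained by the line.
    ss_res = sum((yi - (intercept + gradient * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return gradient, intercept, r_squared


# Made-up toy values (years of education, prestige score).
toy_education = [8, 10, 12, 14, 16]
toy_prestige = [30, 38, 50, 58, 74]
gradient, intercept, r_squared = fit_line(toy_education, toy_prestige)
print(gradient, intercept, r_squared)
```

The gradient and intercept are the inferred parameters of the model, and R-squared is one common goodness-of-fit measure.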
Click on Open \(\rightarrow\) Open Worksheet, select the prestige.csv data file, and click Open. This will open up the Prestige dataset in Minitab.
Click on Regression \(\rightarrow\) Simple Regression... \(\rightarrow\) Single Y Variable \(\rightarrow\) With Groups, and choose the appropriate variables. We want to regress the response variable prestige against the predictor education.
We also want to get the extra statistical information about confidence intervals and diagnostic plots that we saw in Excel. To do so, select the appropriate options in the Options and Graphs tabs, then click OK.
Minitab, unlike Excel, makes it easy to do multiple regression. This enables simultaneous regression of a response variable onto multiple potential explanatory variables, taking into account the combined influence of the explanatory variables. The topic is beyond the scope of this workshop, but this dataset is well-suited to that kind of analysis.
Clicking OK presents an analysis of variance and other useful information about the regression result. Scrolling down in the top panel will show the regression plot and diagnostic plots. There are few visual elements that can be modified.
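Diagnostic plots are mostly built from residuals, the differences between observed and fitted values. The Python sketch below computes standardized residuals for a simple linear fit and flags points whose absolute value exceeds 2. That threshold is a common rule of thumb and is an assumption here; it is not taken from Minitab's documentation. The data are made-up toy values with one deliberately misplaced point.

```python
import math


def standardized_residuals(x, y):
    """Fit y = a + b*x by least squares and return standardized residuals."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = y_bar - b * x_bar
    residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
    # Residual standard error, with n - 2 degrees of freedom.
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    result = []
    for xi, e in zip(x, residuals):
        # Leverage: how far this x value lies from the mean of x.
        h = 1 / n + (xi - x_bar) ** 2 / sxx
        result.append(e / (s * math.sqrt(1 - h)))
    return result


# Made-up toy data: y = 2x everywhere except the fifth point.
toy_x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
toy_y = [2, 4, 6, 8, 30, 12, 14, 16, 18, 20]
std_res = standardized_residuals(toy_x, toy_y)
# Assumed rule of thumb: flag |standardized residual| > 2.
flagged = [i for i, r in enumerate(std_res) if abs(r) > 2]
print(flagged)
```

Plotting these residuals against the fitted values or the observation order gives the kind of diagnostic plot shown in the regression output.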
Minitab and Excel present different summaries and diagnostic information.
Excel will present confidence intervals for fitted parameters, if asked; Minitab does not (however, these can be calculated from the information provided).
Minitab provides both a confidence interval for the fit (95% CI) and a prediction interval for new values (95% PI) on the main regression graph.
Minitab provides more diagnostic plots.
Minitab indicates potentially problematic datapoints; Excel does not.
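The intervals mentioned above can be sketched in Python as follows. This assumes the regression output reports each coefficient together with its standard error, and that the critical t value (for n - 2 degrees of freedom) is looked up separately, since the Python standard library has no t distribution. All function names and numbers below are hypothetical and chosen only for illustration.

```python
import math


def coef_interval(estimate, std_error, t_crit):
    """Confidence interval for a fitted parameter: estimate +/- t * SE."""
    half_width = t_crit * std_error
    return estimate - half_width, estimate + half_width


def fit_and_prediction_half_widths(x0, n, x_bar, sxx, s, t_crit):
    """Half-widths of the CI for the fitted line and the PI for a new value.

    x0:    predictor value at which the intervals are evaluated
    n:     number of observations
    x_bar: mean of the predictor
    sxx:   sum of squared deviations of the predictor from its mean
    s:     residual standard error of the fit
    """
    leverage = 1 / n + (x0 - x_bar) ** 2 / sxx
    ci_half = t_crit * s * math.sqrt(leverage)
    # The prediction interval adds the scatter of a single new observation.
    pi_half = t_crit * s * math.sqrt(1 + leverage)
    return ci_half, pi_half


# Hypothetical values, for illustration only.
low, high = coef_interval(estimate=5.0, std_error=1.0, t_crit=2.0)
ci_half, pi_half = fit_and_prediction_half_widths(
    x0=12.0, n=20, x_bar=10.0, sxx=150.0, s=9.0, t_crit=2.1
)
print(low, high, ci_half, pi_half)
```

The prediction interval is always wider than the confidence interval for the fit, because it accounts for the variability of an individual new observation as well as the uncertainty in the fitted line.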