1 Introduction
1.1 Working with phylogenetic trees
When you build phylogenetic tree data with a software tool, or acquire it some other way, it will be in a plain text file. This file will be in one of a number of more or less cryptic file formats, such as:
newick, file ending .new, .nwk, .newick
NEXUS, file ending .nex, .nexus
phylip, file ending .phy, .phylip
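As a rough illustration of the list above, the file ending can be used to guess the format of a tree file before opening it. This is a minimal Python sketch: the function name `guess_tree_format` and the example file names are hypothetical, and a file ending is only a hint about the contents, not a guarantee.

```python
from pathlib import Path

# File endings from the list above, mapped to the format they usually indicate.
TREE_FORMATS = {
    ".new": "newick",
    ".nwk": "newick",
    ".newick": "newick",
    ".nex": "NEXUS",
    ".nexus": "NEXUS",
    ".phy": "phylip",
    ".phylip": "phylip",
}


def guess_tree_format(filename):
    """Return the likely format for a file name, or None if the ending is unknown."""
    suffix = Path(filename).suffix.lower()  # compare case-insensitively
    return TREE_FORMATS.get(suffix)


# Hypothetical file names, used only to show the lookup.
print(guess_tree_format("my_tree.nwk"))
print(guess_tree_format("my_tree.txt"))
```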
These files can be hard for a human to read. For example, the first newick format file we will work with has these contents:
(((((((A:4,B:4):6,C:5):8,D:6):3,E:21):10,((F:4,G:12):14,H:8):13):13,((I:5,J:2):30,(K:11,L:11):2):17):4,M:56);
It is difficult, as a human, to read a file like this and understand the structure of the tree it describes, so we almost always use computational tools to open these files and visualise the trees they represent.
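To make the nesting in the newick string above easier to see, the brackets can be unpacked programmatically. The following is a minimal Python sketch, not a replacement for the tools used in this workshop: the function names `parse_newick`, `show`, and `leaf_names` are illustrative, and the parser assumes simple newick input (unquoted names, no comments), which holds for this example.

```python
def parse_newick(text):
    """Parse a simple newick string into nested dictionaries.

    Assumes unquoted names and no comments (true for the example file).
    """
    text = text.strip().rstrip(";")
    pos = 0

    def read_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # skip "("
            children.append(read_node())
            while text[pos] == ",":
                pos += 1  # skip ","
                children.append(read_node())
            pos += 1  # skip ")"
        # Read the optional label after the node, e.g. "A:4" or ":6".
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        name, _, length = text[start:pos].partition(":")
        return {
            "name": name,
            "length": float(length) if length else None,
            "children": children,
        }

    return read_node()


def show(node, depth=0):
    """Print one line per node, indented by its depth in the tree."""
    label = node["name"] or "(internal node)"
    if node["length"] is not None:
        label += f" [branch length {node['length']:g}]"
    print("    " * depth + label)
    for child in node["children"]:
        show(child, depth + 1)


def leaf_names(node):
    """Collect the names of all leaves (tips) in left-to-right order."""
    if not node["children"]:
        return [node["name"]]
    names = []
    for child in node["children"]:
        names.extend(leaf_names(child))
    return names


newick = (
    "(((((((A:4,B:4):6,C:5):8,D:6):3,E:21):10,((F:4,G:12):14,H:8):13):13,"
    "((I:5,J:2):30,(K:11,L:11):2):17):4,M:56);"
)
tree = parse_newick(newick)
show(tree)
print(leaf_names(tree))
```

Each pair of brackets becomes one internal node, each name becomes a leaf, and the number after a colon is the length of the branch leading to that node. The indented printout is still crude compared with a drawn tree, which is why dedicated visualisation software is used in practice.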
1.1.1 Phylogenetic tree visualisation software
A number of software packages are available to visualise, edit, and refine phylogenetic tree files for reports and publications. Some of these can be downloaded and run on your own computer, such as:
figtree (this is the one I use most often - LP)
dendroscope
The Interactive Tree of Life (iTOL) service, by contrast, allows you to visualise and edit trees directly in the browser. If you register and sign in, you can save your trees for future use.
In this workshop we will be using the online iTOL service to visualise and interpret trees.
Traditionally, phylogenetic trees have been visualised using independent software tools like these, but the R software ecosystem is a very powerful tool for computational biology and bioinformatics work, encompassing sequence analysis and genomics, transcriptomic and evolutionary analyses, with excellent visualisation capabilities - including for phylogenetic trees.
As R is also a programming language, large parts of the analysis can be automated and made reproducible, which is an advantage over independent tools, and a factor in the growing popularity of this approach. Basic competence and skills in R are in high demand in academia and industry.
You can produce trees equivalent to those in these exercises using standalone tools like figtree and dendroscope. Some tree visualisations can only be achieved in specialised packages, or with tools like ggtree.
ggtree is an R package that extends the ggplot2 data visualisation package to visualise and annotate phylogenetic trees (or any other tree-like structure!).