12 PubMLST Classification

12.1 Multi-Locus Sequence Typing (MLST)

Multi-Locus Sequence Typing (MLST) is a widely used method for bacterial identification. It is typically more precise and has more resolving power than 16S sequence analysis, but is less precise than whole-genome sequence analysis (Maiden et al., 2013).

MLST works by defining marker sequences for a taxon (Figure 12.1). These are typically well-conserved (“housekeeping”) genes which vary relatively little between organisms in the taxon, but enough to allow discrimination between them. The number of markers varies, but is usually around seven.

Each marker sequence has many variants (different sequences) within the taxon, and these are known as alleles. Each marker allele is given a unique number (starting at 1 and counting upwards), known as its allele number. A single organism’s sequence type (ST) is determined by the combination of allele numbers it carries, one for each marker. Organisms with the same sequence type are considered to be part of the same group.
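The relationship between allele numbers and sequence type can be sketched as a simple table lookup. The example below is a minimal illustration only: the locus names (`locA` to `locG`), the profile table, and the ST numbers are all invented for the sketch and do not come from any real PubMLST scheme.

```python
from typing import Dict, Optional, Tuple

# Hypothetical marker names, in the fixed order a scheme would list them.
LOCI: Tuple[str, ...] = ("locA", "locB", "locC", "locD", "locE", "locF", "locG")

# Toy profile table (all values invented): allele numbers per locus -> ST.
PROFILES: Dict[Tuple[int, ...], int] = {
    (1, 1, 1, 1, 1, 1, 1): 1,
    (1, 1, 2, 1, 1, 1, 1): 2,
    (3, 4, 2, 1, 5, 1, 2): 3,
}


def sequence_type(alleles: Dict[str, int]) -> Optional[int]:
    """Return the ST for an isolate's allele numbers, or None if the
    combination of alleles is not present in the profile table."""
    profile = tuple(alleles[locus] for locus in LOCI)
    return PROFILES.get(profile)


# Two isolates that differ at a single locus (locG).
isolate_a = dict(zip(LOCI, (1, 1, 2, 1, 1, 1, 1)))
isolate_b = dict(zip(LOCI, (1, 1, 2, 1, 1, 1, 9)))

print(sequence_type(isolate_a))  # 2
print(sequence_type(isolate_b))  # None
```

Note that a change at just one locus gives a different profile, so the second isolate has no ST in this toy table even though it matches the first at six of the seven markers.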

12.2 MLST Classification with Galaxy

The PubMLST.org website hosts a collection of open-access, curated databases that integrate population sequence data with provenance and phenotype information for over 130 different microbial species and genera. These databases are curated and maintained by volunteers and made available freely for use by anyone. They are available in Galaxy.

The MLST software is available in Galaxy for querying bacterial sequences against the PubMLST databases. To do this:

  1. Navigate to the MLST tool in the Tools sidebar
  2. Select the ERR531380 assembly as the input_files
  3. Click on Run Tool
  4. Click on the Eye icon to see the result

Video: Use Galaxy’s MLST tool to classify your isolate

QUESTION

What is the sequence type (ST) of your isolate as reported by Galaxy MLST?

paeruginosa

395

6, 5, 1, 1, 1, 12, 1

Check the Help section of the MLST tool for a description of the output format.
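If you want to process MLST results programmatically, the output line can be split into its parts. The sketch below assumes the Galaxy tool reports results in the same layout as the command-line `mlst` program: one tab-separated line per input, giving the file name, the scheme, the ST, and then one `locus(allele)` column per marker. Confirm this against the tool's Help section before relying on it. The example line, file name, scheme name, and locus names are invented for illustration.

```python
import re
from typing import Dict, NamedTuple, Optional


class MlstResult(NamedTuple):
    filename: str
    scheme: str
    sequence_type: Optional[int]  # None when no ST was assigned (e.g. "-")
    alleles: Dict[str, str]       # kept as text, since calls may be inexact


# Matches one allele column such as "locA(6)".
_ALLELE = re.compile(r"^(?P<locus>[^()]+)\((?P<allele>[^()]*)\)$")


def parse_mlst_line(line: str) -> MlstResult:
    """Parse one tab-separated result line (assumed layout, see text)."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 3:
        raise ValueError("expected at least three tab-separated columns")
    filename, scheme, st = fields[:3]
    alleles: Dict[str, str] = {}
    for field in fields[3:]:
        match = _ALLELE.match(field)
        if match is None:
            raise ValueError(f"unrecognised allele column: {field!r}")
        alleles[match.group("locus")] = match.group("allele")
    return MlstResult(filename, scheme, int(st) if st.isdigit() else None, alleles)


# Invented example line, not real output for any isolate.
example = "example.fasta\tschemeX\t42\tlocA(1)\tlocB(7)\tlocC(3)"
result = parse_mlst_line(example)
print(result.scheme, result.sequence_type, result.alleles)
```

Allele values are kept as strings rather than integers because a typing tool may flag partial or uncertain matches with extra characters; converting them is left to the caller.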

12.3 Classification at PubMLST

The PubMLST.org website also provides its own search and classification tools. Some of these are genus- or species-specific and run queries against databases focused on a particular group of organisms. The site also provides a much broader classification scheme with a much larger number of markers, which you may have met in a BM329 workshop.

Here, you will use PubMLST’s Pseudomonas aeruginosa-specific database to identify your assembled isolate.

  1. In a new tab of your browser open the PubMLST website
  2. Click on Organisms
  3. Navigate to Pseudomonas aeruginosa and click on Typing
  4. Under Query a sequence click on Single sequence
  5. Select the ERR531380 assembly as input
  6. Click on SUBMIT

Warning

The PubMLST classification may take a couple of minutes.

Video: Use PubMLST to classify your isolate

QUESTION

  1. Does PubMLST assign the same sequence type (ST) to your isolate as Galaxy’s MLST tool did?

12.4 Next Steps

Now that you have assembled and classified your isolate, you can use some additional data prepared by your helpful colleague to reconstruct a phylogenetic tree for isolates obtained from the burns ward.