12 Find Sequence Variants
snippy
is a sequence variant finding pipeline, designed for haploid genomes such as prokaryotes and viruses.
As with most kinds of bioinformatics data, there is a data format designed specifically to capture the important elements of the data, and snippy
outputs its results in this format.
Good practice
When reporting how you identified variants for your manuscript or dissertation, you should always state:
- The software tool you used, with its version number and a citation of the paper describing it (if available; provide a URL to the software if there is no paper)
- The parameters used when running the tool (if default parameters were used, state this)
12.1 Using snippy
- Navigate to the
snippy
tool - Select
snippy
- Select
Use a genome from history and build index
fromWill you select a reference genome…
- Choose the
SARS_AY278741.gbk
file inUse the following dataset as the reference sequence
- Select
Paired
underSingle or Paired-end reads
- Choose the cleaned post-
trimmomatic
(R1 paired)
data for the forward reads - Choose the cleaned post-
trimmomatic
(R2 paired)
data for the reverse reads - Click
Run Tool
Video: Identifying sequence variants using
snippy
Caution
snippy
can take a few minutes to run to completion.
12.2 snippy
Output
snippy
provides output in .vcf
format, which is designed to be processed by bioinformatics software and is not the most understandable form for humans. To see the snippy
output in a more human-readable form, click on the eye
icon for the snps table
output (Figure 12.1).
Video: Inspecting
snippy
output
Questions
- Can you find these SNPs manually using JBrowse?
- Are any SNPs different or inconsistent between the
JBrowse
andsnippy
outputs?