Day 1. 2-8 pm Berlin time
Session 1: Introduction, basecalling and demultiplexing
We will start with a general introduction to sequencing and assembly using Oxford Nanopore Technologies sequencers. Participants will be introduced to
the theory of how Nanopore sequencing works and how raw Nanopore sequencing data is formatted.
Practical sessions will include using current gold-standard basecalling tools to transform raw Nanopore signal data into DNA sequence (including an example of multiplexing/demultiplexing in
Nanopore sequencing), and learning how to quality-control and filter your basecalled sequencing reads.
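To give a flavour of what read filtering involves, here is a minimal Python sketch that keeps only reads above a length and mean-quality threshold. It is an illustration only, not one of the tools used in the course: the function names and the default thresholds (1000 bases, Q10) are assumptions made for this example, and it assumes simple four-line FASTQ records with Phred+33 quality encoding. Mean quality is computed by averaging error probabilities rather than raw Phred scores.

```python
import math


def mean_quality(quality_string, offset=33):
    """Mean Phred quality of a read, averaged over error probabilities."""
    probs = [10 ** (-(ord(c) - offset) / 10) for c in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def parse_fastq(lines):
    """Parse four-line FASTQ records from an iterable of lines."""
    it = iter(lines)
    for header in it:
        seq = next(it).strip()
        next(it)  # skip the '+' separator line
        qual = next(it).strip()
        yield header.strip()[1:], seq, qual


def filter_reads(records, min_length=1000, min_quality=10.0):
    """Yield (name, sequence, quality) records that pass both thresholds."""
    for name, seq, qual in records:
        if len(seq) >= min_length and mean_quality(qual) >= min_quality:
            yield name, seq, qual
```

In practice you would pass an open FASTQ file to `parse_fastq` and write the records returned by `filter_reads` to a new file; dedicated filtering tools do the same job far more efficiently on real datasets.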
Day 2. 2-8 pm Berlin time
Session 2: Benefits and Limitations of Nanopore
In this session we will talk about the benefits and limitations of using Nanopore for genome assembly and discuss the differences between Nanopore and other
sequencing technologies.
Session 3: Genomes, assemble!
Although Nanopore sequencing is capable of generating individual reads which are long enough to span entire viral (and even some bacterial) genomes by themselves, genome assembly will almost
always still require shorter genome fragments to be stitched together. Many assembly tools have been developed or optimised for long reads generally, or for Nanopore reads specifically. Some
assembly tools might be better for your dataset than others, depending on the type of genome you have sequenced, the overall quality you require, your available computing power, and how long you
are willing to wait for your assembly.
Day 3. 2-8 pm Berlin time
Session 4: Get polishing
Some assembly tools produce more accurate contigs than others, depending on the assembly algorithm they use and whether or not they include sequencing read correction steps.
However, most newly assembled genomes will benefit from some additional polishing. Here, participants will be introduced to polishing tools which use a variety of methods to compare the
sequencing signal or raw reads back to the assembled contigs and make any possible corrections. We will also demonstrate how to use short read data to polish your assembled genomes and improve
their accuracy even further.
Session 5: How good is my genome?
The accuracy of our assembled contigs is important for most downstream applications, including annotation and analysis of structural variants. Here, we will discuss methods for deciding if your
genome assembly is accurate enough to use, whether it contains misassemblies, and whether it is “complete” (and what to do if it is not as good as you might have hoped…).
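As one simple illustration of an assembly metric, the sketch below computes N50, a widely used contiguity statistic: the contig length L such that contigs of length L or longer contain at least half of the total assembled bases. This is an assumed example rather than a method prescribed by the course, and the function name is hypothetical. Note that N50 measures contiguity only; it says nothing about accuracy or misassemblies.

```python
def n50(contig_lengths):
    """Return the N50 of a collection of contig lengths."""
    lengths = sorted(contig_lengths, reverse=True)
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the assembly is covered.
    for length in lengths:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no contigs supplied")
```

For example, contigs of lengths 2 to 10 total 54 bases; the three longest (10, 9, 8) already cover 27, so the N50 is 8.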
Day 4. 2-8 pm Berlin time
Session 6: How good is my genome? – Practical
Here we will apply some of what we learned yesterday to assess the quality of our newly assembled genomes.
Session 7: Which assembly method is best for my data?
During Session 3, we briefly discussed that different tools are optimal for different types of data. During this session, we will further consider the
different assembly methods available for different sample types. Which assembly tool is best for a haploid genome? What about diploid, or even polyploid? How can you assemble genomes for a
variety of different microbes found in the same metagenomic sample? By the end of this session, you will be armed with the knowledge to decide!
This session will also include a practical demonstration of how to use a specific type of sequencing and assembly which is currently in use globally to sequence SARS-CoV-2 genomes (and which can
also be used for many other viral genomes): tiled amplicon sequencing.
Day 5. 2-8 pm Berlin time
Session 8: Where can I do my analysis?
Gone are the early days of sequencing, when throughput and yield were major concerns. These days, many sequencing runs will produce more data than you know what to do with. It might still be
possible to carry out some stages of your analysis locally, on your own computer, especially if speed is not of high importance to your work. However, other applications might require more
computing power. Here, we will discuss the resources which might be available to you, and how to use them effectively.
Session 9: Over to you
To wrap up the course, you will be given a choice of pre-basecalled model datasets (metagenomic, viral, mammalian, etc.), and you will spend this final session assembling and QCing your chosen dataset. This will require you to apply all the knowledge you have gained over the last few days, in order to decide which tools to use, and how to use them. You could even practise on your own dataset, if you happen to have one available! Computational resources will be limited on the day, so your own datasets should be relatively small. You can contact us ahead of the course to discuss whether your data are suitable.
Floating talks
Some of the tools we are using in the sessions will be slow, so to fill the time while we wait for them to run we will have optional additional talks on some related topics, including genome
annotation, modified basecalling, and how to install tools.