Monday, 2-8 pm Berlin time.
Session 1: Introduction to DNA metabarcoding.
In this session participants will be introduced to the key concepts of metabarcoding and we will explain the format of the course. We will outline the different steps of a typical metabarcoding
pipeline and present examples of the results that metabarcoding projects can deliver. We will also discuss technical replication and other experimental
design considerations, upscaling methods and biases of metabarcoding. Core concepts introduced: high-throughput sequencing, multiplexing, NGS library, metabarcoding pipeline, metabarcoding
marker, clustering algorithms, molecular operational taxonomic unit (MOTU), taxonomic assignment, technical replication, sequencing depth, price per sample, upscaling, methodological
biases.
Session 2: Molecular laboratory protocols. DNA extraction. Metabarcoding markers. PCR and library preparation. Good laboratory practice.
In this session we will learn the basics of the molecular laboratory procedures needed for metabarcoding. While there will be no hands-on laboratory work, guidelines and best practices for
all key laboratory steps will be discussed. We will explain sample collection techniques, including eDNA and bulk community samples, pretreatment and DNA extraction protocols. The diverse
molecular markers available for different kinds of samples and target taxonomic groups will be discussed. Participants will learn about sample tags, library tags, adapter sequences, PCR protocols
and library preparation procedures. Core concepts introduced: good laboratory practice, proper sample collection, bulk (community DNA) and eDNA samples, DNA preservation, DNA extraction, PCR,
clean up, metabarcoding marker, universality, specificity, taxonomic range, taxonomic resolution, primer bias, amplification errors, sequencing errors, DNA contamination, library generation,
sequencing platforms, sample indexing, adapter sequences, index jumping, robotics.
Tuesday, 2-8 pm Berlin time.
Session 3: The APSCALE pipeline and taxonomic assignment with BOLDigger.
In this session, participants will be familiarized with the APSCALE pipeline, which mainly relies on VSEARCH and cutadapt. A real sequence dataset will serve as an example for learning how to run the
metabarcoding pipeline. Participants will learn about all steps needed to process the raw data delivered by the sequencing provider into the final OTU or ESV table. They will learn how to
demultiplex the reads and perform paired-end merging, primer trimming, quality filtering, OTU clustering and ESV denoising, and how all those steps can be carried out with the APSCALE pipeline.
Finally, the resulting sequences will be taxonomically assigned with BOLDigger and the BOLD database. All hands-on work will be performed in the shell first; however, GUI versions of all programs
used are available. Core concepts introduced: fastq and fasta formats, Phred quality score, paired-end alignment, demultiplexing, sequence filtering, chimeras, dereplication, unique sequences,
reads, singleton sequences, abundance recalculation, OTU clustering, sequence repositories, identity assignment, BLAST, GenBank, Barcode of Life Data System (BOLD), read denoising, Exact
Sequence Variants (ESVs).
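As a small illustration of three of the core concepts listed above (Phred quality score, dereplication and singleton sequences), the following Python sketch may help. It assumes the common Phred+33 ASCII encoding of fastq quality strings. The function names and the toy reads are hypothetical and are not part of APSCALE, VSEARCH or cutadapt.

```python
from collections import Counter


def phred_scores(quality_string, offset=33):
    """Decode a fastq quality string into Phred scores (assumes Phred+33)."""
    return [ord(ch) - offset for ch in quality_string]


def expected_errors(quality_string, offset=33):
    """Sum the per-base error probabilities, P = 10 ** (-Q / 10)."""
    return sum(10 ** (-q / 10) for q in phred_scores(quality_string, offset))


def dereplicate(reads):
    """Collapse identical reads into (sequence, abundance) pairs,
    sorted from most to least abundant."""
    return Counter(reads).most_common()


# Toy data, invented for illustration only.
toy_reads = ["ACGT", "ACGT", "ACGA", "ACGT", "TTGA"]
unique_sequences = dereplicate(toy_reads)

# Singletons are unique sequences supported by a single read.
singletons = [seq for seq, abundance in unique_sequences if abundance == 1]
```

With the toy reads above, `unique_sequences` starts with `("ACGT", 3)` and `singletons` contains `"ACGA"` and `"TTGA"`. A quality string such as `"I5+!"` decodes to the Phred scores 40, 20, 10 and 0.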
Wednesday, 2-8 pm Berlin time.
Session 4: The OBITools&Friends pipeline.
In this session, we will work with the OBITools software suite, using the same dataset we used in the previous session for testing some alternative metabarcoding pipelines. We will also introduce
the use of other programs that complement the OBITools pipeline and optimize its different steps. These include algorithms for denoising, for clustering sequences into MOTUs with flexible versus fixed
similarity thresholds (SWARM), and for post-clustering collapse of erroneous MOTUs. We will continue learning about phylogenetic algorithms for taxonomic assignment. The ecotag algorithm
will be used for adding taxonomic information to the MOTUs in our example dataset. We will apply post-clustering co-occurrence corrections to remove pseudogenes and oversplit MOTUs. Core
concepts introduced: step aggregation methods, hard identity threshold, flexible identity threshold, co-occurrence, phylogenetic assignment, best match, assignment of higher taxa.
Thursday, 2-8 pm Berlin time.
Session 5: TaxonTableTools (TTT) to analyze biological data.
In this session, participants will be introduced to TaxonTableTools (TTT), a graphical user interface program for analyzing and visualizing DNA metabarcoding data. Participants will first
learn how to convert and process data received from APSCALE and BOLDigger, using tools for subtracting reads that are present in negative controls, merging replicates, and filtering the data for
target organisms. Subsequently, participants will learn how to perform analyses and visualizations to gain first insights into DNA metabarcoding datasets (e.g., number of reads per sample,
OTU/species rarefaction, or alpha diversity). The session will also discuss challenges that can arise when working with DNA metabarcoding data. Furthermore, the beta
diversity of the tutorial datasets will be explored (PCoA and NMDS). To wrap up the session, participants will have the opportunity to freely explore the many functions available in TTT to
investigate the tutorial datasets or their own datasets.
Core concepts introduced: renormalization, taxonomy collapsing, blank correction, taxonomic summary, α-diversity, β-diversity, rarefaction, MOTU richness, non-metric multidimensional
scaling (NMDS), principal coordinates analysis (PCoA), PERMANOVA.
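To make blank correction and α-diversity more concrete, here is a minimal Python sketch. It assumes one simple strategy, subtracting the negative-control read count of each OTU from the sample and dropping OTUs that reach zero; the approach actually implemented in TTT may differ. All function names, OTU labels and read counts are hypothetical.

```python
import math


def blank_correct(sample_reads, blank_reads):
    """Subtract negative-control reads from a sample, OTU by OTU.
    OTUs left with zero or fewer reads are removed."""
    corrected = {}
    for otu, n_reads in sample_reads.items():
        remaining = n_reads - blank_reads.get(otu, 0)
        if remaining > 0:
            corrected[otu] = remaining
    return corrected


def richness(reads):
    """Number of OTUs with at least one read (a simple alpha-diversity measure)."""
    return sum(1 for n_reads in reads.values() if n_reads > 0)


def shannon_index(reads):
    """Shannon index H' = -sum(p * ln p) over relative read abundances."""
    total = sum(reads.values())
    return -sum(
        (n / total) * math.log(n / total) for n in reads.values() if n > 0
    )


# Toy read table, invented for illustration only.
sample = {"OTU_1": 120, "OTU_2": 30, "OTU_3": 5}
negative_control = {"OTU_2": 10, "OTU_3": 8}

corrected = blank_correct(sample, negative_control)
```

In this toy case `corrected` keeps `OTU_1` with 120 reads and `OTU_2` with 20 reads, while `OTU_3` is removed because the negative control contains more reads than the sample, so the richness drops from 3 to 2.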
Friday, 2-8 pm Berlin time.
Session 6: Local reference database assembly. Primer design basics.
The participants will learn how to build local reference databases from the information available in public sequence repositories and how to add new custom sequences to these local reference
databases. They will also learn how sequence databases interact with taxonomy databases for retrieving the phylogenetic information for the assignment algorithms. We will use those databases for
the design of metabarcoding PCR primers. Core concepts introduced: taxonomic database, ecoPCR and ecoPCR format, taxonomic identifier (taxid), local vs remote reference database, universality,
specificity, taxonomic range, taxonomic resolution, primer bias, amplification errors, primer degeneracy.
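Primer degeneracy can be illustrated with a short Python sketch based on the standard IUPAC nucleotide ambiguity codes: the degeneracy of a primer is the number of distinct non-degenerate sequences it represents. The function names and the toy primer are hypothetical and do not come from ecoPCR or any other tool used in the course.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def degeneracy(primer):
    """Number of distinct sequences encoded by a degenerate primer."""
    result = 1
    for base in primer.upper():
        result *= len(IUPAC[base])
    return result


def expand(primer):
    """List every non-degenerate sequence encoded by the primer."""
    options = (IUPAC[base] for base in primer.upper())
    return ["".join(combination) for combination in product(*options)]


# Toy primer, invented for illustration only.
toy_primer = "ACRYN"
n_variants = degeneracy(toy_primer)
variants = expand(toy_primer)
```

For the toy primer, R and Y each stand for two bases and N for four, so `n_variants` is 2 × 2 × 4 = 16 and `variants` lists those 16 sequences.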
Session 7: Q&A.
Participants will have the opportunity to go back to concepts or processes for clarification, or more in-depth explanation. Questions about their specific projects are encouraged. The aim of this
session is to create a collaborative discussion in which the participants can also offer their input and experience to help others.