19-22 May 2025
To foster international participation, this course will be held online.
Genome annotation plays a pivotal role in genomics: the value of a genome assembly is intrinsically tied to the accuracy and completeness of its annotation. In this course, we will delve into the essential processes and strategies required to start annotating your target genome, addressing the challenges posed by genome characteristics and specificities.
The course will start with an overview of the sample quality needed to get the most out of the downstream process, followed by the state-of-the-art sequencing technologies used for genome annotation and the pros and cons of each platform. We will then go through the basic strategies for genome annotation: gene prediction, and ab initio and de novo transcriptome assembly, mainly for short-read data. This will include different approaches for annotating protein-coding genes in eukaryotic species via projection and evidence-guided gene prediction, as well as the challenges of annotation in different contexts. We will build gene models, explore the use of combiners for integrating alternative gene predictions, assess the accuracy of different annotation tools, and evaluate and visualise the output.
This course is aimed at PhD students, post-doctoral researchers and research scientists who are undertaking projects that involve annotating a genome assembly and who want to improve their knowledge of different approaches and pipelines.
Basic knowledge of the command line is essential, and candidates should feel comfortable using Unix shell commands. Basic knowledge of Python would also be beneficial.
By the end of this course, attendees will:
1) have the basic knowledge and practical skills necessary to start annotating their genome of interest;
2) be able to tackle genome specific challenges;
3) understand the vital role that high-quality genome annotations play in advancing our understanding of biological processes, paving the way for groundbreaking discoveries and research.
Monday - 2-8 PM Berlin time
Introduction to NGS technologies and practical considerations for isolating high quality RNA
Data QC
Genome annotation: Challenges and Overview of different strategies
Technical note: File format specification
Ab initio transcriptome assembly
Hands-on: Mapping and short-read ab initio transcriptome assembly.
Tuesday - 2-8 PM Berlin time
De novo and reference-guided ab initio short-read transcriptome assembly using Trinity
Hands-on: Visual and Comparative checks to evaluate the different annotations.
Ab initio assembly with long-read data (Iso-Seq)
Hands-on: Mapping and short-read ab initio transcriptome assembly.
Initial quality checks of the ab initio annotation
Hands-on: Visual and Comparative checks to evaluate the different annotations.
Wednesday - 2-8 PM Berlin time
Selection and accurate detection of splice junctions from RNA-seq
Hands-on: Assessing false-positive and true-positive splice junction reads
Evidence-guided gene prediction using BRAKER/Augustus
Hands-on: Protein-coding gene and UTR prediction using the evidence-based BRAKER pipeline
Leveraging multiple transcriptome assembly methods for improved gene structure annotation
Hands-on: Merging the multiple transcriptome assemblies built so far to obtain the best transcript models.
Thursday - 2-8 PM Berlin time
Questions and open lab. Students can rework and fine-tune the pipeline built so far or start implementing a new pipeline for their own data.
Should you have any further questions, please send an email to info@physalia-courses.org
Cancellation Policy:
> 30 days before the start date = 30% cancellation fee
< 30 days before the start date = No refund
Physalia-courses cannot be held responsible for any travel fees, accommodation or other expenses you incur as a result of a cancellation.