10-14 November 2025
To foster international participation, this course will be held online.
Eukaryotic genomes often contain many repeats. Long-read sequencing technologies can span most of those repeats and produce genome assemblies with unprecedented contiguity. Furthermore,
sequencing technologies such as PacBio HiFi combine this contiguity with high base accuracy. Chromatin conformation capture (Hi-C) reads provide the final information needed to scaffold
assemblies into chromosomes.
This course will introduce the audience to the spectrum of methods found in a typical assembly workflow, starting from raw data and finishing with a fully assembled genome. We will
see how to manipulate raw reads, analyse their quality, run different assembly algorithms, run Hi-C scaffolding algorithms, and analyse assembly quality.
Structured over five days, this course combines theoretical and practical sessions, which are interwoven throughout each day. The theoretical foundations presented will be applied to small
eukaryotic datasets.
This course is intended for researchers interested in learning the theory
and practice of de novo eukaryotic genome assembly using Pacific Biosciences long reads and Hi-C data.
Both beginners and more advanced users will find useful information in this course.
Monday. Classes from 2 to 8 pm Berlin time
Genomes
Sequencing technologies
Genome assembly
Hands-on: Introduction to Linux – manipulating reads files
Tuesday. Classes from 2 to 8 pm Berlin time
What are read k-mers?
K-mer analysis: estimating genome size, heterozygosity, and repeat content
Plotting k-mer profiles
Presentation of different assembly algorithms: HiCanu, wtdbg2, FALCON, hifiasm
Start assembling different small eukaryotic genomes: butterflies, moths, and others
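As an illustration of the k-mer analysis listed for Tuesday, the following is a minimal Python sketch of how read k-mers can be counted and how a rough genome size can be derived from the k-mer profile. It is not the course material or the tooling used in class. The function names (`count_kmers`, `kmer_histogram`, `estimate_genome_size`) and the `min_multiplicity` cutoff are assumptions made for this example, and it ignores reverse complements and heterozygosity, which dedicated k-mer tools handle.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) found in the reads."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if "N" not in kmer:  # skip k-mers containing ambiguous bases
                counts[kmer] += 1
    return counts


def kmer_histogram(counts):
    """Map each multiplicity to the number of distinct k-mers seen that often."""
    return Counter(counts.values())


def estimate_genome_size(histogram, min_multiplicity=2):
    """Rough estimate: total k-mer occurrences divided by the peak multiplicity.

    K-mers below min_multiplicity are treated as sequencing errors and ignored
    (an assumption of this sketch, not a universal rule).
    """
    solid = {m: n for m, n in histogram.items() if m >= min_multiplicity}
    if not solid:
        raise ValueError("no k-mers at or above the multiplicity cutoff")
    peak = max(solid, key=solid.get)  # multiplicity shared by the most k-mers
    total = sum(m * n for m, n in solid.items())
    return total / peak


if __name__ == "__main__":
    toy_reads = ["ACGTAC"] * 3  # toy data: the same read sequenced three times
    hist = kmer_histogram(count_kmers(toy_reads, k=3))
    print(dict(hist), estimate_genome_size(hist))
```

On real data the histogram is what gets plotted as the k-mer profile, and the position of its main peak approximates the sequencing coverage.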
Wednesday. Classes from 2 to 8 pm Berlin time
Purging haplotigs: what is that about?
Purging assemblies
Evaluation of purged assemblies: contiguity (N50, total length), accuracy (Merqury k-mer evaluation), and gene content (BUSCO) analysis
Start polishing and/or Hi-C scaffolding
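The N50 contiguity metric mentioned for Wednesday can be sketched in a few lines of Python. This is a minimal illustration, assuming contig lengths are already available as a list of integers; the function name `n50` is chosen for this example and is not taken from any particular tool.

```python
def n50(lengths):
    """Return the N50: the length of the contig at which the running total,
    taken from the longest contig downwards, reaches half the assembly size."""
    if not lengths:
        raise ValueError("no contig lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy assembly: total length 290, so half is 145.
    # 80 + 70 = 150 reaches that threshold, so the N50 is 70.
    print(n50([80, 70, 50, 40, 30, 20]))
```

A higher N50 indicates a more contiguous assembly, but it says nothing about correctness, which is why it is paired with k-mer and gene content evaluation.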
Thursday. Classes from 2 to 8 pm Berlin time
What is Hi-C sequencing? Theory and practice
Hands-on: Running analysis on Hi-C scaffolded assemblies – evaluating final contiguity, accuracy, and gene content
Friday. Classes from 2 to 8 pm Berlin time
Finishing up analysis – participants will work in groups on the same data, discuss results, and prepare a final presentation
General discussion
Final questions
Cancellation Policy:
> 30 days before the start date = 30% cancellation fee
< 30 days before the start date = No refund
Physalia-courses cannot be held responsible for any travel fees, accommodation, or other expenses you incur as a result of the cancellation.