Eukaryotic genome assembly using PacBio and Hi-C

Dates

10-14 November 2025

To foster international participation, this course will be held online.

 

 

Overview

 

Eukaryotic genomes may be composed of many repeats. Long-read sequencing technologies can span most of those repeats and produce genome assemblies with unprecedented contiguity. Further, sequencing technologies such as PacBio HiFi combine assembly contiguity with high base accuracy. Chromatin conformation capture (Hi-C) reads provide the final information needed to scaffold assemblies into chromosomes.
This course will introduce the audience to a spectrum of methods used in a typical assembly workflow, starting from raw data and finishing with a fully assembled genome. We will see how to manipulate raw reads and analyse their quality, how to run different assembly algorithms, how to run Hi-C scaffolding algorithms, and how to analyse assembly quality.
 
Structured over five days, this course combines theory and practice, which are intertwined throughout each day. The theoretical foundations presented will be applied to small eukaryotic datasets.

 

 

Target audience

 

This course is intended for researchers interested in learning the theory and practice of de novo eukaryotic genome assembly using Pacific Biosciences long reads and Hi-C data. Both beginners and more advanced users will find useful information in this course.

 

Learning outcomes

 

  • Understanding PacBio HiFi (mostly), PacBio CLR, and Hi-C data.
  • Understanding the concepts of de novo genome assembly.
  • Obtaining practical experience in using state-of-the-art tools for de novo assembly and assembly quality assessment.

 

Program

Monday. Classes from 2 to 8 pm Berlin time

  • Genomes

  • Sequencing technologies

  • Genome assembly

    • Hands-on: Introduction to Linux – manipulating reads files

Tuesday. Classes from 2 to 8 pm Berlin time

  • What are read k-mers?

  • K-mer analysis: estimating genome size, heterozygosity, and repeat content

  • Plotting k-mer profiles

  • Presentation of different assembly algorithms: HiCanu, wtdbg2, FALCON, hifiasm

  • Start assembling different small eukaryotic genomes: butterflies, moths, and others
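
As a rough illustration of the k-mer topics listed for Tuesday, the sketch below counts k-mers in a few reads and derives a k-mer profile, that is, how many distinct k-mers occur once, twice, and so on. It is a minimal Python example and not the tooling used in the course: the function names and the toy reads are invented for illustration, and reverse-complement (canonical) k-mers are ignored for simplicity.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every k-mer (substring of length k) across a list of reads."""
    counts = Counter()
    for read in reads:
        seq = read.upper()
        # Slide a window of width k along the read.
        for i in range(len(seq) - k + 1):
            counts[seq[i:i + k]] += 1
    return counts


def kmer_profile(counts):
    """Map multiplicity -> number of distinct k-mers seen that many times."""
    return Counter(counts.values())


# Toy reads, invented for illustration only.
reads = ["ACGTACGT", "CGTACGTA"]
counts = count_kmers(reads, k=4)
profile = kmer_profile(counts)

for multiplicity in sorted(profile):
    print(multiplicity, profile[multiplicity])
```

On real data, the main peak of such a profile sits near the sequencing coverage, which is what makes estimates of genome size and heterozygosity possible.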

Wednesday. Classes from 2 to 8 pm Berlin time

  • Purging haplotigs: what is that about?

  • Purging assemblies

  • Evaluation of purged assemblies: contiguity (N50, total length), accuracy (Merqury k-mer evaluation), and gene content (BUSCO) analysis

  • Start polishing and/or Hi-C scaffolding
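
To make the contiguity metric mentioned for Wednesday concrete, here is a minimal Python sketch of N50: the contig length at which, going from the longest contig downwards, the running total first reaches half of the assembly length. The function name and the contig lengths are invented for illustration; dedicated assembly-statistics tools report the same value along with many others.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths."""
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("no contig lengths given")


# Hypothetical contig lengths (in bp), for illustration only.
contig_lengths = [100, 80, 50, 30, 20, 10]
print("total length:", sum(contig_lengths))
print("N50:", n50(contig_lengths))
```

Here the total length is 290, and the two longest contigs (100 + 80 = 180) already cover more than half of it, so the N50 is 80.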

Thursday. Classes from 2 to 8 pm Berlin time

  • What is Hi-C sequencing? Theory and practice

  • Hands-on: Running analysis on Hi-C scaffolded assemblies – evaluating final contiguity, accuracy, and gene content

Friday. Classes from 2 to 8 pm Berlin time

  • Finishing up analysis – participants will work in groups on the same data, discuss results, and prepare a final presentation

  • General discussion

  • Final questions

 

Cost overview

Package 1

530 €

 


Cancellation Policy:

 

 

 

> 30 days before the start date = 30% cancellation fee

 

< 30 days before the start date = no refund

 

 

 

Physalia-courses cannot be held responsible for any travel fees, accommodation, or other expenses incurred by you as a result of the cancellation.