31 March-3 April 2025
Due to the COVID-19 outbreak, this course will be held online
In recent years, machine learning and deep learning techniques have been increasingly used in evolutionary studies, because their flexibility and ability to exploit large amounts of data make them
well suited to analyze large and complex genomic datasets. This course will focus on using deep learning, specifically Convolutional Neural Networks (CNNs), to extract information from genetic
data for population genomics and phylogeographic inference. The theoretical background for simulating genetic data and for developing machine learning and deep learning architectures will be
covered, followed by practical examples, in modules structured over four days. On Day 1, participants will learn how to simulate genetic data under competing demographic scenarios and how to use
Approximate Bayesian Computation (ABC) for inference under these scenarios. Day 2 will include an introduction to machine learning and its applications to evolutionary genomics. On Day 3, deep
learning will be introduced and used to compare the demographic scenarios conceived on the previous days. Day 4 will be dedicated to simulating genomic regions with selective sweeps and to using
CNNs to detect such regions in real genomes. The course combines lectures and discussions of key concepts with practical hands-on sessions, contextualised with research case studies.
The course is aimed at graduate students, researchers and professionals interested in genetics, evolution and deep learning who want to develop applications to test explicit demographic
hypotheses and search for selective sweeps. The course will cover general concepts of genetic data simulation and deep learning, as well as a more advanced discussion of their internal
machinery. The examples discussed during the course will span datasets for both model organisms, for which whole genomes are available, and non-model organisms, for which less information is
available.
Monday – Classes from 2 to 8 pm Berlin time
- Introduction to coalescent theory and how to model genetic diversity
- How to choose summary statistics and use them in a simple Approximate Bayesian Computation (ABC) framework
- Practical: building a script to simulate genetic data under competing demographic scenarios and perform a simple ABC analysis.
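To give a flavour of what a simple ABC analysis involves, here is a minimal rejection-sampling sketch. It is not the course material: it assumes a single constant-size population, uses the number of segregating sites as the only summary statistic, and estimates the scaled mutation rate theta under a uniform prior rather than comparing demographic scenarios. All function names and parameter values are hypothetical and chosen for illustration only.

```python
import random


def poisson_draw(rng, mean):
    """Draw from Poisson(mean) by counting unit-rate arrivals in [0, mean)."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed < mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def simulate_segregating_sites(rng, n_samples, theta):
    """Simulate the number of segregating sites under the standard coalescent.

    Time is measured in units of 2N generations, so k lineages coalesce at
    rate k(k-1)/2 and mutations accumulate at rate theta/2 per lineage.
    """
    total_branch_length = 0.0
    for k in range(n_samples, 1, -1):
        waiting_time = rng.expovariate(k * (k - 1) / 2)
        total_branch_length += k * waiting_time
    return poisson_draw(rng, theta / 2 * total_branch_length)


def abc_rejection(rng, observed, n_samples, n_draws, tolerance, prior_max):
    """Keep prior draws of theta whose simulated statistic is close to the data."""
    accepted = []
    for _ in range(n_draws):
        theta = rng.uniform(0.0, prior_max)  # draw from the uniform prior
        simulated = simulate_segregating_sites(rng, n_samples, theta)
        if abs(simulated - observed) <= tolerance:
            accepted.append(theta)
    return accepted


if __name__ == "__main__":
    rng = random.Random(42)
    posterior = abc_rejection(rng, observed=20, n_samples=10,
                              n_draws=20000, tolerance=1, prior_max=20.0)
    print(len(posterior), "accepted draws")
    print("approximate posterior mean of theta:", sum(posterior) / len(posterior))
```

The accepted values approximate the posterior distribution of theta. Comparing demographic scenarios follows the same logic: each scenario is simulated in turn, and the share of accepted simulations coming from each scenario approximates its posterior probability.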
Tuesday – Classes from 2 to 8 pm Berlin time
- A gentle introduction to Machine Learning: supervised vs unsupervised learning, regression and classification
- Simple Machine Learning approaches with summary statistics from genomic data
- Practical: Demographic inference with a simple Machine Learning architecture and summary statistics
Wednesday – Classes from 2 to 8 pm Berlin time
- Understanding the basic CNN architecture for image recognition
- Using CNN to learn directly from genetic data
- Practical: Comparing demographic scenarios with deep learning
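As a rough illustration of why image-recognition architectures can learn directly from genetic data, the sketch below treats a small haplotype alignment as a binary matrix (rows are haplotypes, columns are segregating sites, 1 marks the derived allele) and slides a single convolutional filter over it. The matrix, the kernel values and the function name `conv2d_relu` are made up for this example. A real CNN would learn many such kernels and would be built with a deep learning library rather than plain Python.

```python
def conv2d_relu(matrix, kernel):
    """Valid-mode 2D cross-correlation followed by a ReLU activation."""
    rows, cols = len(matrix), len(matrix[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    output = []
    for i in range(rows - k_rows + 1):
        output_row = []
        for j in range(cols - k_cols + 1):
            # Weighted sum of the window whose top-left corner is (i, j)
            total = sum(matrix[i + a][j + b] * kernel[a][b]
                        for a in range(k_rows) for b in range(k_cols))
            output_row.append(max(0.0, float(total)))  # ReLU
        output.append(output_row)
    return output


# Hypothetical alignment: 4 haplotypes x 6 segregating sites
haplotypes = [
    [0, 1, 1, 0, 0, 1],
    [0, 1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 0, 1, 0],
]

# A 2x2 filter that responds to derived alleles shared by adjacent
# haplotypes at adjacent sites
kernel = [[1, 1],
          [1, 1]]

feature_map = conv2d_relu(haplotypes, kernel)
for row in feature_map:
    print(row)
```

High values in the feature map mark windows where neighbouring haplotypes carry the same derived alleles. This is the kind of local pattern that a trained network can combine across layers to tell scenarios apart.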
Thursday – Classes from 2 to 8 pm Berlin time
- Introduction to approaches for detecting selection
- Recognizing signatures of selection with deep learning
- Practical: simulating genetic data and using CNN to predict whether a given locus is under selection
Should you have any further questions, please send an email to info@physalia-courses.org
Cancellation Policy:
> 30 days before the start date = 30% cancellation fee
< 30 days before the start date = No Refund.
Physalia-courses cannot be held responsible for any travel fees, accommodation or other expenses incurred by you as a result of the cancellation.