2-6 December 2024
To foster international participation, this course will be held online
Phylogenetic inference and divergence-time estimation with genomic data sets
Recent advances in sequencing technology, and the rapid increase in the availability of genetic data, have revolutionized the field of phylogenetics. While genomic data promise unprecedented insights into the evolution of the tree of life, they also pose new challenges that must be addressed to avoid misleading results and to fully leverage the potential of genome-scale data sets. These challenges include the identification of orthologous sequences that are suitable as phylogenetic markers, the selection of appropriate models of sequence evolution, and the detection of gene-tree discordance due to incomplete lineage sorting and introgression. In this workshop we will present theory and exercises for inferring time-calibrated phylogenies from multi-locus genome data sets while accounting for these confounding factors.
The workshop will be delivered over the course of five days. Each day will include an introductory lecture with class discussion of key concepts. The remainder of each day will consist of practical hands-on sessions. These sessions will combine guided exercises, in which you follow along with the instructors as they demonstrate a skill, with individual exercises, in which you apply that skill on your own. During and after each exercise, the interpretation of results will be discussed as a group.
Computing will be done using tools installed on a preconfigured AWS EC2 server. This will allow us to focus on the theory and options of the methods themselves. We will devote short time slots to troubleshooting installation and runtime problems for those who want to use the software on their own computers, but this will not be part of the main workshop.
This workshop is aimed at researchers at the PhD or postdoc level who plan to infer phylogenetic relationships and divergence times from multilocus data and have little or no prior experience in doing so.
Attendees should have basic knowledge of UNIX and will need to use the command line on their laptops. Familiarity with a scripting language such as Python or R will be helpful but is not required.
Monday – Classes from 12 to 6 pm Berlin time
Lecture (12:00-14:00): Basic phylogenetic concepts and phylogenetic inference under maximum likelihood (Paschalia & Tomas)
- Basic concepts in phylogenetics
- Overview of phylogenomics pipeline
- Overview of phylogenetic inference methods
- Substitution models
- Maximum likelihood inference
Lab (14:30-18:00): Sequence alignment/filtering (Paschalia & Tomas)
- Alignment with MAFFT and alignment filtering
- Model selection
- Maximum likelihood phylogenetic inference with RAxML and IQ-TREE
- Likelihood calculation - optional
Tuesday – Classes from 12 to 6 pm Berlin time
Lecture (12:00-14:00): Bayesian inference - MCMC (Mario)
Lab (14:30-18:00): Phylogenetic inference methods (Sandra)
- Bayesian phylogenetic inference - PhyloBayes
Wednesday – Classes from 12 to 6 pm Berlin time
Lecture (12:00-14:00): Introduction to the Multispecies Coalescent Model (Ziheng)
Lab (14:30-18:00): Analysis of genomic data under the MSC (Paschalia & Tomas)
- Species-tree inference with ASTRAL
- Bayesian species-tree inference with BPP
Thursday – Classes from 12 to 6 pm Berlin time
Lecture (12:00-13:00): Models of gene flow (introgression & migration) (Ziheng)
Lab (13:30-18:00): Analysis of genomic data to identify and quantify gene flow (Paschalia & Tomas)
- Gene flow identification with ABBA-BABA
- Gene flow quantification under the MSC-I and MSC-M models in BPP
Friday – Classes from 12 to 6 pm Berlin time
Lecture (12:00-14:00): Divergence-time estimation (Mario)
Lab (14:30-18:00): Divergence-time estimation using MCMCTree (Sandra)
Cancellation Policy:
> 30 days before the start date = 30% cancellation fee
< 30 days before the start date = no refund
Physalia-courses cannot be held responsible for any travel fees, accommodation or other expenses incurred by you as a result of the cancellation.