SDS Colloquium, Speaker Joanna Masel

October 27, 2025, 2:30pm in ENR2 S215

When

2:30 – 3:30 p.m., Oct. 27, 2025

Title: Improving amino acid substitution models

Abstract: 
Substitution models capture rates of evolutionary change, and, together with alignments of multiple protein sequences, are key inputs to both maximum likelihood and Bayesian methods for inferring phylogenetic trees. Sequence alignments are notoriously unreliable, but algorithms that remove likely alignment errors also remove some informative residues, making phylogenetic tree inference worse. We assess and recommend best practices for filtering alignments, both while training substitution models and while doing subsequent inference. This includes developing a new filter, CLOAK (CLeaning On the basis of Alignment C(K)onsensus), which is less stringent than alternatives. We also compare substitution models trained only on amino acids buried within a protein fold to those trained on amino acids at the surface. Finally, we apply non-equilibrium substitution models to ask how amino acid frequencies evolve differently across populations in which natural selection is more vs. less effective.
