Description
The rapid succession of SARS-CoV-2 variants has underscored the importance of tracing the emergence of new subvariants with evolutionary advantages. Based on almost 15 million complete viral sequences from GISAID, we investigated how the individual mutations that define Variants of Concern emerged over time. We found that, rather than accumulating one at a time, key mutations appeared in clusters, leading to the accelerated emergence of mature variant lineages. The timing and combinatorial nature of the mutations that define each variant reveal strong but hidden selective forces at play.
This observation, together with an analysis of the site frequency spectrum (SFS) of the viral genomes, led us to the retrospective discovery of a characteristic pattern in the cumulative tails of the SFS. The pattern includes a discontinuity that can be traced to a set of mutations that retains its identity over a certain time interval. Furthermore, the discontinuity shifts over time toward higher frequencies. Ultimately, subvariants carrying this cluster of mutations come to dominate over the parental variant. This observation suggests that prospective analysis of the SFS tail could be used to identify emerging viral substrains. In addition, we present a mechanistic model that allows a quantitative description of the dynamics of this transient process.
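As a rough illustration of the SFS tail idea, the following minimal sketch computes per-site mutation frequencies and a cumulative tail from a toy presence/absence matrix. It is not the authors' pipeline: the function names (`site_frequencies`, `cumulative_tail`, `largest_step`), the thresholds, and the toy data are all assumptions made for this example. The point it shows is that a cluster of mutations carried by the same genomes shares one frequency, and so produces a step in the cumulative tail.

```python
import numpy as np

def site_frequencies(presence):
    """Fraction of genomes carrying a mutation at each site.

    `presence` is a (genomes x sites) 0/1 matrix; this layout is an assumption.
    """
    presence = np.asarray(presence, dtype=float)
    return presence.mean(axis=0)

def cumulative_tail(freqs, thresholds):
    """Number of sites whose mutation frequency is >= each threshold."""
    freqs = np.asarray(freqs)
    return np.array([int((freqs >= t).sum()) for t in thresholds])

def largest_step(freqs, decimals=6):
    """Return (frequency, number of sites) for the largest group of sites
    that share the same (rounded) frequency."""
    values, counts = np.unique(np.round(freqs, decimals), return_counts=True)
    i = counts.argmax()
    return float(values[i]), int(counts[i])

# Toy data (hypothetical): 10 genomes, 6 sites.
presence = np.zeros((10, 6), dtype=int)
presence[:4, :3] = 1   # sites 0-2: one cluster carried by the same 4 genomes
presence[4, 3] = 1     # site 3: carried by 1 genome
presence[4:6, 4] = 1   # site 4: carried by 2 genomes
presence[4:7, 5] = 1   # site 5: carried by 3 genomes

freqs = site_frequencies(presence)
thresholds = [0.05, 0.15, 0.25, 0.35, 0.45]
tail = cumulative_tail(freqs, thresholds)
step_freq, step_size = largest_step(freqs)
```

In this toy example the tail falls by one site per threshold until it drops by three between 0.35 and 0.45, which is where the three clustered mutations sit. Repeating the calculation on successive time windows would, under the same assumptions, show whether that step moves toward higher frequencies.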