Analysis of 6.4 million SARS-CoV-2 genomes identifies mutations associated with fitness [preprint]
Schaffner, Stephen F.
Pyle, Jesse D.
Park, Daniel J.
MacInnis, Bronwyn L.
Sabeti, Pardis C.
Lemieux, Jacob E.
UMass Chan Affiliations
Program in Molecular Medicine
Genetics and Genomics
Immunology and Infectious Disease
Abstract
Repeated emergence of SARS-CoV-2 variants with increased fitness necessitates rapid detection and characterization of new lineages. To address this need, we developed PyR0, a hierarchical Bayesian multinomial logistic regression model that infers relative prevalence of all viral lineages across geographic regions, detects lineages increasing in prevalence, and identifies mutations relevant to fitness. Applying PyR0 to all publicly available SARS-CoV-2 genomes, we identify numerous substitutions that increase fitness, including previously identified spike mutations and many non-spike mutations within the nucleocapsid and nonstructural proteins. PyR0 forecasts growth of new lineages from their mutational profile, identifies viral lineages of concern as they emerge, and prioritizes mutations of biological and public health concern for functional characterization.
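To make the modeling idea in the abstract concrete, the sketch below shows the core of a multinomial logistic growth model: each lineage's relative prevalence is a softmax over an intercept plus a growth rate times time, and each growth rate is assumed to be the sum of effects of the mutations that lineage carries. This is an illustration only, not the PyR0 implementation. The function name `lineage_prevalence`, the toy lineages, and all numeric values are hypothetical, and the hierarchical priors, regional structure, and Bayesian inference described in the abstract are omitted.

```python
import numpy as np


def lineage_prevalence(intercepts, growth_rates, t):
    """Relative prevalence of each lineage at time t (multinomial logistic form).

    Hypothetical helper: prevalence = softmax(intercept + growth_rate * t).
    """
    logits = intercepts + growth_rates * t
    logits = logits - logits.max()  # subtract the max for numerical stability
    weights = np.exp(logits)
    return weights / weights.sum()


# Assumed toy inputs (not taken from the paper): 3 lineages, 2 mutations.
mutation_effects = np.array([0.10, -0.02])  # per-mutation effect on growth rate
has_mutation = np.array([
    [0, 0],  # lineage 0 carries neither mutation
    [1, 0],  # lineage 1 carries mutation 0
    [1, 1],  # lineage 2 carries both mutations
])

# Assumption: a lineage's growth rate is the sum of its mutations' effects.
growth_rates = has_mutation @ mutation_effects
intercepts = np.array([2.0, 0.0, -1.0])  # initial log-odds offsets

p_start = lineage_prevalence(intercepts, growth_rates, t=0.0)
p_later = lineage_prevalence(intercepts, growth_rates, t=50.0)
print("prevalence at t=0: ", np.round(p_start, 3))
print("prevalence at t=50:", np.round(p_later, 3))
```

In this toy setup the lineage with the largest summed mutation effect overtakes the initially dominant one as time advances, which is the kind of signal a model of this form uses to flag lineages that are increasing in prevalence.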
Obermeyer F, Jankowiak M, Barkas N, Schaffner SF, Pyle JD, Yurkovetskiy L, Bosso M, Park DJ, Babadi M, MacInnis BL, Luban J, Sabeti PC, Lemieux JE. Analysis of 6.4 million SARS-CoV-2 genomes identifies mutations associated with fitness. medRxiv [Preprint]. 2022 Feb 16:2021.09.07.21263228. doi: 10.1101/2021.09.07.21263228. Update in: Science. 2022 May 24;:abm1208. PMID: 35194619; PMCID: PMC8863165.
Permanent Link to this Item
http://hdl.handle.net/20.500.14038/30735
This article is a preprint. Preprints are preliminary reports of work that have not been certified by peer review.