
Dataset Information


Analysis of 6.4 million SARS-CoV-2 genomes identifies mutations associated with fitness.


ABSTRACT: Repeated emergence of SARS-CoV-2 variants with increased fitness necessitates rapid detection and characterization of new lineages. To address this need, we developed PyR₀, a hierarchical Bayesian multinomial logistic regression model that infers relative prevalence of all viral lineages across geographic regions, detects lineages increasing in prevalence, and identifies mutations relevant to fitness. Applying PyR₀ to all publicly available SARS-CoV-2 genomes, we identify numerous substitutions that increase fitness, including previously identified spike mutations and many non-spike mutations within the nucleocapsid and nonstructural proteins. PyR₀ forecasts growth of new lineages from their mutational profile, identifies viral lineages of concern as they emerge, and prioritizes mutations of biological and public health concern for functional characterization.
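
As a rough illustration of the idea behind a multinomial logistic growth model, the sketch below computes lineage prevalence as a softmax over per-lineage log-odds that change linearly in time, with each lineage's growth rate taken as the sum of the effects of the mutations it carries. This is a toy forward calculation under stated assumptions, not the authors' implementation: it omits the hierarchical priors, the regional structure, and the Bayesian inference entirely, and every name and number in it (`lineage_prevalence`, `mutation_effects`, the intercepts) is hypothetical.

```python
import numpy as np


def lineage_prevalence(intercepts, mutation_effects, lineage_mutations, t):
    """Return relative lineage prevalence at time t (toy model).

    intercepts:        (L,) initial log-odds per lineage (hypothetical values)
    mutation_effects:  (M,) per-mutation contribution to growth rate (hypothetical)
    lineage_mutations: (L, M) 0/1 matrix, 1 if the lineage carries the mutation
    t:                 time in arbitrary units
    """
    # Assumption: a lineage's growth rate is the sum of its mutations' effects.
    growth = lineage_mutations @ mutation_effects
    logits = intercepts + growth * t
    logits = logits - logits.max()  # stabilize the softmax numerically
    weights = np.exp(logits)
    return weights / weights.sum()


# Toy data: 3 lineages, 2 mutations. All numbers are made up for illustration.
lineage_mutations = np.array([[0.0, 0.0],
                              [1.0, 0.0],
                              [1.0, 1.0]])
mutation_effects = np.array([0.05, 0.10])
intercepts = np.array([0.0, -3.0, -6.0])

prevalence_by_day = {
    t: lineage_prevalence(intercepts, mutation_effects, lineage_mutations, t)
    for t in (0, 50, 100)
}
for t, p in prevalence_by_day.items():
    print(t, p.round(3))
```

In this toy setup the lineage carrying both mutations starts rare but overtakes the others as time increases, which is the kind of growth signal that a forecast from a mutational profile relies on.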

One sentence summary

A Bayesian hierarchical model of all SARS-CoV-2 viral genomes predicts lineage fitness and identifies associated mutations.

SUBMITTER: Obermeyer F 

PROVIDER: S-EPMC8863165 | biostudies-literature | 2022 Feb

REPOSITORIES: biostudies-literature


Publications

Analysis of 6.4 million SARS-CoV-2 genomes identifies mutations associated with fitness.

Fritz Obermeyer, Martin Jankowiak, Nikolaos Barkas, Stephen F Schaffner, Jesse D Pyle, Lonya Yurkovetskiy, Matteo Bosso, Daniel J Park, Mehrtash Babadi, Bronwyn L MacInnis, Jeremy Luban, Pardis C Sabeti, Jacob E Lemieux

medRxiv: the preprint server for health sciences, 2022-02-16



Similar Datasets

| S-EPMC9161372 | biostudies-literature
| S-EPMC9915511 | biostudies-literature
| S-EPMC10506532 | biostudies-literature
| S-EPMC9603882 | biostudies-literature
| S-EPMC7959702 | biostudies-literature
| S-EPMC8005425 | biostudies-literature
| S-EPMC7375210 | biostudies-literature
| S-EPMC9241832 | biostudies-literature
| S-EPMC9802201 | biostudies-literature
| S-EPMC8959188 | biostudies-literature