Nat Commun. 2021 Feb 16;12(1):780.
doi: 10.1038/s41467-021-21034-5.

Predicting mammalian hosts in which novel coronaviruses can be generated

Maya Wardeh et al. Nat Commun.

Abstract

Novel pathogenic coronaviruses - such as SARS-CoV and probably SARS-CoV-2 - arise by homologous recombination between co-infecting viruses in a single cell. Identifying possible sources of novel coronaviruses therefore requires identifying hosts of multiple coronaviruses; however, most coronavirus-host interactions remain unknown. Here, by deploying a meta-ensemble of similarity learners from three complementary perspectives (viral, mammalian and network), we predict which mammals are hosts of multiple coronaviruses. We predict that there are 11.5-fold more coronavirus-host associations, over 30-fold more potential SARS-CoV-2 recombination hosts, and over 40-fold more host species with four or more different subgenera of coronaviruses than have been observed to date at >0.5 mean probability cut-off (2.4-, 4.25- and 9-fold, respectively, at >0.9821). Our results demonstrate the large underappreciation of the potential scale of novel coronavirus generation in wild and domesticated animals. We identify high-risk species for coronavirus surveillance.

Conflict of interest statement

The authors declare no competing interests.

Figures

Fig. 1. Model predictions for potential hosts of SARS-CoV-2.
Predicted hosts are grouped by order (inner circle). The middle circle presents the probability of association between each host and SARS-CoV-2: grey scale indicates predicted associations with probability in the range >0.5 to ≤0.75, red scale indicates the range >0.75 to <0.9821, and blue-to-purple scale indicates probability ≥0.9821. Yellow bars represent the number of coronaviruses (species or strains) observed in each host. Blue stacked bars represent other coronaviruses predicted by our model to be found in each host. Predicted coronaviruses per host are grouped by prediction probability into three categories (from inside to outside): ≥0.9821, >0.75 to <0.9821 and >0.5 to ≤0.75. Results for humans and lab rodents are not shown, because including them would compress the scale and make other comparisons difficult. Supplementary Fig. 14 illustrates full results including these hosts. Full results are listed in Supplementary Data 1.
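The three probability bands in this caption can be sketched as a small lookup. This is an illustration only, not the authors' code: the function name `probability_band` and the band labels are hypothetical, and only the thresholds (0.5, 0.75, 0.9821) come from the caption.

```python
def probability_band(p: float) -> str:
    """Map a mean association probability to the caption's colour band.

    Thresholds follow the Fig. 1 caption; the labels are illustrative.
    """
    if p >= 0.9821:
        return "blue-purple"  # probability >= 0.9821
    if p > 0.75:
        return "red"          # >0.75 to <0.9821
    if p > 0.5:
        return "grey"         # >0.5 to <=0.75
    return "none"             # at or below the 0.5 cut-off: not predicted


# Made-up probabilities, chosen to exercise the band edges.
for value in (0.5, 0.75, 0.76, 0.9821):
    print(value, probability_band(value))
```

Note that 0.75 itself falls in the grey band and 0.9821 itself in the blue-to-purple band, matching the inclusive and strict bounds stated in the caption.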
Fig. 2. Observed and predicted mammalian hosts for coronaviruses.
Columns present mammalian hosts in five categories: Artiodactyla and Perissodactyla (top 10 hosts by number of predicted coronaviruses that could be found in each host), Carnivora (top 15 hosts), Chiroptera (top 15 hosts), Rodentia (top 5 hosts) and others (top 5 hosts). Rows present viruses ordered into five taxonomic groups: alphacoronaviruses, betacoronaviruses, deltacoronaviruses, gammacoronaviruses and unclassified Coronavirinae. Yellow cells represent observed associations between the host and the coronavirus. Grey/red/blue cells indicate the probability of predicted associations in three increasing probability ranges. White cells indicate no known or predicted association between host and virus (below the cut-off probability of 0.5). Supplementary Data 4 lists full results. These results exclude humans and lab rodents. Supplementary Data 5 lists predictions for humans. Supplementary Fig. 15 illustrates full results including these hosts.
Fig. 3. Bipartite networks linking coronaviruses with mammalian hosts.
Panel (A): original bipartite network based on known/observed virus–host associations extracted from meta-data accompanying genomic sequences and supplemented with publications data from the ENHanCEd Infectious Diseases Database (EID2). Panels (B–D) show predicted bipartite networks using our predicted virus–host associations at different cut-offs: ≥0.9821, >0.75 and >0.5, respectively, for mean probability of associations.
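As a rough sketch of how one set of predicted associations yields three nested networks, the cut-offs in this caption can be applied as edge filters. This is not the authors' pipeline: the names `edges_at_cutoff` and `CUTOFFS`, the virus and host labels, and all probabilities below are invented for illustration; only the three cut-offs and their inclusive or strict bounds come from the caption.

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]  # (virus, host)

# Cut-offs from the Fig. 3 caption: the first is inclusive (>=), the others strict (>).
CUTOFFS = [(0.9821, True), (0.75, False), (0.5, False)]


def edges_at_cutoff(mean_prob: Dict[Edge, float], cutoff: float, inclusive: bool) -> List[Edge]:
    """Return the virus-host edges whose mean probability passes the cut-off."""
    if inclusive:
        return sorted(edge for edge, p in mean_prob.items() if p >= cutoff)
    return sorted(edge for edge, p in mean_prob.items() if p > cutoff)


# Hypothetical mean probabilities, not results from the paper.
example = {
    ("virus_A", "host_1"): 0.99,
    ("virus_A", "host_2"): 0.80,
    ("virus_B", "host_2"): 0.75,
    ("virus_B", "host_3"): 0.40,
}

# One edge list per predicted panel (B, C, D).
panels = {
    label: edges_at_cutoff(example, cutoff, inclusive)
    for label, (cutoff, inclusive) in zip("BCD", CUTOFFS)
}
for label, edges in panels.items():
    print(label, len(edges), edges)
```

Each lower cut-off keeps every edge of the stricter panel and adds more, which is why the predicted networks grow from panel B to panel D.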
