chronic_infection_python

Overview

This application was developed by the Computational Analysis, Modelling and Evolutionary Outcomes (CAMEO) pillar of Canada's Coronavirus Variants Rapid Response Network (CoVaRR-Net). Data analysis, code and maintenance of the application are conducted by Erin E. Gill, Fiona S.L. Brinkman, and Sarah Otto.

Background

This application draws from work conducted by Harari et al. (2022) and Feng et al. (2023).

In the first paper, the authors demonstrate that lineage-defining mutations in SARS-CoV-2 genomes sequenced from chronic infections follow specific patterns, distinct from those of mutations in genomes sequenced around the globe at the start of the pandemic, before the rise of Variants of Concern (VOCs). They also analyzed lineage-defining mutation patterns in VOCs and concluded that “mutations in chronic infections are predictive of lineage-defining mutations of VOCs”.

Feng et al. sequenced hundreds of SARS-CoV-2 samples obtained from white-tailed deer in the United States. They observed Alpha, Gamma, Delta and Omicron VOCs and determined that the deer infections arose from a minimum of 109 separate transmission events from humans. In addition, the deer were then able to transmit the virus to each other. Deer infections resulted in three documented human zoonoses. The SARS-CoV-2 virus displayed specific adaptation patterns in deer, which differ from adaptations seen in humans.

Application Use

This application accepts a comma-separated list of nucleotide positions in a SARS-CoV-2 genome where lineage-defining mutations occur. A list of lineage-defining mutations for pangolin-designated SARS-CoV-2 lineages can be found here. Using likelihood calculations, the application determines which of three mutation distributions (chronic, deer, or global pre-VOC) best fits your list, and displays the log likelihood of your list under each distribution. Mutations are counted in bins of a user-selected type (genes with the spike protein split, genes, 500 nt genome windows, or 1000 nt genome windows), and the log likelihood is calculated as follows:

sum(log(((distribution bin counts + 1) / sum(distribution bin counts + 1)) ^ user bin counts))
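
The formula above can be sketched in Python as follows. This is a minimal illustration, not the application's actual code: the function names `log_likelihood` and `best_fit` are hypothetical, and it assumes the distribution counts and the user counts are lists covering the same bins in the same order. It uses the identity log(p ^ n) = n * log(p), which gives the same value as the formula while avoiding very small intermediate numbers.

```python
import math


def log_likelihood(distribution_counts, user_counts):
    """Log likelihood of the user's binned mutations under one distribution.

    Hypothetical helper: both arguments are per-bin counts in the same order.
    """
    if len(distribution_counts) != len(user_counts):
        raise ValueError("both count lists must cover the same bins")
    # Add 1 to every bin so that an empty bin never yields log(0).
    smoothed = [count + 1 for count in distribution_counts]
    total = sum(smoothed)
    # log((s / total) ** u) is rewritten as u * log(s / total).
    return sum(u * math.log(s / total) for s, u in zip(smoothed, user_counts))


def best_fit(distributions, user_counts):
    """Return the best-fitting distribution name and all scores.

    `distributions` maps a name (e.g. "chronic") to its per-bin counts.
    """
    scores = {
        name: log_likelihood(counts, user_counts)
        for name, counts in distributions.items()
    }
    return max(scores, key=scores.get), scores
```

The distribution with the highest (least negative) log likelihood is the best fit. Any bin counts passed to these functions in an example would be made up; the real chronic, deer and global counts come from the application's own data.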

Notes on Input

Your list can be formatted with or without nucleotide abbreviations, e.g. C897A, G3431T, A7842G, C8293T,... OR 897, 3431, 7842, 8293,....
Do NOT include insertions or deletions (indels), e.g. ins21608TCATGCCGCTGT, ∆23009-23011.
If you have an unaligned SARS-CoV-2 genome sequence and would like to use this tool, you must first place it into a phylogeny so that you can detect lineage-defining mutations. To get started, you may wish to access the tools associated with the UCSC SARS-CoV-2 Genome Browser.
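
The two accepted input formats described in these notes can be reduced to plain positions with a short parser. The sketch below is an illustration only: `parse_positions` is a hypothetical name, and raising an error on indels is an assumption made here, since the notes do not say how the application itself reacts to them.

```python
import re

# An optional reference base, the position, and an optional alternate base.
# Indel notations such as "ins21608TCATGCCGCTGT" do not match this pattern.
_SUBSTITUTION = re.compile(r"^[ACGT]?(\d+)[ACGT]?$", re.IGNORECASE)


def parse_positions(text):
    """Convert 'C897A, G3431T' or '897, 3431' into a list of integers."""
    positions = []
    for token in text.split(","):
        token = token.strip()
        if not token:
            continue  # ignore empty items, e.g. after a trailing comma
        match = _SUBSTITUTION.match(token)
        if match is None:
            # Assumption: reject anything that is not a single substitution.
            raise ValueError(f"not a single-nucleotide substitution: {token!r}")
        positions.append(int(match.group(1)))
    return positions
```

Both `parse_positions("C897A, G3431T")` and `parse_positions("897, 3431")` yield the same list of positions, so the rest of a pipeline would not need to know which format was used.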

Feedback

We're pleased to accept any feedback you have. You can submit an issue in the GitHub repository here, email questions, comments or suggestions to erin.gill81(at)gmail.com, or leave comments in the Discussions tab.
