Viral co-infection strains decomposition
4.4 years ago

Hello everyone, I have BAM files of SARS-CoV-2 reads, and interestingly many variant positions show two or three large allele fractions (e.g. C->T at 40% and C->G at 45% at the same position). I am not an expert in virology and wonder where this comes from. The only explanation I have is the "multiple strains hypothesis": either co-infection with multiple viral strains, or strains that arose and co-developed inside the host (or both?). If so, it would be great to separate these strains in silico, i.e. to go from one BAM file to a FASTA of the first putative strain in the host, a FASTA of the second putative strain, and so on. Does anybody know of algorithms, software, or publications that address this?

rna-seq SNP next-gen virus SARS-CoV-2 • 915 views

Were you able to find a method to use for your work? I am curious as I have encountered the same issues myself.

4.4 years ago
psjalma ▴ 10

That is interesting. I am not an expert on the computational aspects of your question. However, if the read proportions of the two (or more) genotypes differ consistently and by a clear margin (for example, strain #1 with ~60% of reads and strain #2 with ~40%), it should be possible to classify the variant positions reliably. Such positions can be placed in two columns by sorting the VCF file with an Excel formula, i.e. separating the bases whose proportion falls in the 20-45% range from those in the 55-80% range. In one of our papers, my bioinformatician colleagues did exactly this to differentiate two genotypes after experimental inoculation; you can refer to it for details (I have since moved to another institute): https://pubmed.ncbi.nlm.nih.gov/29665434/

SNPs present in both strains would appear in over 95% of the reads and can be considered common to both (or all) strains.
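The frequency-binning idea above can be sketched in a few lines of Python. This is only a minimal illustration, not the pipeline used in the linked paper: the names `Variant`, `classify` and `build_consensus` are hypothetical, the thresholds are simply the 20-45%, 55-80% and >95% ranges mentioned above, only single-base substitutions are handled, and the allele fractions are assumed to have been extracted from the VCF beforehand.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    pos: int    # 1-based position on the reference
    ref: str    # reference base
    alt: str    # alternate base (single-base substitutions only)
    af: float   # alternate allele fraction, between 0 and 1


def classify(af, minor=(0.20, 0.45), major=(0.55, 0.80), shared_min=0.95):
    """Assign a variant to a bin based on its allele fraction alone."""
    if af >= shared_min:
        return "shared"      # present in (nearly) all reads
    if major[0] <= af <= major[1]:
        return "major"       # putative dominant strain
    if minor[0] <= af <= minor[1]:
        return "minor"       # putative secondary strain
    return "ambiguous"       # falls outside every bin; left as reference


def build_consensus(reference, variants, strain):
    """Apply shared variants plus those binned to `strain` onto the reference."""
    seq = list(reference)
    for v in variants:
        label = classify(v.af)
        if label == "shared" or label == strain:
            seq[v.pos - 1] = v.alt
    return "".join(seq)


if __name__ == "__main__":
    # Toy reference and toy allele fractions, made up purely for illustration.
    reference = "ACGTACGTAC"
    variants = [
        Variant(2, "C", "T", 0.60),
        Variant(5, "A", "G", 0.38),
        Variant(8, "T", "C", 0.97),
        Variant(10, "C", "A", 0.50),
    ]
    for strain in ("major", "minor"):
        print(f">{strain}_strain")
        print(build_consensus(reference, variants, strain))
```

Note that this uses allele frequency only and ignores read-level linkage between positions, so it can only work when the strain proportions are clearly different and stable along the genome; positions near 50% stay unresolved.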

I am myself looking for a way to analyse sequencing data from a ~1000 bp amplicon, where we want to see whether mixed genotypes are present. I would appreciate any suggestions on this.

Best wishes and regards, and I hope all of you remain healthy and safe wherever you are.

Pushpendra
