5.0 years ago
restless.v2
Dear all, I am working with ultra-deep sequencing to search for viruses in an environmental sample. What percentage of total reads mapped is considered good in this kind of analysis?
To explain further: in RNA-Seq, 70% of total reads mapped is considered a sign of a good alignment. In my current analysis of a virus amplicon, only 1.79% of total reads are mapped. The consensus sequence is perfect.
My concern is whether roughly 98% of the reads really are junk or contaminants in this kind of environmental sample. What do you think?
By definition, environmental samples are going to contain things that may never have been seen before, so there is no % value that one could use as a good/bad metric.
In general, 70% of reads mapping would be an acceptable alignment against a known genome, but one would want to go higher.
Is that 1.79% value for a single virus or for all known viral genomes in NCBI? If the former, then that is probably all you have. If the latter, then you have a very small fraction of reads mapping to known viral sequence.
It's for a single virus. Thank you.
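To make the single-virus versus all-viral distinction concrete, here is a minimal sketch of the arithmetic. The read counts and reference names are hypothetical placeholders, not values from this thread; in practice they would come from your aligner's per-reference summary.

```python
def percent_mapped(mapped_reads: int, total_reads: int) -> float:
    """Return mapped reads as a percentage of all sequenced reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return 100.0 * mapped_reads / total_reads


# Hypothetical counts, chosen only to illustrate the calculation.
total_reads = 1_000_000
mapped_per_reference = {
    "target_virus": 17_900,        # the single reference of interest
    "other_viral_genomes": 2_100,  # everything else in a viral database
}

single_virus_pct = percent_mapped(mapped_per_reference["target_virus"], total_reads)
all_viral_pct = percent_mapped(sum(mapped_per_reference.values()), total_reads)

print(f"single virus: {single_virus_pct:.2f}%")
print(f"all viral references: {all_viral_pct:.2f}%")
```

The same percentage means different things depending on which denominator and which reference set you use, so it helps to report both.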
What kind of experiment are you doing here? Are you trying to enrich this specific virus?
The virus signal is already enriched by PCR for the amplicon of interest (HEV ORF2), hence my concern about that 1.79%. In any case, the target amplicon was not visible on the gel after PCR, which led to the hypothesis of a very low copy number and to the NGS investigation. Could 1.79% be consistent with a signal so low that it fails to show on a gel?
I think it depends... Primer design is crucial in this kind of experiment. If you designed a primer specifically for one type of virus and used that same virus as the reference, it sounds low.
Maybe it is an option to do some OTU clustering on your data and BLAST some of the highly abundant OTUs. It can give you some insight into the unmapped reads.
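As a rough first pass before proper OTU clustering, you could simply dereplicate the unmapped reads and BLAST the most abundant exact sequences. The sketch below does only exact-match counting, which is a much cruder stand-in for OTU clustering (no similarity threshold, no chimera removal). The function names and the tiny in-memory FASTQ are hypothetical, for illustration only.

```python
from collections import Counter
from typing import Iterable, Iterator, List, Tuple


def fastq_sequences(lines: Iterable[str]) -> Iterator[str]:
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip().upper()


def top_abundant(seqs: Iterable[str], n: int = 10) -> List[Tuple[str, int]]:
    """Count identical sequences and return the n most frequent."""
    return Counter(seqs).most_common(n)


def to_fasta(ranked: List[Tuple[str, int]]) -> str:
    """Format ranked sequences as FASTA, with the count in the header."""
    records = []
    for rank, (seq, count) in enumerate(ranked, start=1):
        records.append(f">unmapped_{rank};size={count}\n{seq}")
    return "\n".join(records)


# Hypothetical unmapped reads (assumption: plain 4-line FASTQ records).
example_fastq = [
    "@r1", "ACGTACGT", "+", "IIIIIIII",
    "@r2", "ACGTACGT", "+", "IIIIIIII",
    "@r3", "TTTTGGGG", "+", "IIIIIIII",
    "@r4", "acgtacgt", "+", "IIIIIIII",
]

ranked = top_abundant(fastq_sequences(example_fastq), n=2)
fasta_text = to_fasta(ranked)
print(fasta_text)
```

The resulting FASTA can be pasted into a BLAST search. For real data you would read the lines from the unmapped-reads file instead of a list.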