7.0 years ago
manekineko
Hi, I'm trying to use LoFreq on viral data (BAM). I know there is a low-frequency SNP at a particular position: for example, 30,000 reads carry the original nucleotide and 1,500 reads carry the SNP, which gives a variant frequency of roughly 5%, yet LoFreq reports no SNP. Can someone with more experience with viral data suggest options for running the tool?
Are you sure those reads are of reasonable quality? How did you determine that there are 1,500 of 30,000 variant reads?
I'm exploring this in IGV. I have also checked the Phred quality of the sample, and the plot shows quality above 35.
The base quality or the mapping quality? Do the actual reads with the variant look okay?
I opened the BAM in Geneious and may have found the reason the SNPs disappear in some tools. For example, one of the SNPs is supported mostly by reverse reads, so if strand-bias filtering is enabled, that SNP is probably thrown out. In my case, though, I don't know which option to use, as the sample is from a single-stranded RNA virus.
Looks like you have a strand bias issue:
From: https://sourceforge.net/p/lofreq/discussion/general/thread/ee151ab0/
It shouldn't matter that it's from stranded RNA-seq. The variant frequency should not be different depending on strand. If most reads are one strand, most variant reads should be on the same strand.
You should check some higher frequency variants and see what the reported strand bias is for those.
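To make that comparison concrete, here is a minimal sketch of a strand-bias check: a two-sided Fisher's exact test on a 2x2 table of reference/variant reads by forward/reverse strand, reported on a Phred scale. I'm assuming this is roughly what the SB value in LoFreq's output represents. The function names and the read counts are hypothetical, not taken from the sample in this thread.

```python
from math import comb, log10


def fisher_exact_two_sided(ref_fwd, ref_rev, alt_fwd, alt_rev):
    """Two-sided Fisher's exact test on the table
    [[ref_fwd, ref_rev], [alt_fwd, alt_rev]]."""
    row_ref = ref_fwd + ref_rev
    row_alt = alt_fwd + alt_rev
    col_fwd = ref_fwd + alt_fwd
    total = row_ref + row_alt
    denom = comb(total, col_fwd)

    def prob(x):
        # Hypergeometric probability of x forward reads among the reference reads
        return comb(row_ref, x) * comb(row_alt, col_fwd - x) / denom

    p_obs = prob(ref_fwd)
    lo = max(0, col_fwd - row_alt)
    hi = min(row_ref, col_fwd)
    # Sum every table that is at most as likely as the observed one
    p = sum(q for q in (prob(x) for x in range(lo, hi + 1))
            if q <= p_obs * (1 + 1e-7))
    return min(1.0, p)


def phred(p):
    """Phred-scale a p-value; larger means stronger evidence of bias."""
    if p <= 0.0:
        return float("inf")
    return -10.0 * log10(p)


# Hypothetical counts: reference reads balanced, variant reads mostly reverse
p_value = fisher_exact_two_sided(150, 150, 1, 14)
print(f"p = {p_value:.3g}, Phred-scaled = {phred(p_value):.1f}")
```

If the variant reads follow the same strand split as the reference reads, the p-value stays near 1 and the Phred score near 0, even when nearly all reads come from one strand.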
The strange thing is that even with the option "Disable use of base-alignment quality (BAQ)", LoFreq does not report any SNPs.
You could try without the automatic lofreq filtering:
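As a minimal sketch of what that might look like, assuming LoFreq 2's `lofreq call` subcommand with its `--no-default-filter` flag (and `-B` for disabling BAQ), here is a small Python helper that assembles the command line. The helper name and the file names are hypothetical; check `lofreq call` without arguments to confirm the flags for your installed version.

```python
import shlex


def build_lofreq_call(ref_fasta, bam, out_vcf, default_filter=True, use_baq=True):
    """Assemble a `lofreq call` command line as a list of arguments."""
    cmd = ["lofreq", "call", "-f", ref_fasta, "-o", out_vcf]
    if not default_filter:
        # Assumed flag: skip the automatic filtering after variant calling
        cmd.append("--no-default-filter")
    if not use_baq:
        # Assumed flag: disable base-alignment quality (BAQ)
        cmd.append("-B")
    cmd.append(bam)
    return cmd


# Hypothetical file names
cmd = build_lofreq_call("ref.fa", "sample.bam", "unfiltered.vcf",
                        default_filter=False)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

If the SNP shows up in the unfiltered output, its strand-bias annotation should indicate whether the default filter was what removed it, and stricter or looser filtering can then be applied separately with `lofreq filter`.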