5.6 years ago
dsklsdoiwld
This could be a tricky question. If my only goal is to call SNPs, is there any way to have BWA generate only a small BAM file in which every alignment contains at least one potential variant? In other words, the majority of alignments, which match the reference perfectly, would be discarded.
More generally, my question is: how can we bypass the "useless" perfectly matching alignments, so that only the small set of useful, imperfectly matching alignments is used? That way we could optimize the whole pipeline.
Does BWA provide such a feature? Thanks.
Can you elaborate on which organism you are working with? For diploid organisms, for example, this approach wouldn't work at all, because without the reads that match the reference you cannot distinguish heterozygous from homozygous variants.
I also think this method would heavily enrich for low-quality reads, leading to spurious variant calls. Had the good reads been kept, those variants would not have been called.
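To illustrate the heterozygous vs. homozygous point, here is a minimal Python sketch. The function `call_genotype`, its allele-fraction thresholds, and the read counts are all hypothetical and chosen only for illustration; real variant callers use probabilistic models rather than fixed cutoffs.

```python
def call_genotype(ref_reads, alt_reads, het_low=0.2, het_high=0.8):
    """Naive genotype call at one site from read counts.

    The 0.2 / 0.8 cutoffs are illustrative assumptions, not values
    taken from any real variant caller.
    """
    depth = ref_reads + alt_reads
    if depth == 0:
        return "no-call"
    alt_fraction = alt_reads / depth
    if alt_fraction < het_low:
        return "hom-ref"
    if alt_fraction > het_high:
        return "hom-alt"
    return "het"


# Hypothetical heterozygous site at 30x coverage:
# half the reads match the reference, half carry the variant.
with_all_reads = call_genotype(ref_reads=15, alt_reads=15)

# Same site after discarding every read that matches the reference perfectly.
after_filtering = call_genotype(ref_reads=0, alt_reads=15)

print(with_all_reads, after_filtering)
```

With the reference-matching reads removed, the alternate allele fraction at the site becomes 1.0, so a truly heterozygous site is indistinguishable from a homozygous one. The same loss of context applies to sequencing errors: without the matching reads there is nothing to outvote them.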
Yes, you are right. Heterozygous vs. homozygous calls won't work. And I'm working with the human reference.
There are alignment-free variant calling methods that use k-mers. I've never tried any, but they may be worth a try.
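As a rough idea of what a k-mer based approach looks at, here is a toy Python sketch. It is not the algorithm of any particular tool: the function names, the sequences, and the `min_count` threshold are all made up for illustration, and it ignores reverse complements, base qualities, and genome scale.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def novel_kmers(reads, reference, k, min_count=2):
    """Count read k-mers that are absent from the reference.

    k-mers seen fewer than `min_count` times are dropped as likely
    sequencing errors (an illustrative assumption, not a tuned value).
    """
    ref_set = set(kmers(reference, k))
    counts = Counter(km for read in reads for km in kmers(read, k))
    return {km: n for km, n in counts.items()
            if km not in ref_set and n >= min_count}


# Hypothetical toy data
reference = "ACGTTGCAAGCT"
reads = [
    "ACGTTGCA",  # matches the reference
    "GTTGGAAG",  # carries a C>G substitution
    "GTTGGAAG",  # second read supporting the same substitution
    "ACGATGCA",  # single read with a probable sequencing error
]

candidates = novel_kmers(reads, reference, k=4)
print(sorted(candidates))
```

Only k-mers that span the substitution and are supported by at least two reads survive. The k-mers from the single erroneous read are filtered out by the count threshold.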