Hello! I'm trying to get mutations using SAMtools from paired-end reads aligned with BWA. Coverage is about 16,000x. Generally it works fine, but in one fragment (TGGGC) I see a deletion of one G (TGGC) in 12,000 out of 14,000 reads sequenced from left to right, while reads sequenced from right to left do not show this mutation at all. So is this a heterozygous deletion, a sequencing problem (sequencing was carried out on an Ion Torrent PGM), or an alignment problem?
Do all the reads that show the deletion begin at the exact same coordinate? Do they otherwise match the reference perfectly? Do the positions of the mate pairs make sense? When you eyeball the alignment in IGV, is coverage even in the surrounding region?
A true heterozygous variant should be seen in reads going in both directions and in reads that start at different coordinates. If you don't have that, you are likely looking at an artifact.
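If you want to put a number on the "both directions" check, one common approach is a Fisher's exact test on the 2x2 table of strand versus allele. Here is a rough sketch using only the Python standard library. The forward-strand counts are taken from the question (12,000 of 14,000 reads with the deletion); the reverse-strand total of 2,000 is only an assumed placeholder, since the question says no reverse reads carry the deletion but does not give their number. The function names are made up for this example.

```python
import math


def strand_fractions(fwd_alt, fwd_ref, rev_alt, rev_ref):
    """Return the fraction of variant-supporting reads on each strand."""
    fwd_total = fwd_alt + fwd_ref
    rev_total = rev_alt + rev_ref
    fwd_frac = fwd_alt / fwd_total if fwd_total else 0.0
    rev_frac = rev_alt / rev_total if rev_total else 0.0
    return fwd_frac, rev_frac


def _log_table_prob(a, b, c, d):
    """Log hypergeometric probability of the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    return (
        math.lgamma(a + b + 1) + math.lgamma(c + d + 1)
        + math.lgamma(a + c + 1) + math.lgamma(b + d + 1)
        - math.lgamma(n + 1)
        - math.lgamma(a + 1) - math.lgamma(b + 1)
        - math.lgamma(c + 1) - math.lgamma(d + 1)
    )


def fisher_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    observed = _log_table_prob(a, b, c, d)
    row1, row2, col1 = a + b, c + d, a + c
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    total = 0.0
    for x in range(lowest, highest + 1):
        log_p = _log_table_prob(x, row1 - x, col1 - x, row2 - (col1 - x))
        if log_p <= observed + 1e-7:  # small tolerance for rounding
            total += math.exp(log_p)
    return min(total, 1.0)


# Forward counts from the question; reverse total (2,000) is an assumption.
fwd_alt, fwd_ref = 12000, 2000
rev_alt, rev_ref = 0, 2000

fwd_frac, rev_frac = strand_fractions(fwd_alt, fwd_ref, rev_alt, rev_ref)
p_value = fisher_two_sided(fwd_alt, fwd_ref, rev_alt, rev_ref)
print(f"forward: {fwd_frac:.3f}  reverse: {rev_frac:.3f}  p = {p_value:.3g}")
```

A very small p-value here means the deletion is distributed unevenly between the strands, which points toward an artifact rather than a real variant. With counts this lopsided the test is hardly needed, but it is handy for the borderline sites.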
The reads showing the deletion start at different positions. Coverage also varies, from 16k to 16.3k. There are other mutations too, but the majority of them are deletions/insertions of C. Does that mean the PGM has a bias toward ignoring C?