Identifying mutations from Paired-End Sequencing data
10.6 years ago
Max Ivon ▴ 140

Hello! I'm trying to call mutations from paired-end reads aligned with BWA, using SAMtools. Coverage is about 16,000x. Generally it works fine, but in one fragment (TGGGC) I see that reads sequenced from left to right show a deletion of one G (TGGC) in 12,000 out of 14,000 cases, while reads sequenced from right to left do not show this mutation at all. So is this a heterozygous deletion, a sequencing problem (the run was carried out on an Ion Torrent PGM), or an alignment problem?

Mutation sequencing Samtools Paired-end • 2.4k views

Do all the reads that show the deletion begin at the exact same coordinate? Do they otherwise match the reference perfectly? Do the positions of the mate pairs make sense? When you eyeball the alignment in IGV, is coverage even in the surrounding region?

A true heterozygous variant should be seen in reads going in both directions and in reads that start at different coordinates. If you don't have that, you are likely looking at an artifact.
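One way to check this outside IGV is to tally, per strand, how many reads covering the site carry the deletion and how many distinct start coordinates those reads have. Below is a minimal Python sketch that works on plain SAM text lines (for example the output of `samtools view`). The function name `deletion_support`, the `site` parameter, and the toy records are made up for illustration, and the sketch assumes the aligner reports the deletion at a single reference position.

```python
import re
from collections import Counter

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")
REF_CONSUMING = set("MDN=X")  # CIGAR operations that advance along the reference


def deletion_support(sam_lines, site):
    """Tally reads covering 1-based reference position `site`.

    Returns (counts, del_starts):
      counts[(strand, state)] -> number of reads, state is "del" or "no_del"
      del_starts[strand]      -> set of start coordinates of deletion-carrying reads
    """
    counts = Counter()
    del_starts = {"+": set(), "-": set()}
    for line in sam_lines:
        if line.startswith("@"):  # skip SAM header lines
            continue
        fields = line.rstrip("\n").split("\t")
        flag, pos, cigar = int(fields[1]), int(fields[3]), fields[5]
        if flag & 0x4 or cigar == "*":  # unmapped read or no alignment
            continue
        strand = "-" if flag & 0x10 else "+"  # FLAG bit 0x10: reverse strand
        ref_pos = pos
        state = None
        for length, op in CIGAR_RE.findall(cigar):
            length = int(length)
            if op not in REF_CONSUMING:
                continue  # I, S, H, P do not move along the reference
            if ref_pos <= site < ref_pos + length:
                if op == "D":
                    state = "del"
                elif op in "M=X":
                    state = "no_del"
                break
            ref_pos += length
        if state is None:  # read does not cover the site
            continue
        counts[(strand, state)] += 1
        if state == "del":
            del_starts[strand].add(pos)
    return counts, del_starts


# Toy records (hypothetical): QNAME, FLAG, RNAME, POS, MAPQ, CIGAR
toy = [
    "r1\t0\tchr1\t100\t60\t5M1D5M",
    "r2\t0\tchr1\t98\t60\t7M1D3M",
    "r3\t16\tchr1\t101\t60\t10M",
]
counts, del_starts = deletion_support(toy, site=105)
for (strand, state), n in sorted(counts.items()):
    print(strand, state, n)
print({s: len(v) for s, v in del_starts.items()})
```

If the deletion is counted on only one strand, or all deletion-carrying reads share one start coordinate, that points toward an artifact rather than a real variant. Note that inside a homopolymer the aligner may place the deletion at any position of the run, so `site` has to match where the aligner actually put it.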


Reads showing the deletion start at different positions. Coverage also differs, from 16k to 16.3k. There are other mutations as well, but the majority of them are deletions/insertions of C. Does that mean that the PGM has a bias to ignore C?

10.6 years ago
donfreed ★ 1.6k

You can BLAT the reads if you are worried about the alignment.

My initial thought is that this is a sequencing error. Homopolymers are known to be problematic. The large degree of strand bias is also an indication of a sequencing error.
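To see why the fragment in question is a homopolymer case, here is a small Python sketch. The helper names `homopolymer_runs` and `single_base_deletions` are hypothetical, chosen just for this illustration. It lists the runs of identical bases in a sequence and shows which single-base deletions produce the same resulting sequence.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=2):
    """Return (start, base, length) for each run of identical bases
    of at least `min_len`; `start` is 0-based."""
    runs = []
    start = 0
    for base, group in groupby(seq):
        length = len(list(group))
        if length >= min_len:
            runs.append((start, base, length))
        start += length
    return runs


def single_base_deletions(seq):
    """Map each sequence obtainable by deleting one base to the
    0-based positions whose deletion yields it."""
    out = {}
    for i in range(len(seq)):
        out.setdefault(seq[:i] + seq[i + 1:], []).append(i)
    return out


print(homopolymer_runs("TGGGC"))               # [(1, 'G', 3)]
print(single_base_deletions("TGGGC")["TGGC"])  # [1, 2, 3]
```

The fragment TGGGC contains a run of three G's, and deleting any one of them gives the same TGGC. So the observed "deletion of G" is a homopolymer length miscount (three G's read as two), which is the kind of error described above.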
