Bioscope Hard Clip Vs Mismatch
13.9 years ago
Dave ▴ 120

Does anyone know how Bioscope decides between hard clipping a base and calling a mismatch?

For example:

1781_585_1567 0 validated 586867 54 34M1H * 0
0 ==================================
IIIIIIIIIIII:EI<FIIDDIIIIIII@EIIII RG:Z:20110112194907919
NH:i:1 CM:i:0 CQ:Z:<BBAB>A=B?BB4'?2+<@:+:@A@=</<%A4?>?
CS:Z:T11212021130133312020233301002211032 NM:i:0 MD:Z:34

Why is the above read not reported as a 35 bp alignment with 1 mismatch?

Thanks!

solid short aligner sam • 2.6k views
13.9 years ago

I think it comes from the way the algorithm scores the pairwise alignment. Note the definition of the operation: "The hard clipping operation H indicates that the clipped sequence is not present in the sequence field".
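
To make the quoted definition concrete, here is a minimal sketch that reads the CIGAR string from the question (34M1H) and separates the bases stored in the sequence field from the hard-clipped ones. The function name cigar_lengths is made up for this illustration and is not part of Bioscope or any SAM library.

```python
import re

def cigar_lengths(cigar):
    """Return (bases stored in SEQ, original read length) for a CIGAR string."""
    ops = re.findall(r"(\d+)([MIDNSHP=X])", cigar)
    # M, I, S, = and X consume bases that are stored in the SEQ field.
    in_seq = sum(int(n) for n, op in ops if op in "MIS=X")
    # Hard-clipped bases belong to the original read but are absent from SEQ.
    hard_clipped = sum(int(n) for n, op in ops if op == "H")
    return in_seq, in_seq + hard_clipped

seq_len, read_len = cigar_lengths("34M1H")
print(seq_len, read_len)  # 34 35
```

So the record describes a 35-base read of which only 34 bases are aligned and stored, which matches the 34-character sequence and quality strings in the question.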

It would be a mismatch if the whole read had been aligned to the reference. Here, with hard clipping, I guess the algorithm chose to end the pairwise alignment before the last base because that alignment scores higher than one extended through the last base with an extra mismatch (or gap).
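
As a rough illustration of that guess, the sketch below scores an alignment position by position and stops where the running score peaks. The MATCH and MISMATCH values and the helper names are assumptions chosen for the example; they are not Bioscope's actual scoring parameters or code.

```python
MATCH = 1      # hypothetical reward per matching base
MISMATCH = -2  # hypothetical penalty per mismatching base

def best_end(matches):
    """Return (best_score, aligned_length) for the highest-scoring prefix.

    matches holds one bool per read position: True for a match,
    False for a mismatch against the reference.
    """
    score = 0
    best_score, best_len = 0, 0
    for i, is_match in enumerate(matches, start=1):
        score += MATCH if is_match else MISMATCH
        if score > best_score:
            best_score, best_len = score, i
    return best_score, best_len

def to_cigar(read_len, aligned):
    """Write the aligned prefix as M and any unaligned tail as H."""
    clipped = read_len - aligned
    return f"{aligned}M" + (f"{clipped}H" if clipped else "")

read = [True] * 34 + [False]  # 34 matching bases, then one mismatching base
score, aligned = best_end(read)
print(to_cigar(len(read), aligned), score)  # 34M1H 34
```

Under these assumed scores, stopping after 34 bases gives 34, while extending through the final mismatch would give 32, so the clipped alignment wins. With a milder mismatch penalty the ranking would not change here, since any negative penalty on the last base lowers the total.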

That makes sense. Thanks Pierre!
