I'm quite confused about these two terms. I'm reading about Pindel, the split-read algorithm. The author seems to make use of the information from "unmapped" reads. There are also other split-read-based algorithms that use "soft-clipped" reads, where the soft-clipped portion is the unaligned part of the read.
To my eyes, the two look quite similar. Say we have a 100bp read, 50bp of which cannot be mapped while the other 50bp can. How would BWA categorize this read? Would BWA treat it as an "unmapped" read because 50bp cannot be mapped, or as a "mapped" read with 50bp of "soft-clipped" sequence?
Or does BWA have a scoring system for mapping that sets a threshold for distinguishing the two?
thx
edit: maybe this is related to "centeredness"? Say the breakpoint is located at 99:1; then the 99bp will be mapped, with the remaining 1bp as "soft-clipped" sequence. But for 50:50, BWA may regard the read as "unmapped".
The mapping quality (5th field) is only 17, which equates to an error probability of 10^(-17/10) = 0.01995262, i.e. roughly a 2% chance that the mapping is incorrect. That is quite high when you are mapping millions of reads.
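For reference, MAPQ is Phred-scaled: MAPQ = -10 * log10(P), where P is the probability that the mapping position is wrong. Here is a minimal Python sketch of the conversion. The function name `mapq_to_error_probability` is just something I made up for illustration, and the one-million-read figure is an arbitrary example, not taken from any real dataset.

```python
def mapq_to_error_probability(mapq: int) -> float:
    """Convert a Phred-scaled MAPQ into the probability that the mapping is wrong."""
    # Invert MAPQ = -10 * log10(P)  ->  P = 10 ** (-MAPQ / 10)
    return 10 ** (-mapq / 10)


p = mapq_to_error_probability(17)
print(f"{p:.8f}")  # 0.01995262, i.e. about 2%

# Illustrative only: if one million reads all had MAPQ 17, roughly this many
# would be expected to be placed incorrectly.
expected_wrong = p * 1_000_000
print(round(expected_wrong))
```

So at MAPQ 17 you would expect on the order of 20,000 wrong placements per million such reads, which is why a MAPQ cutoff is often applied before calling anything from the alignments.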