Now need to use Pindel(split-read approach to identify CNV). I'm just wondering how I can extract one-end mapped reads from BWA result? Does samtools have such options, or we need to extract by parsing the sam/bam file according to flags?
Thanks
Edit: Just another question. After BWA running, I actually do the filtering to pick out those mapping with MAPQ > 20; will this affect picking up unmapped/one-end mapped reads?
I guess this won't affect finding one-end mapped reads; but those two-end unmapped reads cannot be picked up.
Edit again: This is my another question: I ran the command:
samtools view -u -f 8 -F 260 map.bam > oneEndMapped.bam
And let's look at the line of oneEndMapped.bam:
HWI-ST150_0129:2:1:1414:2357#0 153 chr1 154432976 37 101M = 154432976 0 ACATACATATGTATGTAATATGTGATTATTACTGGTGGTTTGCTAAATAATGACAAATGGAAAAAAATATATGTCATCCAAAGAGAGCCACTGTTAATTTT BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBdc^ddcacc\TdcdddddYbe_dffddcffcffecedecedaeeeedd\def^ffeff`c RG:Z:101127_s_2 XT:A:U NM:i:2 SM:i:37 AM:i:0 X0:i:1 X1:i:0 XM:i:2 XO:i:0 XG:i:0 MD:Z:16G20A63
How to interpret this: To me, this looks like both ends have the same sequences, and map to the same position. (leftmost coordinate of one read; and leftmost of another). Why do we say this represents one read maps while the other one doesn't map? To describe "one read maps while the other one doesn't map", in my mind, it should be like:
HWI-ST150_0129:2:1:1414:2357#0 153 chr1 154432976 37 101M = * *
Also running
samtools view -u -f 4 -F 264 map.bam > oneEndUnmapped.bam
This oneEndUnmapped.bam is only 8K, means nothing. I don't quite understand this.
thx
If all you did was select SAM mapping entries with score of > 20 and left the entire line intact, there shouldn't be any problem using the samtool view commands I posted.
Actually this inspires me to think about what MAPQ stands for. How it's calculated. I mean if one-end maps correctly with a probability p; while each of the both-mapping ends maps correctly with the same probability. Then will the MAPQ be the same? Or the both-mapping MAPQ twice the one-end?
Also I look at the samtools/view man page, and seems
-f
is required flag; while-F
is filtering flag.And there's a page for explaining SAM flags:http://picard.sourceforge.net/explain-flags.html.....Can anyone kindly explain a little about required vs filtering flag? thxHi I edited my response to explain the
-f
and-F
.