Hi,
I'm new to this field, but I hope my question makes sense :)
I'm trying to compare the output of BWA and Bowtie2 (BT2) for the same mapping.
As I understand it, every line in a SAM file that is not part of the header (i.e. that does not start with '@') represents one match.
The BWA SAM file contains many more of those lines than the BT2 SAM file. Moreover, I see that many of the records in the BWA SAM file have the value 4 in the FLAG column. If I understand the SAM documentation correctly, FLAG 4 means that the read is not a match (it is unmapped). So how do I interpret a 'match' in the BWA SAM file with FLAG 4?
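Since the question hinges on what a FLAG value means, here is a minimal Python sketch that breaks a FLAG into its individual bits. The bit values follow the FLAG table in the SAM specification; the short names in the dictionary and the function name `decode_flag` are my own labels for illustration, not anything defined by BWA or Bowtie2.

```python
# Bit values are from the SAM specification's FLAG table.
# The short names are informal labels chosen for this sketch.
SAM_FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse_strand",
    0x20: "mate_reverse_strand",
    0x40: "first_in_pair",
    0x80: "second_in_pair",
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits that are set in a SAM FLAG value."""
    return [name for bit, name in SAM_FLAG_BITS.items() if flag & bit]


print(decode_flag(4))   # ['unmapped']
print(decode_flag(77))  # ['paired', 'unmapped', 'mate_unmapped', 'first_in_pair']
```

So a line with FLAG 4 is a record for a read that the aligner could not place; it is listed in the file, but it is not an alignment.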
Again, I am new to this field, but I hope the question makes sense.
Take care!
Kim
Could you show both lines?
What do you mean?
There are a lot of lines. In the BT2 SAM file none of them has the value 4 in the FLAG column, whereas many lines in the BWA output do.
According to Devon's answer, you could try to remove unmapped reads and secondary alignments from the bwa output BAM file using samtools, and then compare both files.
OK - thanks again!
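The filtering suggested above can be sketched in plain Python on SAM text, to show the logic rather than replace samtools. It drops every record that has the unmapped bit (0x4) or the secondary bit (0x100) set. With samtools itself, the `-F` option of `samtools view` excludes records with any of the given bits set, so something like `samtools view -b -F 260 in.bam > out.bam` should do the same job (260 = 4 + 256; the file names are placeholders). The function name and the example records below are made up for illustration.

```python
UNMAPPED = 0x4     # read could not be placed
SECONDARY = 0x100  # secondary alignment


def keep_primary_mapped(sam_lines, exclude=UNMAPPED | SECONDARY):
    """Yield header lines plus records with none of the `exclude` bits set."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second column
        if (flag & exclude) == 0:
            yield line


# Made-up records, for illustration only.
example = [
    "@HD\tVN:1.6",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
    "r3\t256\tchr1\t200\t0\t4M\t*\t0\t0\tACGT\tIIII",
]

kept = list(keep_primary_mapped(example))
for line in kept:
    print(line)
```

Only the header and `r1` survive here; `r2` is unmapped and `r3` is a secondary alignment.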
Just out of curiosity: if unmapped segments (FLAG=4; is a segment the same thing as a read?) make it into the alignment section of the SAM file, which ones, then, do not make it into the alignment section?
Yes, if a read is unmapped it'll be a single segment. BTW, bowtie2 won't produce alignments of reads in multiple segments (though bwa mem will).
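To see where the extra lines in one file come from, it may help to tally the records by type before comparing. This is a rough sketch under the assumption that the SAM file is read as plain text lines; the function name and category labels are my own. A read that bwa mem splits into several segments would show up here as one primary record plus supplementary ones (or secondary ones, if bwa mem was run with `-M`).

```python
from collections import Counter


def tally_sam_records(sam_lines):
    """Count non-header SAM records by a coarse category taken from FLAG."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and the header
        flag = int(line.split("\t")[1])
        if flag & 0x4:
            counts["unmapped"] += 1
        elif flag & 0x100:
            counts["secondary"] += 1
        elif flag & 0x800:
            counts["supplementary"] += 1
        else:
            counts["primary"] += 1
    return counts


# Made-up records, for illustration only.
records = [
    "@SQ\tSN:chr1\tLN:1000",
    "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r1\t2048\tchr1\t500\t60\t2S2M\t*\t0\t0\tACGT\tIIII",
    "r2\t4\t*\t0\t0\t*\t*\t0\t0\tACGT\tIIII",
]

summary = tally_sam_records(records)
print(dict(summary))
```

Running this on both SAM files (for example via `open(path)`) would give per-category counts that can be compared directly.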