I am mapping Illumina paired-end reads against a transcriptome with HISAT2, and I'm confused about one thing that is probably simple:
Is it normal to have SAM flags that point at the same strand for both reads? When mapping against a genome, the technology should produce one read that maps to the forward strand and one that maps to the reverse strand, but I'm not sure what I should observe when mapping against a transcriptome.
I guess the program maps one of the two mates as is and reverse-complements the other, but does the output flag (and the mapping result in general) reflect the strand of the reverse-complemented sequence or of the original read? Is it normal to have many pairs with both mates on the same strand?
Please help me out of this "theoretical" quicksand.
8 weeks later, the problem arises again in a different form: would you keep 83 and 163? They are the ONLY flags I have after filtering. I find this suspicious because my reads should face each other; they shouldn't map as mate pairs.
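83 and 163 are facing each other (as are 99 and 147): 83 means read 1 is on the reverse strand with its mate on the forward strand, and 163 means read 2 is on the forward strand with its mate on the reverse strand. Having only this combination makes sense if you're aligning against a transcriptome and have strand-specific data.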
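If you want to check this yourself, here is a minimal Python sketch that decodes a flag into its bits. The bit values are the ones defined for the FLAG field in the SAM specification; the names `FLAG_BITS`, `decode_flag` and `strand` are just illustrative helpers I made up for this post, not part of HISAT2 or any library.

```python
# Bit values of the SAM FLAG field (labels are illustrative shorthand).
FLAG_BITS = {
    0x1: "paired",
    0x2: "proper_pair",
    0x4: "unmapped",
    0x8: "mate_unmapped",
    0x10: "reverse",          # this read aligns to the reverse strand
    0x20: "mate_reverse",     # its mate aligns to the reverse strand
    0x40: "first_in_pair",    # read 1
    0x80: "second_in_pair",   # read 2
    0x100: "secondary",
    0x200: "qc_fail",
    0x400: "duplicate",
    0x800: "supplementary",
}


def decode_flag(flag):
    """Return the labels of all bits set in a SAM flag, lowest bit first."""
    return [name for bit, name in FLAG_BITS.items() if flag & bit]


def strand(flag):
    """Return '-' if the read itself is on the reverse strand, else '+'."""
    return "-" if flag & 0x10 else "+"


if __name__ == "__main__":
    for flag in (83, 163, 99, 147):
        print(flag, strand(flag), ", ".join(decode_flag(flag)))
```

In every one of these four flags the read and its mate sit on opposite strands, which is what "facing each other" means. Two mates on the same strand would show up as flags where bits 0x10 and 0x20 are both set or both unset.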