Hello,
I have two separate SAM files, one for the left mates and one for the right mates, from a sample's RNA-seq data (these are the unmapped-read SAM files produced after running alignment on both the right and left mate FASTQ files). I need to find out which of these reads are unmapped as pairs (present in both files) and which are unmapped for only the left or only the right mate. In the end I should be able to calculate the following:
1- % reads common to the left and right mate SAM files
2- % reads only in the left mate SAM file
3- % reads only in the right mate SAM file
Can someone please suggest how to do this? Should I read the read names into a hash and compare the IDs?
Thanks!
flagstat would run on each SAM file separately and, I think, would not compare read names between the two files. I am trying to write a script to do this now.
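
A minimal sketch of the hash-based comparison in Python, in case it helps. It makes several assumptions that are not stated above: the read name is taken from the first tab-separated column (QNAME), a trailing "/1" or "/2" mate suffix is stripped so that mates share the same ID, all read names fit in memory, and the percentages are computed against the total number of distinct read names across both files (swap in a different denominator if you need one). The function names read_names and compare_mates are just illustrative.

```python
import sys


def read_names(sam_path):
    """Collect the unique read names (QNAME, column 1) from a SAM file."""
    names = set()
    with open(sam_path) as handle:
        for line in handle:
            # Skip SAM header lines and blank lines.
            if line.startswith("@") or not line.strip():
                continue
            name = line.split("\t", 1)[0].strip()
            # Assumption: mates may carry a /1 or /2 suffix; strip it so
            # the left and right mate of a pair get the same ID.
            if name.endswith(("/1", "/2")):
                name = name[:-2]
            names.add(name)
    return names


def compare_mates(left_names, right_names):
    """Return counts and percentages of shared and file-specific read names."""
    both = left_names & right_names
    left_only = left_names - right_names
    right_only = right_names - left_names
    # Assumption: percentages are relative to all distinct read names
    # seen in either file.
    total = len(left_names | right_names)

    def pct(count):
        return 100.0 * count / total if total else 0.0

    return {
        "total": total,
        "both": len(both),
        "left_only": len(left_only),
        "right_only": len(right_only),
        "pct_both": pct(len(both)),
        "pct_left_only": pct(len(left_only)),
        "pct_right_only": pct(len(right_only)),
    }


if __name__ == "__main__":
    if len(sys.argv) == 3:
        stats = compare_mates(read_names(sys.argv[1]), read_names(sys.argv[2]))
        print("Distinct read names: %d" % stats["total"])
        print("In both files:       %d (%.2f%%)" % (stats["both"], stats["pct_both"]))
        print("Left mate only:      %d (%.2f%%)" % (stats["left_only"], stats["pct_left_only"]))
        print("Right mate only:     %d (%.2f%%)" % (stats["right_only"], stats["pct_right_only"]))
    else:
        sys.stderr.write("usage: compare_mates.py left.sam right.sam\n")
```

Saved as, say, compare_mates.py (a hypothetical file name), it would be run as `python compare_mates.py left.sam right.sam`. Sets are used rather than a dictionary because only membership matters here, and the three categories fall out directly from set intersection and difference.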