Question

Count right and left mate pairs from SAM files

0

8.0 years ago

PAn ▴ 20

Hello,

I have two separate SAM files, one for the left mates and one for the right mates, from a sample's RNA-seq data. These files contain the unmapped reads left after aligning the left and right mate FASTQ files. I need to find out which of these reads come from pairs (the read name appears in both files) and which are unmapped for only the left mate or only the right mate. In the end, I should be able to calculate:

1- % of reads common to the left and right mate SAM files
2- % of reads only in the left mate SAM file
3- % of reads only in the right mate SAM file

Can someone please suggest how to do this? Should I read the read names into a hash and compare the IDs?

Thanks!

Samtools Sam • 2.3k views

updated 8.0 years ago by Brian Bushnell 20k • written 8.0 years ago by PAn ▴ 20

score 1 · Answer 1 · 2016-11-09

1

8.0 years ago

Brian Bushnell 20k

You should really map the reads as pairs, which will (typically) generate a single SAM file containing both mapped and unmapped reads. It's not clear to me how you ended up where you are, or what you're trying to do, but mapping is much more accurate when reads are paired.

score 0 · Answer 2 · 2016-11-09

0

8.0 years ago

abascalfederico ★ 1.2k

I think samtools flagstat will give you that.

0

flagstat would run on each SAM file separately and, I think, will not compare read names between the two files. I am trying to write a script to do this now.

Reply written 8.0 years ago by PAn ▴ 20
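
The read-name comparison discussed in this thread could be sketched roughly as below. This is only a minimal illustration, not a script from any of the posters. It assumes plain-text SAM input, that the read name is the first tab-separated column, that a trailing /1 or /2 on the name (if present) marks the mate and can be stripped, and that percentages are taken over the union of read names seen in either file. The function names read_names and mate_overlap and the file paths are hypothetical.

```python
import sys


def read_names(sam_path):
    """Collect the set of read names (first column) from a plain-text SAM file."""
    names = set()
    with open(sam_path) as handle:
        for line in handle:
            # Skip blank lines and SAM header lines, which start with '@'.
            if not line.strip() or line.startswith("@"):
                continue
            qname = line.split("\t", 1)[0].strip()
            # Assumption: mates may be tagged with a trailing /1 or /2.
            if qname.endswith(("/1", "/2")):
                qname = qname[:-2]
            names.add(qname)
    return names


def mate_overlap(left_names, right_names):
    """Return percentages of read names in both files, left only, right only.

    Assumption: the denominator is the number of distinct read names
    found in either file.
    """
    total = len(left_names | right_names)
    if total == 0:
        return {"both": 0.0, "left_only": 0.0, "right_only": 0.0}
    return {
        "both": 100.0 * len(left_names & right_names) / total,
        "left_only": 100.0 * len(left_names - right_names) / total,
        "right_only": 100.0 * len(right_names - left_names) / total,
    }


if __name__ == "__main__" and len(sys.argv) == 3:
    # Usage: python mate_overlap.py left_unmapped.sam right_unmapped.sam
    stats = mate_overlap(read_names(sys.argv[1]), read_names(sys.argv[2]))
    for key, value in stats.items():
        print(f"{key}\t{value:.2f}%")
```

Holding both sets of names in memory is the same idea as the hash mentioned in the question. For very large files, sorting the names and comparing the sorted lists might use less memory.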