Question

Counting reads aligned to forward or reverse strand

6.3 years ago

caggtaagtat ★ 1.9k

Hello,

I mapped reads from a stranded library prep to my reference genome with STAR and learned that you can use the FLAG filter options of samtools view to count the reads that aligned to the forward or reverse strand of the reference sequence.

Reads mapped to the forward strand:

samtools view -c -F 20 FILE.bam >> Output_file.txt

Reads mapped to the reverse strand:

samtools view -c -f 16 FILE.bam >> Output_file.txt

In every file this showed a strong imbalance in read distribution between the strands, which I did not expect: a few hundred thousand reads on the plus strand and a few million reads on the reverse strand. The library preparation should not make any difference here, so I am wondering whether something is wrong with my commands.

There are no multimappers in the data, and the data come only from mitochondrial RNA.
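
For reference, the two FLAG bits involved in the commands above are 4 (read unmapped) and 16 (read aligned to the reverse strand), so `-F 20` drops every record with either bit set and `-f 16` keeps only records with the reverse bit set. A minimal shell sketch of that decoding follows; `strand_of_flag` is a hypothetical helper name used only for illustration, not a samtools feature.

```shell
# Decode the strand information carried by a SAM FLAG value.
# Bit 4  = read unmapped
# Bit 16 = read aligned to the reverse strand
# strand_of_flag is a hypothetical helper, not part of samtools.
strand_of_flag() {
    if [ $(( $1 & 4 )) -ne 0 ]; then
        echo "unmapped"
    elif [ $(( $1 & 16 )) -ne 0 ]; then
        echo "reverse"
    else
        echo "forward"
    fi
}

strand_of_flag 0     # forward
strand_of_flag 16    # reverse
strand_of_flag 4     # unmapped
strand_of_flag 83    # reverse (83 = 64 + 16 + 2 + 1: first mate, reverse strand)
```

Neither command excludes secondary (256) or supplementary (2048) alignments, and in paired-end data the second mate normally aligns to the opposite strand from the first, so raw bit-16 counts describe read orientation rather than the strand of the transcript.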

Edit: a word

RNA-Seq • 4.0k views

updated 6.3 years ago by michael.ante ★ 3.9k • written 6.3 years ago by caggtaagtat ★ 1.9k

score 2 · Answer 1 · 2018-08-06

6.3 years ago

michael.ante ★ 3.9k

Hi caggtaagtat,

I'd check the gene expression with htseq-count or featureCounts, setting the strandedness parameter that matches your library prep. In human, for instance, more than 75% of the mt genes are located on the + strand. Low expression of the minus-strand genes would explain your result.

Cheers,

Michael

Thank you,

I will try htseq-count and RSeQC now. I was just wondering whether I can really assess the stranded coverage with these flags in samtools view, or whether I got something wrong.
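
One way to cross-check the flag-based counts is to tally orientation directly from the SAM records. The sketch below is only an illustration under stated assumptions: `tally_strands` is a hypothetical helper, it skips unmapped, secondary and supplementary records, and it flips the second mate (FLAG bit 128) so that both mates of a pair are counted in the orientation of read 1. Whether that orientation corresponds to the transcript strand or its opposite depends on the library protocol, which this sketch does not know.

```shell
# Tally alignments per strand from SAM text on stdin.
# tally_strands is a hypothetical helper, not part of samtools.
# Assumption: read 2 (FLAG bit 128) lies on the opposite strand from read 1,
# so it is flipped before counting. Single-end reads are counted as-is.
tally_strands() {
    awk -F '\t' '
        /^@/ { next }                               # skip SAM header lines
        {
            flag = $2
            if (int(flag / 4) % 2) next             # bit 4: unmapped
            if (int(flag / 256) % 2) next           # bit 256: secondary alignment
            if (int(flag / 2048) % 2) next          # bit 2048: supplementary alignment
            rev = int(flag / 16) % 2                # bit 16: reverse strand
            if (int(flag / 128) % 2) rev = 1 - rev  # bit 128: second mate, flip
            if (rev) reverse++; else forward++
        }
        END { printf "forward\t%d\nreverse\t%d\n", forward, reverse }
    '
}
```

It would be used as `samtools view FILE.bam | tally_strands`. For single-end data without secondary or supplementary alignments the two numbers should match the `-F 20` and `-f 16` counts; a difference would point to mates or extra alignment records inflating one side.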

6.3 years ago by caggtaagtat ★ 1.9k

OK, so using the htseq-count script I got a much more believable result, with most reads on the heavy strand.

6.3 years ago by caggtaagtat ★ 1.9k