Question

In RNA-seq data, how to find whether the mRNA is on the forward strand or reverse strand?

1

Entering edit mode

9.0 years ago

yanweng ▴ 80

For strand-specific RNA-seq (by UNG treatment), whether read 1 is reverse complementary to mRNA, and read 2 is the same to mRNA?
For strand-specifc RNA-seq data, what is the best way to separate reads, based on which strand the mRNA is from? In another word, how to group all reads that map to genes on plus strand?

I have made a gtf file containing only genes on plus strand (genes_on_plus_chr.gtf), and used "bedtools intersect -a Aligned.sortedByCoord.out.bam -b genes_on_plus_chr.gtf" trying to pool out all shared region. But I found this is not good enough, because I can still find some reads with flag number 147 and 99 (ideally would be only contain 163 and 83). Since read 1 should be reverse complementary to mRNA while read2 is the same to mRNA for stranded RNA-seq, can I pool out all the reads with flag number either 163 or 83, and these should correspond to mRNA from gene on plus strand?

How I can do the separation on unstranded RNA-seq data, since I don't know the relationship between read1/read2 to the mRNA?

RNA-Seq rna-seq genome next-gen alignment • 4.9k views

ADD COMMENT • link updated 9.0 years ago by Devon Ryan 105k • written 9.0 years ago by yanweng ▴ 80

1

Entering edit mode

Right, so what is your actual biological goal? I imagine you're trying to get counts or something like that. In this case use featureCounts or htseq-count and call it done, there is rarely a reason to use anything else.

For unstranded data you can't ever discern the strand of the original fragment.

ADD REPLY • link 9.0 years ago by Devon Ryan 105k

0

Entering edit mode

I am not interested in differential expression level. My goal is to identify post-transcriptional RNA modification (A to I). To achieve that, I need to first separate the reads based on where them mapped to. For reads mapped to gene on plus strand, I will look for A to G; for gene on reverse strand, I will look for T to C

ADD REPLY • link 9.0 years ago by yanweng ▴ 80

score 2 · Answer 1 · 2016-05-10

2

Entering edit mode

9.0 years ago

Devon Ryan 105k

Ah, you want to look at RNA editing. Details like that tend to be useful when asking for help. Here is a short bit of (untested) python using pysam demonstrating how to do this.

#!/usr/bin/env python
import pysam
infile = pysam.AlignmentFile("input.bam", "r")
forward = pysam.AlignmentFile("forward.bam", "w", template=infile)
reverse = pysam.AlignmentFile("reverse.bam", "w", template=infile)

for read in infile.fetch():
    if read.is_paired:
        if read.is_read1 and read.is_reverse:
            forward.write(read)
        elif read.is_read1:
            reverse.write(read)
        elif read.is_read2 and read.is_reverse:
            reverse.write(read)
        else:
            forward.write(read)
    else:
        if read.is_reverse:
            forward.write(read)
        else:
            reverse.write(read)
forward.close()
reverse.close()
infile.close()

Afterward, do some variant calling on each and filter accordingly.

ADD COMMENT • link 9.0 years ago by Devon Ryan 105k

0

Entering edit mode

Do the reads need to be flipped prior to variant calling? I'd assume any variant caller should identify reads mapping to the negative strand from a BAM file and account for that in the SNP calls.

ADD REPLY • link 9.0 years ago by Vivek ★ 2.7k

1

Entering edit mode

The reads don't need to be flipped, but variant callers are otherwise not strand aware.

ADD REPLY • link 9.0 years ago by Devon Ryan 105k

0

Entering edit mode

Thanks Devon! I tried the script and it worked. The only thing is that the new bam files don't have header, which will cause problem for further process (like samtools view). I tried "samtools reheader forward.bam another.sam", but it gives me messy error.. Any suggestions?

ADD REPLY • link 9.0 years ago by yanweng ▴ 80

0

Entering edit mode

Ah, I bet the output was SAM rather than BAM. Try changing the mode from w to wb.

ADD REPLY • link 9.0 years ago by Devon Ryan 105k

0

Entering edit mode

Thanks Devon! I tried the script and it worked. The only thing is that the new bam files don't have header, which will cause problem for further process (like samtools view). I tried "samtools reheader forward.bam another.sam", but it gives me messy error.. Any suggestions?

ADD REPLY • link 9.0 years ago by yanweng ▴ 80