How can you tell whether your RNA-seq data is stranded?
8
20
10.7 years ago
M K ▴ 660

Is there any way to detect whether your RNA-seq data is unstranded or stranded?

RNA-Seq • 40k views
40
7.6 years ago
Wayne ★ 2.1k

In case anybody currently looking comes across this post...

The easy-to-use Salmon will check for you as described here. You can see what the resulting abbreviations correspond to with a nice illustration here.

0

+1 for taking the time to post a more modern solution. I wonder if there is a way in Biostars to highlight answers like this.

1

Upvoting it and/or selecting it as an accepted answer is the way to go. Commenting as you did is also helpful. Bioinformatics changes rapidly, so we have to be more proactive in marking the most recent correct answer.

0

There are some great answers posted already, but just in case you want to learn more about strandedness, you can also check this previous post: Read pair orientation : Illumina TruSeq Stranded mRNA library

0

Salmon is not easy-to-use. Actually, it is impossible to run now as it requires an outdated version of boost (libboost_iostreams.so.1.60.0). This is not a feasible option anymore, so anybody looking at the above answer can ignore it.

2

This is untrue. Those curious, please see Rob Patro's reply to this assertion here and Dave Carlson's experience noted here.

Furthermore, I just installed salmon in an environment served via MyBinder that others can use here (repo is here), and I specifically see /srv/conda/envs/notebook/lib/libboost_iostreams.so.1.74.0 installed in the Ubuntu system. This is consistent with the current Bioconda recipe here, which specifies boost-cpp >=1.74.0.

8
10.7 years ago
Chris Fields ★ 2.2k

A few RNA-Seq QC tools will detect whether a run is strand-specific. For example, the infer_experiment.py script in the following claims to do this (never used this myself, so can't vouch for it):

http://rseqc.sourceforge.net/
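
For anyone who wants to try it, here is a rough sketch of how the paired-end report from infer_experiment.py could be turned into a call. The invocation in the comment uses the script's -r (gene model in BED format) and -i (alignment file) options with hypothetical file names. The function name classify_strandedness and the 0.8 cutoff are my own assumptions, not part of RSeQC, and single-end reports use different labels that this does not handle.

```shell
# Hypothetical invocation (file names are placeholders):
#   infer_experiment.py -r genes.bed -i aln.bam | classify_strandedness

# Classify strandedness from a paired-end infer_experiment.py report on stdin.
# "forward" = read 1 maps to the same strand as the gene (1++,1--,2+-,2-+).
# "reverse" = read 1 maps to the opposite strand (1+-,1-+,2++,2--).
# The 0.8 cutoff is an illustrative assumption, not an RSeQC default.
classify_strandedness() {
    awk -F': ' '
        /1\+\+,1--,2\+-,2-\+/ { fwd = $2 + 0 }
        /1\+-,1-\+,2\+\+,2--/ { rev = $2 + 0 }
        END {
            if (fwd >= 0.8)      print "stranded (forward)"
            else if (rev >= 0.8) print "stranded (reverse)"
            else                 print "unstranded or undetermined"
        }'
}
```

If the two fractions come out roughly equal (both near 0.5), the library is most likely unstranded.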

0

Even this seems to require a BAM file to operate, so at that point one could look at the file.

0

Yep. Only other way I can think of is to check whether there is a strand-specific adaptor used, but this normally gets stripped off the sequence prior to the user getting their hands on it (at least our center does).

Actually, I don't recall whether the TruSeq strand-specific adaptor is the same sequence as their other non-strand-specific counterparts, but then again I've never had to worry about checking for this. Seq centers we've worked with are normally pretty explicit in telling us what protocols and adaptors they use.

3
7.6 years ago

Hi

This image could help.

In the stranded example, reads are clearly separated between the two strands.

Of course, you need to perform the alignments, get the BAM file, and visualize it in one of the available tools (SeqMonk, RNAseqViewer, IGB, etc.).

2
10.7 years ago
Irsan ★ 7.8k
If you don't know what sample prep protocol was used, you have to map your reads to a reference genome and look at the SAM flags in the BAM file. If it is stranded, flags 83, 99, 147 and 163 have the same abundance, but in stranded, 2 of these 4 will disappear when you look at either sense or antisense genes only.
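
A minimal sketch of the flag tally described above, assuming the first "stranded" was meant to read "unstranded". The helper name count_flags is made up for illustration; it only counts column 2 of SAM text, so the samtools calls in the comments (with placeholder file name and coordinates) are what actually feed it.

```shell
# Tally SAM FLAG values (column 2) from SAM text on stdin.
# Header lines (starting with '@') are skipped.
# Output: one "flag<TAB>count" line per flag, sorted by flag value.
count_flags() {
    awk -F'\t' '!/^@/ { n[$2]++ } END { for (f in n) print f "\t" n[f] }' | sort -n
}

# Hypothetical usage (aln.bam and the region are placeholders):
#   samtools view -f 2 aln.bam | count_flags                      # properly paired reads, whole file
#   samtools view -f 2 aln.bam chr1:100000-200000 | count_flags   # one gene region; needs an indexed BAM
```

Restricting the count to a gene whose strand you know is the important part: across the whole genome all four flags tend to look balanced either way, because genes sit on both strands.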
0

It might be easier to map to the transcriptome than the genome. Then you know you are mapping to the sense side.

Remember that certain protocols map the first read to the sense strand and the second read to the antisense. Others do it the reverse (the first read is antisense).

Do you know the protocol that was used? You should be able to tell from that whether it is stranded. Joshua Levin has a paper from a couple years ago that compared a bunch of stranded protocols.

0

I'm trying to figure out whether my data is stranded. Salmon shows that it is, and the flags in the BAM file are 99 and 147, but when I plot it in a genome browser there is almost no difference in expression between the strands. How could that be?

0

How do I look at the SAM flags in BAM files?

Thanks,

0

If it is stranded, flags 83, 99, 147 and 163 have the same abundance but in stranded, 2 of these 4 will disappear when you look at either sense or antisense genes only.

Can you clarify what you mean, please? Did you mean to say "unstranded" in one of these instances?

1
10.7 years ago
Josh Herr 5.8k

If you have a reference you could map to it to find out. There might be another way, but nothing else comes to mind.

1
23 months ago
MB ▴ 60

I found that the salmon result can depend on whether the reference was assembled as strand-specific or not. I recommend one of the many great Trinity helper scripts for that; the patterns are very distinct. You can also check whether your reference (i.e. transcriptome) was assembled as non-stranded even though you have stranded libraries :)

https://github.com/trinityrnaseq/trinityrnaseq/wiki/Examine-Strand-Specificity

1
1
18 months ago
LayneSadler ▴ 90

Disclaimer = I am not an expert and I would appreciate feedback. This got me information that was better than nothing.


1. Build Salmon Index

Download reference transcripts

Human page (for latest release) = https://www.gencodegenes.org/human/

Create Index

salmon index -t gencode.v43.transcripts.fa.gz -i salmon_index --gencode

2. Automated detection

Run quant with libType auto

salmon quant --index=salmon_index --libType A --output delete_me \
    -1 end1.fq.gz -2 end2.fq.gz

Examine output for relevant info

...
[2023-07-06 07:37:12.643] [jointLog] [info] Automatically detected most likely library type as IU

Press Ctrl+C to stop the run once the library type has been logged (Ctrl+Z only suspends the job).

Cleanup output dir

rm -rf delete_me

3. Interpret the library type

Description of types = https://salmon.readthedocs.io/en/latest/library_type.html
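
A small sketch of steps 2 and 3 as text processing, in case it helps. Both function names are made up for illustration. The first pulls the detected type out of Salmon log text like the line shown above. The second decodes the type letters as I understand them from the Salmon docs (S = stranded, U = unstranded, F/R = read 1, or the single-end read, comes from the forward/reverse strand). The log path in the comment is an assumption about where Salmon writes its log inside the output directory, so read it before the cleanup step.

```shell
# Extract the auto-detected library type from Salmon log text on stdin.
# If the message appears more than once, the last occurrence is used.
detected_libtype() {
    sed -n 's/.*most likely library type as \([A-Z]*\).*/\1/p' | tail -n 1
}

# Describe the strandedness implied by a Salmon library type code
# (e.g. IU, ISF, ISR, U, SF, SR).
describe_libtype() {
    case "$1" in
        *SF) echo "stranded, read 1 from the forward strand" ;;
        *SR) echo "stranded, read 1 from the reverse strand" ;;
        *U)  echo "unstranded" ;;
        *)   echo "unrecognized library type: $1" ;;
    esac
}

# Hypothetical usage (assumed log location inside the output directory):
#   describe_libtype "$(detected_libtype < delete_me/logs/salmon_quant.log)"
```

For the IU result in the log line above, this reports "unstranded".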

0

Please don't add answers that are similar to ones already posted in old threads. The answer about Salmon is the one with the most upvotes in this thread.

0

Links aren't answers, they are starting points at best and liable to 404 over time. Links belong in comments. If an accepted answer doesn't provide code, then the same question will be asked repeatedly until someone provides code.
