Question

How can you tell whether your RNA-seq data is stranded or not?

20

11.3 years ago

M K ▴ 660

Is there any way to detect whether your RNA-seq data is unstranded or stranded?

RNA-Seq • 44k views

ADD COMMENT • link updated 20 months ago by kathryn.jacksonjones • 0 • written 11.3 years ago by M K ▴ 660

3

8.2 years ago

Carlos Caicedo ▴ 210

Hi

This image could help.

In the stranded example, reads are clearly separated between the two strands.

Of course, you need to perform the alignments, get the BAM file and visualize it in any of the available viewers (SeqMonk, RNAseqViewer, IGB, etc.).

ADD COMMENT • link updated 5.6 years ago by Ram 45k • written 8.2 years ago by Carlos Caicedo ▴ 210

2

11.3 years ago

Irsan ★ 7.8k

If you don't know what sample prep protocol was used, you have to map your reads to a reference genome and look at the SAM flags in the BAM file. If it is unstranded, flags 83, 99, 147 and 163 have the same abundance, but if it is stranded, 2 of these 4 will disappear when you look at either sense or antisense genes only.
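
A minimal sketch of tallying the flags, assuming the alignments are available as SAM text (for example from samtools view). The helper name count_flags, the file aln.bam and the region in the usage comment are placeholders for illustration, not part of the original answer.

```shell
# Tally SAM FLAG values (column 2) from SAM-formatted text on stdin.
# Header lines, which start with '@', are skipped.
# Output: one "FLAG<TAB>COUNT" line per distinct flag, sorted by flag value.
count_flags() {
    awk -F '\t' '!/^@/ { n[$2]++ } END { for (f in n) print f "\t" n[f] }' | sort -n
}

# Typical use (assumes samtools is installed; the region query needs an indexed BAM):
#   samtools view aln.bam | count_flags
#   samtools view aln.bam chr1:100000-200000 | count_flags   # one gene of known strand
```

For a gene on the plus strand, a reverse-stranded (dUTP-style) library should show mostly 83 and 163, a forward-stranded library mostly 99 and 147, and an unstranded library all four in similar numbers. For a minus-strand gene the two pairs swap.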

ADD COMMENT • link 11.3 years ago by Irsan ★ 7.8k

0

It might be easier to map to the transcriptome than the genome. Then you know you are mapping to the sense side.

Remember that certain protocols map the first read to the sense strand and the second read to the antisense. Others do it the reverse (the first read is antisense).

Do you know the protocol that was used? You should be able to tell from that whether it is stranded. Joshua Levin has a paper from a couple years ago that compared a bunch of stranded protocols.

ADD REPLY • link updated 5.6 years ago by Ram 45k • written 11.3 years ago by Michele Busby ★ 2.2k

0

I'm trying to figure out whether my data is stranded or not. Salmon shows that it is, and the flags in the BAM file are 99 and 147, but when I plot it in Genome Browser there is almost no difference in expression between the strands. How could that be?

ADD REPLY • link 7.0 years ago by marina.v.yurieva ▴ 580

0

How do I look at the SAM flags in BAM files?

Thanks,

ADD REPLY • link 4.7 years ago by Kai_Qi ▴ 130
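
One way to look at the flags, sketched under the assumption that samtools is installed: samtools view aln.bam | cut -f2 prints the FLAG column (aln.bam is a placeholder), and samtools flags 99 explains a single value. The helper below, with the hypothetical name decode_flag, does the same decoding in plain shell so the four flags discussed in this thread can be read at a glance.

```shell
# Print the names of the bits set in a SAM FLAG value, in bit order.
# Bit meanings follow the SAM specification's FLAG field.
decode_flag() {
    flag=$1
    out=""
    for pair in 1:paired 2:proper_pair 4:unmapped 8:mate_unmapped \
                16:reverse 32:mate_reverse 64:read1 128:read2 \
                256:secondary 512:qc_fail 1024:duplicate 2048:supplementary; do
        bit=${pair%%:*}     # numeric bit value before the colon
        name=${pair#*:}     # label after the colon
        if [ $(( flag & bit )) -ne 0 ]; then
            out="${out:+$out }$name"
        fi
    done
    echo "$out"
}

# decode_flag 99    # paired proper_pair mate_reverse read1
# decode_flag 147   # paired proper_pair reverse read2
# decode_flag 83    # paired proper_pair reverse read1
# decode_flag 163   # paired proper_pair mate_reverse read2
```

So 99/147 is a pair whose read 1 aligns to the forward strand, and 83/163 is a pair whose read 1 aligns to the reverse strand.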

0

If it is stranded, flags 83, 99, 147 and 163 have the same abundance but in stranded, 2 of these 4 will disappear when you look at either sense or antisense genes only.

Can you clarify what you mean please. Did you mean to say "unstranded" in one of these instances?

ADD REPLY • link 20 months ago by kathryn.jacksonjones • 0

1

11.3 years ago

Josh Herr 5.8k

If you have a reference you could map to it to find out. There might be another way, but nothing else comes to mind.

ADD COMMENT • link updated 5.6 years ago by Ram 45k • written 11.3 years ago by Josh Herr 5.8k

1

2.5 years ago

MB ▴ 60

I found that the Salmon result can depend on whether the reference was assembled as strand-specific or not. I recommend one of the many great Trinity helper scripts for checking that; the patterns are very distinct. You can also check whether your reference (i.e. transcriptome) was assembled as non-stranded although you have stranded libraries :)

https://github.com/trinityrnaseq/trinityrnaseq/wiki/Examine-Strand-Specificity

ADD COMMENT • link 2.5 years ago by MB ▴ 60

1

2.5 years ago

Ming Tommy Tang ★ 4.7k

Use this: https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-022-04572-7

ADD COMMENT • link 2.5 years ago by Ming Tommy Tang ★ 4.7k

1

2.1 years ago

LayneSadler ▴ 90

Disclaimer = I am not an expert and I would appreciate feedback. This got me information that was better than nothing.

1. Build Salmon Index

Download reference transcripts

Human page (for latest release) = https://www.gencodegenes.org/human/

Create Index

salmon index -t gencode.v43.transcripts.fa.gz -i salmon_index --gencode

2. Automated detection

Run quant with libType auto

salmon quant --index=salmon_index --libType A --output delete_me \
    -1 end1.fq.gz -2 end2.fq.gz

Examine output for relevant info

...
[2023-07-06 07:37:12.643] [jointLog] [info] Automatically detected most likely library type as IU

Press Ctrl+C to stop the run once the library type has been logged (Ctrl+Z would only suspend the job).

Cleanup output dir

rm -rf delete_me

3. Interpret the library type

Description of types = https://salmon.readthedocs.io/en/latest/library_type.html
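
To pull the detected type out of the log text from step 2 instead of watching the terminal, something like the sketch below may help. It only relies on the wording of the log line shown above. The helper name salmon_detected_libtype is hypothetical, and the log path in the usage comment is an assumption about where Salmon writes its log inside the output directory, so check it before running the cleanup step.

```shell
# Read Salmon log text on stdin and print the automatically detected
# library type (e.g. IU, ISR). Prints nothing if no such line is found.
salmon_detected_libtype() {
    sed -n 's/.*most likely library type as \([A-Z]*\).*/\1/p' | tail -n 1
}

# Usage (path is an assumption; run this before removing delete_me):
#   salmon_detected_libtype < delete_me/logs/salmon_quant.log
```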

ADD COMMENT • link 2.1 years ago by LayneSadler ▴ 90

0

Please don't add answers that are similar to ones already posted in old threads. The answer about Salmon is the one with the most upvotes in this thread.

ADD REPLY • link 2.1 years ago by GenoMax 152k

0

Links aren't answers, they are starting points at best and liable to 404 over time. Links belong in comments. If an accepted answer doesn't provide code, then the same question will be asked repeatedly until someone provides code.

ADD REPLY • link 2.1 years ago by LayneSadler ▴ 90

score 40 · Accepted Answer · 2017-05-16

40

8.2 years ago

Wayne ★ 2.1k

In case anybody currently looking comes across this post...

The easy-to-use Salmon will check for you as described here. You can see what the resulting abbreviations correspond to with a nice illustration here.
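
As a rough aid for reading the abbreviations, here is a small sketch that spells out a Salmon library type string letter by letter, following the scheme in Salmon's library type documentation (orientation, then strandedness, then which strand read 1 comes from). The function name describe_libtype is hypothetical and the wording of the descriptions is mine, not Salmon's.

```shell
# Spell out a Salmon library type string such as IU, ISR or SF.
# Returns non-zero for strings that do not fit the scheme.
describe_libtype() {
    lt=$1
    case $lt in
        I*) orient="paired-end, reads face inward";        rest=${lt#I} ;;
        O*) orient="paired-end, reads face outward";       rest=${lt#O} ;;
        M*) orient="paired-end, reads point the same way"; rest=${lt#M} ;;
        *)  orient="single-end";                           rest=$lt ;;
    esac
    case $rest in
        U)  strand="unstranded" ;;
        SF) strand="stranded, read 1 (or the single read) from the forward strand" ;;
        SR) strand="stranded, read 1 (or the single read) from the reverse strand" ;;
        *)  echo "unrecognised library type: $lt"; return 1 ;;
    esac
    echo "$orient; $strand"
}

# describe_libtype IU    # paired-end, reads face inward; unstranded
```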

ADD COMMENT • link 8.2 years ago by Wayne ★ 2.1k

0

+1 on taking the time to post a more modern solution. I wonder if there is a way in Biostar to highlight answers like this.

ADD REPLY • link 8.2 years ago by Chris Fields ★ 2.2k

1

Upvoting it and/or selecting it as an accepted answer is the way to go. Commenting as you did is also helpful. Bioinformatics changes rapidly, hence we have to be more proactive in marking the most recent correct answer.

ADD REPLY • link 8.2 years ago by Istvan Albert 102k

0

There are some great answers posted already, but just in case you want to learn more about strandedness, you can also check this previous post: Read pair orientation : Illumina TruSeq Stranded mRNA library

ADD REPLY • link 6.8 years ago by igor 13k

0

Salmon is not easy-to-use. Actually, it is impossible to run now as it requires an outdated version of boost (libboost_iostreams.so.1.60.0). This is not a feasible option anymore, so anybody looking at the above answer can ignore it.

ADD REPLY • link 2.6 years ago by yh1126 • 0

2

This is untrue. Those curious, please see Rob Patro's reply to this assertion here and Dave Carlson's experience noted here.

Furthermore, I just installed it where others can use salmon served via MyBinder here (repo is here) and I see specifically /srv/conda/envs/notebook/lib/libboost_iostreams.so.1.74.0 installed in the Ubuntu system. This is consistent with the current Bioconda recipe here specifying boost-cpp >=1.74.0.

ADD REPLY • link 2.6 years ago by Wayne ★ 2.1k

Ram · Accepted Answer · 2014-04-23

8

11.3 years ago

Chris Fields ★ 2.2k

A few RNA-Seq QC tools will detect whether a run is strand-specific. For example, the infer_experiment.py script in the following claims to do this (never used this myself, so can't vouch for it):

http://rseqc.sourceforge.net/
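
For anyone trying it, infer_experiment.py is run along the lines of infer_experiment.py -r genes.bed -i aln.bam (both file names are placeholders) and, for paired-end data, reports two "Fraction of reads explained by" values. The sketch below turns those two numbers into a label. The function name classify_strandedness is hypothetical, and the 0.8 and 0.6 cutoffs are arbitrary assumptions of mine, not RSeQC defaults.

```shell
# Classify strandedness from the two fractions reported by infer_experiment.py.
#   $1: fraction for "1++,1--,2+-,2-+" (read 1 on the same strand as the gene)
#   $2: fraction for "1+-,1-+,2++,2--" (read 1 on the opposite strand)
# Cutoffs (0.8, 0.6) are illustrative assumptions only.
classify_strandedness() {
    awk -v same="$1" -v opp="$2" 'BEGIN {
        if (same >= 0.8)                    print "forward-stranded"
        else if (opp >= 0.8)                print "reverse-stranded"
        else if (same <= 0.6 && opp <= 0.6) print "unstranded"
        else                                print "ambiguous"
    }'
}

# classify_strandedness 0.4903 0.4925   # unstranded
# classify_strandedness 0.02 0.95       # reverse-stranded
```

Borderline values are reported as ambiguous on purpose; in that case looking at the alignments directly is probably the safer check.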

ADD COMMENT • link updated 5.8 years ago by Ram 45k • written 11.3 years ago by Chris Fields ★ 2.2k

0

Even this seems to require a BAM file to operate, so at that point one could just look at the file.

ADD REPLY • link 11.3 years ago by Istvan Albert 102k

0

Yep. Only other way I can think of is to check whether there is a strand-specific adaptor used, but this normally gets stripped off the sequence prior to the user getting their hands on it (at least our center does).

Actually, I don't recall whether the TruSeq strand-specific adaptor is the same sequence as their other non-strand-specific counterparts, but then again I've never had to worry about checking for this. Seq centers we've worked with are normally pretty explicit in telling us what protocols and adaptors they use.

ADD REPLY • link updated 5.6 years ago by Ram 45k • written 11.3 years ago by Chris Fields ★ 2.2k