Question

Question on RNA-seq

0

7.8 years ago

miller.time.716 • 0

I'm looking to perform RNA-seq on pairs of cell lines to look at differential gene expression, and I'm wondering what I can get away with to lower costs. Three questions:

1) How many reads would be a good starting point? Would 20M be reasonable?
2) I understand I should have at least 2 biological replicates per cell line. Is it OK to combine the tubes into one, or should they be run separately?
3) Ribo-depletion costs $200 more than poly-A enrichment. Is it better to go with ribo-depletion since it also captures non-coding RNA, or is it common to try poly-A enrichment first and move on to non-coding RNA only if nothing is found?

Thanks in advance!

RNA-Seq • 1.7k views

updated 7.8 years ago by grant.hovhannisyan ★ 2.6k • written 7.8 years ago by miller.time.716 • 0

1

Which genome/size?
Best to have 3+ replicates (to allow for failures somewhere along the way).
If you don't do ribo-depletion, a large fraction (70-90%) of your reads may be rRNA.

7.8 years ago by GenoMax 153k

0

Sorry, forgot to mention these are human neural stem cell lines.
So I assume you mean that I should run each replicate separately instead of combining into a single tube?
Without ribo-depletion, wouldn't polyA enrichment exclude rRNA?

7.8 years ago by miller.time.716 • 0

2

20M would be a reasonable place to start. Higher numbers would be needed if you want to look at alternative splicing or allele-specific expression.
I was referring to biological replicates (if possible). Each replicate gets its own library, and multiple libraries are then pooled into one lane.
Sorry about that. Poly-A enrichment will exclude rRNA; I meant rRNA removal in general. If you need non-coding RNA, then you will have to do ribo-depletion.
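
To make the pooling point concrete, here is a minimal sketch of how many libraries could share one lane at a given depth per library. The function name and the lane yield are hypothetical placeholders, not figures from this thread; use the yield your sequencing provider quotes.

```python
def libraries_per_lane(lane_yield_reads: int, reads_per_library: int) -> int:
    """Return how many libraries fit in one lane at the requested depth.

    Both arguments are read counts. The lane yield depends on the
    instrument and run type, so it must be supplied by the caller.
    """
    if lane_yield_reads <= 0 or reads_per_library <= 0:
        raise ValueError("read counts must be positive")
    # Integer division: a partially covered library does not count.
    return lane_yield_reads // reads_per_library


# HYPOTHETICAL lane yield, for illustration only.
HYPOTHETICAL_LANE_YIELD = 300_000_000

if __name__ == "__main__":
    # 20M reads per library is the starting depth suggested above.
    print(libraries_per_lane(HYPOTHETICAL_LANE_YIELD, 20_000_000))
```

Each library keeps its own index (barcode) in the pool, so replicates stay separable after sequencing even though they share a lane.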

7.8 years ago by GenoMax 153k

0

Regarding sequencing depth, keep two things in mind:

1. If the sequencing quality is poor, you will have to trim, and depending on read quality you might trim away a substantial amount of data, which lowers your effective depth.
2. For DE analysis we are interested in unique alignments to the reference genome. As a rule we don't use multimapped reads (there are ways to include them, for example with featureCounts, but reviewers might not be happy in the end). The unique mapping rate should usually be around 90% if everything is fine.

So if you sequence 20 million reads, in a good scenario 85-90% of the data is usable for DE analysis. Always aim a bit higher to compensate for these two factors.
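
As a rough sketch of that arithmetic, the snippet below estimates how many raw reads to request in order to end up with a target number of usable reads. The function name and parameters are illustrative assumptions; the 90% unique mapping rate is the rule of thumb from this reply, and the fraction kept after trimming is something you would have to estimate from your own data.

```python
import math


def raw_reads_needed(target_usable_reads: int,
                     unique_mapping_rate: float = 0.90,
                     fraction_kept_after_trimming: float = 1.0) -> int:
    """Estimate raw reads to sequence so that enough usable reads remain.

    usable = raw * fraction_kept_after_trimming * unique_mapping_rate
    Both rates are fractions in (0, 1]; they are assumptions, not measurements.
    """
    for rate in (unique_mapping_rate, fraction_kept_after_trimming):
        if not 0 < rate <= 1:
            raise ValueError("rates must be in the interval (0, 1]")
    usable_fraction = unique_mapping_rate * fraction_kept_after_trimming
    # Round up: it is safer to slightly overshoot the target depth.
    return math.ceil(target_usable_reads / usable_fraction)


if __name__ == "__main__":
    # Target: 20M usable reads, assuming 90% unique mapping and no trimming loss.
    print(raw_reads_needed(20_000_000, unique_mapping_rate=0.90))
```

With these assumptions, 20M usable reads calls for roughly 22.2M raw reads; any trimming loss pushes the number higher.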

7.8 years ago by grant.hovhannisyan ★ 2.6k

score 2 · Answer 1 · 2017-12-07

2

7.8 years ago

grant.hovhannisyan ★ 2.6k

This paper might be of interest https://academic.oup.com/bioinformatics/article/30/3/301/228651

And this one is broader: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-016-0881-8

7.8 years ago by grant.hovhannisyan ★ 2.6k

1

Also this one:

How many biological replicates are needed in an RNA-seq experiment and which differential expression tool should you use?

Rules of thumb taken from Schurch et al. and Liu et al.: for differential gene expression, 20 million reads seems good (more for more complex analyses, as GenoMax said). Two biological replicates per cell line is still very few; personally, I would recommend at least five.

7.8 years ago by h.mon 35k