Question on RNA-seq
7.0 years ago

I'm looking to perform RNA-seq on pairs of cell lines to look at differential gene expression, and I'm wondering what I can get away with to lower costs. 3 questions:

1) How many reads would be a good starting point? Would 20M be a reasonable place to start?
2) I understand I should have at least 2 biological replicates per cell line. Is it OK to combine the tubes into 1, or should they be run separately?
3) Ribo-depletion costs $200 more than poly-A enrichment. Is it better to go with ribo-depletion since it encompasses non-coding RNA, or is it common to try poly-A enrichment first and then go after non-coding RNA if nothing is found with poly-A?

Thanks in advance!

RNA-Seq • 1.4k views
  1. Which genome/size?
  2. Best to have 3+ replicates (to allow for failures somewhere along the way)
  3. If you don't do ribo-depletion, a large fraction (70-90%) of your reads may be rRNA.
  1. Sorry, forgot to mention these are human neural stem cell lines.
  2. So I assume you mean that I should run each replicate separately instead of combining into a single tube?
  3. Without ribo-depletion, wouldn't polyA enrichment exclude rRNA?
  1. 20M would be a reasonable place to start. Higher numbers would be needed if you want to look at alternative splicing or allele-specific expression.
  2. I was referring to biological replicates (if possible). Multiple libraries are pooled into one lane.
  3. Sorry about that. Poly-A enrichment will exclude rRNA; I meant removal of rRNA in general. If you need non-coding RNA then you will have to do ribo-depletion.
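
To make the pooling point concrete, here is a rough sketch of how a lane's output is shared when each replicate is kept as its own library and the libraries are pooled evenly. The function names and the lane yield of 300M reads are placeholders for illustration only, not figures from this thread; the real yield depends on the instrument and should come from the sequencing facility.

```python
def reads_per_library(lane_yield_reads: int, n_libraries: int) -> int:
    """Approximate reads each library receives if the lane is shared evenly."""
    if n_libraries <= 0:
        raise ValueError("n_libraries must be positive")
    return lane_yield_reads // n_libraries


def libraries_per_lane(lane_yield_reads: int, target_reads_per_library: int) -> int:
    """How many libraries fit in one lane at a given per-library depth."""
    if target_reads_per_library <= 0:
        raise ValueError("target_reads_per_library must be positive")
    return lane_yield_reads // target_reads_per_library


# Placeholder lane yield (assumption): replace with your facility's number.
LANE_YIELD = 300_000_000

# 2 cell lines x 3 biological replicates = 6 separate libraries in one lane.
per_library = reads_per_library(LANE_YIELD, 2 * 3)

# Number of libraries the same lane could hold at 20M reads each.
max_libraries = libraries_per_lane(LANE_YIELD, 20_000_000)
```

Real pools are never perfectly even, so treat the per-library figure as an average and not a guarantee.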

Regarding sequencing depth, keep two things in mind:

  1. If the sequencing quality is poor you will have to trim, and depending on read quality you might trim away a substantial amount of data, which lowers your effective depth.
  2. For DE analysis we are interested in unique alignments to the reference genome. As a rule we don't use multimapped reads (there are ways to include them, for example with featureCounts, but reviewers might not be happy in the end). The unique mapping rate should usually be around 90% if everything is fine.

So if you sequence 20 million reads, in a good scenario 85-90% of that data is usable for DE analysis. Always aim a bit higher to compensate for these two factors.
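
The arithmetic behind that advice can be sketched as follows. This is only a back-of-the-envelope helper: the function names are made up for illustration, and the usable fraction is treated as a single number (the 85-90% mentioned above), combining trimming losses and the unique mapping rate.

```python
import math


def usable_reads(raw_reads: int, usable_fraction: float) -> int:
    """Reads expected to remain for DE analysis after trimming and unique mapping."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return round(raw_reads * usable_fraction)


def raw_reads_needed(target_usable: int, usable_fraction: float) -> int:
    """Raw reads to request so that about target_usable reads survive."""
    if not 0 < usable_fraction <= 1:
        raise ValueError("usable_fraction must be in (0, 1]")
    return math.ceil(target_usable / usable_fraction)


# Sequencing 20M reads with 85% usable leaves about 17M for DE analysis.
left_over = usable_reads(20_000_000, 0.85)

# To end up with 20M usable reads, request this many raw reads
# under the pessimistic (85%) and optimistic (90%) scenarios.
pessimistic = raw_reads_needed(20_000_000, 0.85)
optimistic = raw_reads_needed(20_000_000, 0.90)
```

In other words, asking for roughly 22-24M raw reads per sample would be needed to keep about 20M usable ones under these assumptions.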


Also this one:

How many biological replicates are needed in an RNA-seq experiment and which differential expression tool should you use?

Rules of thumb taken from Schurch et al. and Liu et al.: for differential gene expression, 20 million reads seems good (more for more complex analyses, as genomax said). Two biological replicates is still very few; personally I would recommend at least five.
