Question

ERCC Spike-in analysis without using any packages

0

5.7 years ago

anikb ▴ 40

Can anyone please help me understand how to perform ERCC spike-in analysis and interpret the results? I know there are packages for this; I tried erccdashboard but keep getting a label error, so I was wondering whether I could do it in Python without any ERCC analysis packages. Basically, I want to understand the steps involved in analyzing ERCC spike-ins.

Here is the experimental design: a time course, treatment vs. control, with 6 biological replicates (no technical replicates). Baseline is at time 0, and treated and control RNA samples were collected at 2, 4, 6, and 8 hours. The same amount of ERCC Mix 1 (100 ng input; 2 ul of a 1:1000 dilution) was spiked into all samples.

I appended the ERCC FASTA and GTF files to the human .fa and .gtf files, respectively, aligned the reads with STAR, and got read counts using the --quantMode GeneCounts option.
I think the first thing to do is to plot log input concentration against log read counts for the ERCC transcripts? Do I also compute ratios between samples? Since all our samples contain the same amount of ERCC Mix 1, the expected fold change between any two samples is 1 for every ERCC transcript, provided the experiment was done well. That does not tell me much about how to use ERCC for quality control, beyond checking whether the sequencing was done right.
What do I do next to find out whether I should exclude any of my RNA-seq counts below or above certain thresholds?

Thanks
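
The dose-response check described in the question (log input concentration against log read counts) can be sketched in plain Python without any ERCC package. This is only a minimal sketch under stated assumptions: the function names, the pseudocount of 1, and the `min_count` cutoff are illustrative choices, and the concentration and count values are made up. Real concentrations would come from the manufacturer's ERCC Mix 1 table, and real counts from the ERCC rows of the STAR GeneCounts output. A slope near 1 and a high R^2 suggest that counts track input well; the lowest concentration that is still detected gives a rough idea of where low counts stop being reliable, which may help when choosing a lower threshold.

```python
import math


def fit_dose_response(concentrations, counts, pseudocount=1.0):
    """Least-squares fit of log2(count + pseudocount) on log2(concentration).

    Returns (slope, intercept, r_squared).
    """
    xs = [math.log2(c) for c in concentrations]
    ys = [math.log2(n + pseudocount) for n in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared


def lowest_detected_concentration(concentrations, counts, min_count=1):
    """Smallest input concentration whose spike-in has at least min_count reads."""
    detected = [c for c, n in zip(concentrations, counts) if n >= min_count]
    return min(detected) if detected else None


# Hypothetical values for one sample (NOT the real ERCC Mix 1 concentrations).
example_conc = [0.5, 4.0, 32.0, 256.0, 2048.0]
example_counts = [0, 7, 63, 511, 4095]

slope, intercept, r2 = fit_dose_response(example_conc, example_counts)
limit = lowest_detected_concentration(example_conc, example_counts)
print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f} limit={limit}")
```

Running the fit once per sample and comparing slopes and R^2 across the replicates is one way to spot a sample where the spike-in, or the library prep, behaved differently from the rest.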

RNA-Seq ERCC Spike-in • 4.3k views

ADD COMMENT • link updated 5.7 years ago by Charles Warden 8.3k • written 5.7 years ago by anikb ▴ 40

score 1 · Answer 1 · 2019-03-08

1

5.7 years ago

Charles Warden 8.3k

To be honest, I think there is a decent chance you may be better off not using the ERCC spike-ins for analysis (or at least consider that as an option).

For example, there was some discussion on this matter in this thread:

A: How does edgeR do ERCC spike-in normalization?

1

Thank you, Charles, for your reply! I had read that post of yours earlier, and I am taking those tips into consideration for my DEG analysis. Since I already have ERCC spiked-in data, I also wanted to compute DEGs taking ERCC into account and see how the final results would differ.

ADD REPLY • link 5.7 years ago by anikb ▴ 40

0

I think that is fair. As long as you are testing the effect with and without the extra normalization (and checking whether there are any fundamental changes in functional enrichment that would or would not be reasonable given what you know about the rest of your experiment), I think that is what matters most.

Good luck!
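
One simple way to look at "with and without extra normalization" is to compute per-sample scaling factors twice, once from the ERCC reads only and once from the endogenous reads only, and see where they disagree. The sketch below is an illustration under stated assumptions, not what edgeR or RUVSeq do internally: the helper names and the toy count table are hypothetical, and the only convention taken from practice is that ERCC transcript IDs start with `ERCC-`.

```python
import math


def column_totals(count_table, keep):
    """Sum counts per sample over the rows whose gene ID passes keep()."""
    n_samples = len(next(iter(count_table.values())))
    totals = [0] * n_samples
    for gene, row in count_table.items():
        if keep(gene):
            for i, value in enumerate(row):
                totals[i] += value
    return totals


def scaling_factors(totals):
    """Divide each total by the geometric mean so the factors multiply to 1."""
    log_mean = sum(math.log(t) for t in totals) / len(totals)
    return [t / math.exp(log_mean) for t in totals]


def is_spike(gene):
    return gene.startswith("ERCC-")


def is_endogenous(gene):
    return not is_spike(gene)


# Made-up counts for two samples; rows are gene IDs, columns are samples.
toy_counts = {
    "ERCC-00002": [100, 200],
    "ERCC-00003": [50, 100],
    "GENE_A": [900, 900],
    "GENE_B": [450, 300],
}

spike_factors = scaling_factors(column_totals(toy_counts, is_spike))
library_factors = scaling_factors(column_totals(toy_counts, is_endogenous))
print("spike-in factors:", spike_factors)
print("library factors: ", library_factors)
```

If the two sets of factors point in opposite directions for a sample, as they do in this toy table, the downstream DEG lists are likely to differ, and that is the comparison worth inspecting before trusting either normalization.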

ADD REPLY • link 5.7 years ago by Charles Warden 8.3k

score 0 · Answer 2 · 2019-03-07

0

5.7 years ago

goodez ▴ 640

I recommend this package. I've used it for samples with spike-ins. Check out section 2.2 for an example where they're using ERCC spike-ins.

https://bioconductor.org/packages/release/bioc/vignettes/RUVSeq/inst/doc/RUVSeq.pdf

EDIT - I know your question was to avoid using a package, but it seems like you were just having problems with one specific package.

0

Thanks for the recommendation! How did you manage to install the RUVSeq package? I keep getting a non-zero exit status for the ‘Matrix’, ‘RcppArmadillo’, and ‘rgl’ packages. I updated R/RStudio to the latest version (because that is what it was complaining about initially), but it is still not working:

configure: error: X11 not found but required, configure aborted.
install.packages(update[instlib == l, "Package"], l, repos = repos, : installation of package ‘Matrix’ had non-zero exit status

Or maybe I should take this question to Bioconductor.

ADD REPLY • link 5.7 years ago by anikb ▴ 40