How do you use ERCC spike-ins for ALR Transformation of RNA-seq data?


2.9 years ago

O.rka ▴ 740

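I finally got my hands on a dataset with properly designed ERCC92 spike-ins. The question is: how, in theory, should I use them with the additive log-ratio (ALR) transformation? The passage below is from the paper that prompted the question: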

"The additive log-ratio transformation (alr), which allows the user to scale their data by a feature with an a priori known fixed abundance, such as a house-keeping gene or an experimentally fixed variable (e.g., a ThermoFisher ERCC synthetic RNA 'spike-in' [15]), may provide a superior alternative. In contrast to clr, proportionality calculated with alr does not change with missing feature data because it effectively back-calculates the absolute feature abundance."

https://www.nature.com/articles/s41598-017-16520-0

Do I use a single ERCC92 feature as the reference, or the sum or the mean of the spike-ins?
If it's the sum or the mean, do I include all of the spike-ins or only a select few?
Should I scale all the datasets so their ERCC92 spike-in counts are the same before the transformation? (I suspect this yields the same transformed data, but I'm thinking out loud and haven't tested it.)
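
To make the three options concrete, here is a minimal NumPy sketch, not a recommended protocol. It assumes a samples-by-features count matrix whose last two columns are spike-ins. The function name `alr_with_spikeins`, its `how` argument, and the toy counts are all hypothetical. The reference is computed per sample as either one spike-in, the sum of the chosen spike-ins, or their geometric mean (used here as the log-scale counterpart of "the mean"). The last lines check the third question directly: rescaling each sample by its own constant cancels in the ratio, so the ALR values should be unchanged.

```python
import numpy as np

def alr_with_spikeins(counts, spike_idx, how="sum"):
    """Log-ratio of every non-spike feature to a per-sample spike-in reference.

    counts    : (n_samples, n_features) array of strictly positive values
    spike_idx : column indices of the spike-ins used as the reference
    how       : "single" (first listed spike-in), "sum", or "geomean"
    """
    x = np.asarray(counts, dtype=float)
    spikes = x[:, spike_idx]
    if how == "single":
        ref = spikes[:, 0]
    elif how == "sum":
        ref = spikes.sum(axis=1)
    elif how == "geomean":
        ref = np.exp(np.log(spikes).mean(axis=1))
    else:
        raise ValueError(f"unknown reference type: {how}")
    # Keep only the endogenous (non-spike) columns in the output.
    keep = np.setdiff1d(np.arange(x.shape[1]), spike_idx)
    return np.log(x[:, keep]) - np.log(ref)[:, None]

# Toy data (made up): 2 samples, 3 genes, then 2 spike-ins in the last columns.
counts = np.array([[100.0, 200.0,  50.0, 10.0, 20.0],
                   [300.0, 500.0, 160.0, 20.0, 40.0]])
spike_idx = [3, 4]

alr_raw = alr_with_spikeins(counts, spike_idx, how="sum")

# Rescale each sample so its spike-in total equals a common target, then repeat.
spike_totals = counts[:, spike_idx].sum(axis=1)
scaled = counts * (60.0 / spike_totals)[:, None]
alr_scaled = alr_with_spikeins(scaled, spike_idx, how="sum")

same_after_scaling = np.allclose(alr_raw, alr_scaled)
print(same_after_scaling)
```

One caveat: the invariance depends on there being no pseudocount. If zeros are handled by adding a constant to the counts, the result does depend on whether that constant is added before or after rescaling.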

fastq genomics rnaseq • 463 views
