Dear Biostars community,
I was hoping to get your ideas on how to interpret RNA-seq data from two samples before and after drug treatment, based on the mapping statistics of exogenous spike-in RNA. One sample is untreated; the other is treated with a drug that should (theoretically) increase transcription.
I am asking this question to get a logical understanding of what the mapping statistics alone can already tell me (without using packages like DESeq, gene-based normalisations, etc.), because there is a lot of literature out there explaining the need for spike-ins but not much on how to use them exactly.
For my two (non-human) samples (untreated and treated), I added spike-in RNA from human cells to account for ~1% of the total RNA concentration in each sample. These were then sequenced and mapped, with the spike-in reads mapped to the human genome.
When I look at the mapping statistics for the spike-in RNA (using STAR), I see that in the untreated sample the spike-in accounts for 1% of total mapped reads, while in the treated sample it accounts for ~3.5% of total mapped reads.
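For reference, this is roughly how I am computing those percentages from the mapped read counts (a minimal sketch; the counts below are made-up placeholders chosen to reproduce my percentages, not my real numbers):

```python
# Minimal sketch of the spike-in fraction calculation per sample.
# The read counts are hypothetical placeholders, not my real data.

samples = {
    # sample: (reads mapped to host genome, reads mapped to human spike-in)
    "untreated": (50_000_000, 505_000),
    "treated":   (40_000_000, 1_450_000),
}

for name, (host_reads, spike_reads) in samples.items():
    total_mapped = host_reads + spike_reads
    spike_pct = 100 * spike_reads / total_mapped
    print(f"{name}: spike-in = {spike_pct:.2f}% of total mapped reads")
```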
What does this tell me about the expression levels in the treated vs. untreated samples of interest (i.e. about the complexity of the two read libraries)?
Does this mean that there is less expression in my sample of interest after drug treatment, so that a larger fraction of the reads comes from the spike-in?
Or do I have to normalise further (e.g. for library size) before I can interpret these percentages?
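To make the question concrete, here is the back-of-envelope reasoning I am trying to sanity-check, assuming the same absolute amount of spike-in RNA went into each sample and the same amount of input material (which may not hold exactly):

```python
# Rough interpretation of the spike-in fractions above (assumption: equal
# absolute spike-in per sample). If that holds, the endogenous-to-spike-in
# read ratio should be proportional to the total endogenous RNA per sample.

spike_fraction = {"untreated": 0.010, "treated": 0.035}

# Endogenous reads per spike-in read in each sample
ratio = {name: (1 - f) / f for name, f in spike_fraction.items()}

# Global endogenous signal in treated relative to untreated, under that assumption
relative_expression = ratio["treated"] / ratio["untreated"]
print(f"treated / untreated total endogenous signal ~ {relative_expression:.2f}")
# ~0.28, i.e. roughly 3.6-fold LESS endogenous RNA per unit of spike-in after
# treatment. The percentages are already fractions of each library, so raw
# sequencing depth cancels out here; what does not cancel is any variation in
# how much spike-in actually went into each sample.
```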
Thank you for your help!