Or "Everything You Always Wanted to Know About RNA-Seq (But Were Afraid to Ask) Part 2"
In RNA-Seq, it is common practice to compare the abundance of transcripts within the same sample after some form of intrasample normalization (e.g., TPM) that takes into account both transcript length and sequencing depth (although only the former is strictly necessary as long as no other samples are considered). But how reliable are these quantifications really? In particular, I wonder:
- How influential is GC bias? Is it important to correct for it?
- How much do biases from the PCR amplification step matter, given that the polymerase may amplify fragments from some genes more efficiently than others? Can this be accounted for in some way?
- Are there other biases that undermine intrasample comparisons?
Many thanks to everyone who would like to share their experience and opinions!
[ crossposted on Bioconductor: https://support.bioconductor.org/p/9152192/ ]
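For reference, the intrasample normalization mentioned in the question can be sketched as follows. This is only a minimal illustration of the usual TPM definition (length-normalize first, then rescale so the sample sums to one million); the function name `tpm` and the toy numbers are made up for the example, and real pipelines typically use effective rather than annotated transcript lengths.

```python
def tpm(counts, lengths_kb):
    """Convert raw read counts for one sample into TPM values.

    counts     -- reads assigned to each transcript
    lengths_kb -- transcript lengths in kilobases (same order as counts)
    """
    if len(counts) != len(lengths_kb):
        raise ValueError("counts and lengths_kb must have the same length")

    # Step 1: correct for transcript length (reads per kilobase).
    rates = [c / l for c, l in zip(counts, lengths_kb)]

    # Step 2: correct for sequencing depth by rescaling to a sum of 1e6.
    total = sum(rates)
    if total == 0:
        return [0.0] * len(rates)
    return [r / total * 1e6 for r in rates]


# Toy example: three transcripts whose counts scale exactly with length,
# so all three end up with the same TPM.
example_tpm = tpm([100, 200, 300], [1.0, 2.0, 3.0])
```

Within a single sample, step 2 only multiplies every transcript by the same constant, which is why the length correction alone already determines the ranking of transcripts.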
Maybe check the literature for #1 and #3: https://www.nature.com/articles/srep25533
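For #2, UMIs (unique molecular identifiers) can account for this: reads that carry the same UMI are treated as PCR copies of one original molecule and are counted only once.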
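To make the UMI idea concrete, here is a rough sketch of how PCR duplicates could be collapsed. It assumes reads have already been assigned to genes and that UMIs are compared by exact match; the function name `count_unique_molecules` and the toy reads are hypothetical. Dedicated tools such as UMI-tools also handle sequencing errors within the UMI and usually take mapping position into account, which this sketch does not.

```python
from collections import defaultdict


def count_unique_molecules(reads):
    """Count distinct UMIs per gene.

    reads -- iterable of (gene, umi) pairs, one pair per sequenced read
    Returns a dict mapping gene -> number of distinct UMIs observed.
    """
    umis_per_gene = defaultdict(set)
    for gene, umi in reads:
        # Reads with the same gene and UMI are assumed to be PCR copies
        # of one original molecule, so the set keeps only one of them.
        umis_per_gene[gene].add(umi)
    return {gene: len(umis) for gene, umis in umis_per_gene.items()}


# Toy example: gene A has three reads but only two distinct UMIs.
toy_reads = [("A", "AAC"), ("A", "AAC"), ("A", "GGT"), ("B", "AAC")]
molecule_counts = count_unique_molecules(toy_reads)
```

Because each original molecule is counted once regardless of how many times it was amplified, differences in amplification efficiency between genes no longer inflate the counts.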