Hi all,
I am looking for the best way to normalize RNA-Seq data generated with Lexogen's 3' technology. Because reads come only from the 3' end of each transcript, I do not need to correct for gene length.
However, the expression values should be approximately normally distributed, because I will go on to fit linear models in Matrix eQTL.
I've read quite a lot about RNA-Seq normalization, but there seem to be many options and opinions and not much information about my specific situation. For example, in https://pubmed.ncbi.nlm.nih.gov/31861985/ TPM seems to perform best (but I do not need to account for gene length), while log10 was judged poor with regard to biological signal.
Starting from raw counts, I think the first step is to normalize between samples, using one of:
- DESeq / TMM / quantile normalization
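To make the between-sample step concrete, here is a minimal Python sketch of median-of-ratios size factors, which is the idea behind DESeq's normalization. It is an illustration under simplifying assumptions, not DESeq's actual implementation: the function names `size_factors` and `normalize_counts` and the toy count matrix are made up for this example, and genes with a zero count in any sample are simply skipped when estimating the factors.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factor per sample.

    counts: list of genes, each a list of raw counts (one per sample).
    """
    n_samples = len(counts[0])
    # Keep only genes with non-zero counts in every sample (log(0) is undefined).
    log_rows = [[math.log(c) for c in row]
                for row in counts if all(c > 0 for c in row)]
    if not log_rows:
        raise ValueError("no gene has non-zero counts in all samples")
    # Log of each gene's geometric mean across samples (the pseudo-reference).
    log_ref = [sum(row) / n_samples for row in log_rows]
    factors = []
    for j in range(n_samples):
        # Log ratio of sample j to the pseudo-reference, gene by gene.
        log_ratios = [row[j] - ref for row, ref in zip(log_rows, log_ref)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize_counts(counts):
    """Divide each sample's counts by that sample's size factor."""
    sf = size_factors(counts)
    return [[c / s for c, s in zip(row, sf)] for row in counts]


# Toy matrix (hypothetical): 4 genes x 2 samples, sample 2 sequenced about twice as deep.
toy = [[10, 20], [100, 200], [5, 10], [0, 3]]
print(size_factors(toy))
print(normalize_counts(toy))
```

On real data one would normally use the established R packages for this step; the sketch only shows what "normalizing between samples" does to the count matrix.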
After that, I think I need a further transformation to make the data approximately normal, for example:
- log, VST, or inverse rank transformation
I would be grateful for any advice!
I found some advice in the Matrix eQTL FAQ:
The paragraph "Outliers in expression. Quantile normalization and Kruskal-Wallis test." also suggests some R code for the normalization.
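As an illustration of the inverse rank transformation mentioned in the question, here is a minimal Python sketch (not the FAQ's R code) that replaces each gene's values across samples with normal quantiles of their ranks. The function names are hypothetical, and dividing the ranks by n + 1 is one common convention that I am assuming here; ties receive the average of their ranks.

```python
from statistics import NormalDist


def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def inverse_normal_transform(values):
    """Map one gene's values to standard normal quantiles of their ranks."""
    n = len(values)
    std_normal = NormalDist()
    # rank / (n + 1) lies strictly between 0 and 1, so inv_cdf is always defined.
    return [std_normal.inv_cdf(r / (n + 1)) for r in average_ranks(values)]


def transform_genes(expr):
    """Apply the transform to each gene (row) independently across samples."""
    return [inverse_normal_transform(row) for row in expr]


# Toy input (hypothetical): 2 genes x 4 samples of already normalized expression.
toy = [[5.2, 1.1, 3.3, 40.0], [0.0, 0.0, 2.5, 1.0]]
print(transform_genes(toy))
```

Because only the ranks are used, the extreme value 40.0 in the first toy gene ends up no further out than the top normal quantile, which is the reason this kind of transform is attractive when outliers would otherwise dominate a linear model.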