Hello all,
I want to use RNA-seq data from the recently released BeatAML 2.0 cohort. This is the second iteration of the dataset, and it has been "harmonized" to accommodate the addition of new RNA-seq samples. Specifically, the counts have been normalized using conditional quantile normalization (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3297825/). Presumably this corrects for variability in GC content between samples, but it notably lacks the inter-sample normalization provided by edgeR or DESeq2.
Since edgeR and DESeq2 ideally take raw counts as input, I was wondering whether these harmonized counts are suitable. If not, is there an alternative normalization method that would be recommended for this particular situation?
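To make concrete what I mean by inter-sample normalization, here is a minimal Python sketch in the spirit of the median-of-ratios size factors that DESeq2 estimates. It is an illustration only, not DESeq2's actual code; the function name `median_of_ratios_size_factors` and the toy matrix `raw` are hypothetical and are not taken from BeatAML. The point is that the calculation assumes raw counts as its starting point.

```python
import math
from statistics import median


def median_of_ratios_size_factors(counts):
    """Estimate one size factor per sample from raw counts.

    counts: list of genes, each a list of raw counts (one per sample).
    Hypothetical helper for illustration; not part of any library.
    """
    # Keep only genes with a nonzero count in every sample,
    # so that every logarithm below is defined.
    expressed = [row for row in counts if all(c > 0 for c in row)]
    if not expressed:
        raise ValueError("no gene has a nonzero count in every sample")

    n_samples = len(expressed[0])
    log_ratios = [[] for _ in range(n_samples)]
    for row in expressed:
        logs = [math.log(c) for c in row]
        # Log of the geometric mean of this gene across samples.
        log_geo_mean = sum(logs) / n_samples
        for j, value in enumerate(logs):
            log_ratios[j].append(value - log_geo_mean)

    # Size factor = median ratio of a sample to the per-gene geometric mean.
    return [math.exp(median(r)) for r in log_ratios]


# Toy data (made up): sample 2 was sequenced twice as deeply as sample 1.
raw = [
    [10, 20],
    [100, 200],
    [30, 60],
    [0, 5],  # contains a zero, so it is excluded from the estimate
]
size_factors = median_of_ratios_size_factors(raw)
normalized = [[c / s for c, s in zip(row, size_factors)] for row in raw]
print([round(s, 4) for s in size_factors])  # approximately [0.7071, 1.4142]
```

In this toy case, dividing by the size factors makes the expressed genes equal across the two samples. My worry is that running this kind of step on values that have already been transformed would not mean the same thing.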
Thanks!
This seems like a pretty nuanced issue. I would recommend posting on the Bioconductor support site, where the DESeq2/edgeR developers are very active.