Hello, I want to run WGCNA or Spearman correlation analyses across multiple groups in the TCGA Pan-Cancer data.
I am currently using the Xena browser to download data and cBioPortal to examine mutations. I am undecided between two expression datasets. The first is "tcga_RSEM_gene_tpm", which is already TPM normalized and contains about 60k features, but is probably not batch corrected, and correcting it myself will be difficult with so many features. The second is "EB++AdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.xena", which has already had batch effects removed, but contains far fewer features and will need additional normalization.
I appreciate any help you can provide.
Thank you.
Yes, of course. It's just very convenient and fast, much faster than downloading from publications or from GDC itself, and it's good to save some time. As long as I acknowledge how they processed the data, why shouldn't I use Xena or cBioPortal?
Just a small note about this method. After running my analyses, I will need to intersect the results (the genes) with 2 additional data frames, both of which are TPM/FPKM normalized. If I understand correctly, this dataset cannot be converted to TPM/FPKM unless I start from raw counts and use each transcript's length for the conversion. Am I correct?
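To make the conversion in question concrete, here is a minimal sketch of the standard counts-to-TPM calculation for a single sample. It assumes raw read counts and per-gene (or per-transcript) effective lengths in base pairs are available; the function name `counts_to_tpm` and the example numbers are hypothetical and are not taken from either Xena dataset.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert one sample's raw counts to TPM.

    counts     -- raw read counts, one per gene (assumed input)
    lengths_bp -- effective gene/transcript lengths in base pairs, same order
    """
    if len(counts) != len(lengths_bp):
        raise ValueError("counts and lengths_bp must have the same length")

    # Step 1: length-normalize each gene (reads per kilobase).
    rpk = [c / (length / 1000.0) for c, length in zip(counts, lengths_bp)]

    total = sum(rpk)
    if total == 0:
        raise ValueError("all counts are zero; TPM is undefined")

    # Step 2: rescale so the sample's values sum to one million.
    scale = total / 1e6
    return [r / scale for r in rpk]


# Hypothetical toy sample: three genes with made-up counts and lengths.
example_counts = [100, 300, 600]
example_lengths = [1000, 2000, 3000]
tpm = counts_to_tpm(example_counts, example_lengths)
print([round(v, 1) for v in tpm])
```

The function would be applied to each sample separately. Both steps depend on the lengths and on the sample's full set of counts, which is why values that have already been normalized and batch adjusted cannot simply be rescaled into TPM afterwards.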