Hello! I'm analyzing the hepatocellular carcinoma data from TCGA. I downloaded it with the TCGAbiolinks package, normalized it, and ran a differential expression analysis using the edgeR method included in TCGAbiolinks. I now want to run GSEA with the GSEA desktop software to see which pathways and GO terms are enriched. I have two options: I can either supply the normalized expression matrix (one column per sample) together with the phenotype file, or I can run a preranked GSEA on the list of differentially expressed genes ranked by logFC.
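For concreteness, here is a rough sketch of how I imagine preparing the inputs for the two options. It is only an illustration: the column names `gene` and `logFC`, the helper names, and the choice to keep the largest absolute logFC when a gene appears twice are my own assumptions, not something TCGAbiolinks or GSEA prescribes. The `.rnk` file is a two-column, tab-separated list of gene and ranking metric, and the `.cls` file is the three-line phenotype file that goes with the expression matrix.

```python
import csv
import io


def build_rnk(de_rows, gene_col="gene", metric_col="logFC"):
    """Return (gene, metric) pairs sorted from most up- to most down-regulated.

    de_rows: iterable of dicts, e.g. from csv.DictReader over the DE table.
    Column names are assumptions; adjust them to the actual table.
    """
    best = {}
    for row in de_rows:
        gene = (row.get(gene_col) or "").strip()
        try:
            value = float(row.get(metric_col))
        except (TypeError, ValueError):
            continue  # skip rows whose metric is missing or "NA"
        if not gene or value != value:  # drop blank names and NaN
            continue
        # Assumption: for duplicated genes keep the largest |metric|.
        if gene not in best or abs(value) > abs(best[gene]):
            best[gene] = value
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)


def write_rnk(pairs, handle):
    """Write the ranked list as tab-separated gene<TAB>metric lines."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for gene, value in pairs:
        writer.writerow([gene, f"{value:.6g}"])


def build_cls(labels):
    """Build the text of a categorical .cls phenotype file.

    labels: one class label per sample, in the same order as the
    sample columns of the expression matrix.
    """
    classes = list(dict.fromkeys(labels))  # unique labels, first-seen order
    header = f"{len(labels)} {len(classes)} 1"
    return "\n".join([header, "# " + " ".join(classes), " ".join(labels)]) + "\n"


# Toy example with made-up numbers (not real TCGA results).
toy_rows = [
    {"gene": "AFP", "logFC": "3.2"},
    {"gene": "ALB", "logFC": "-1.5"},
    {"gene": "GPC3", "logFC": "4.1"},
    {"gene": "AFP", "logFC": "-0.4"},   # duplicate, smaller |logFC|
    {"gene": "TP53", "logFC": "NA"},    # missing metric, skipped
]
buffer = io.StringIO()
write_rnk(build_rnk(toy_rows), buffer)
print(buffer.getvalue())
print(build_cls(["normal", "normal", "tumor"]))
```

In practice I would read the DE table with `csv.DictReader` and write to a real file instead of a `StringIO` buffer. I assume the gene identifiers also need to match those used in the gene set collection (for example gene symbols rather than Ensembl IDs), so an ID conversion step may be needed first.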
I'm trying to figure out which is more convenient or more statistically sound. I read here that preranked is ideal when you have fewer than 3 samples per phenotype, but that's not the case here (50 normal and 373 tumor). So is there one I should pick over the other? My instinct tells me the standard analysis is better, because the preranked one is biased by the preceding differential expression analysis, but I'm kind of just guessing on that. I'm open to any advice.
Thanks in advance!
Thanks! I just wanted to see which pathways are up in tumor samples and which in normal samples, but I guess I could just do both, as you said.