As a beginner in bioinformatics, I'm currently using R to conduct RNA-seq analysis. After performing DESeq2 analysis, I executed the following code for GSEA analysis:
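library(DESeq2)
library(clusterProfiler)
# DESeq2 results for the exp vs. cont contrast
res <- as.data.frame(results(dds, contrast = c("condition", "exp", "cont")))
# Drop genes with no fold change, then rank by log2FoldChange (descending), as gseGO expects
res <- res[!is.na(res$log2FoldChange), ]
res <- res[order(res$log2FoldChange, decreasing = TRUE), ]
# Assumption: gene_list (not defined in the snippet as first posted) is a named vector keyed by Ensembl ID
gene_list <- setNames(res$log2FoldChange, rownames(res))
gse_bp <- gseGO(gene_list, ont = "BP", keyType = "ENSEMBL", OrgDb = "org.Hs.eg.db", eps = 1e-300)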
This resulted in a file with columns for ID, Description, setSize, enrichmentScore, NES, pvalue, p.adjust, qvalue, rank, leading_edge, and core_enrichment. I have several questions regarding the interpretation and further analysis of these results:
How do we determine the cutoff in GSEA analysis? Is using an adjusted p-value (padj) of 0.05 as the cutoff appropriate?
Is the q-value equivalent to the False Discovery Rate (FDR)?
For the Normalized Enrichment Score (NES), what criteria are used to decide if a gene set is considered up-regulated?
What standards are applied when ranking in GSEA analysis? For instance, is there a formula like "-padj x NES" used for this purpose?
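To make the questions above concrete, here is a minimal sketch of the filtering and ordering I have in mind. It uses a small toy table in place of the real gseGO output, so the IDs and numbers are invented for illustration only. The 0.05 cutoff, the use of the NES sign as direction, and the ordering by absolute NES are my assumptions, and they are exactly what I am asking about.

```r
# Toy stand-in for as.data.frame(gse_bp); IDs and values are invented for illustration
gsea_tbl <- data.frame(
  ID       = c("GO:A", "GO:B", "GO:C", "GO:D"),
  NES      = c(2.1, -1.8, 1.2, -2.4),
  p.adjust = c(0.001, 0.030, 0.200, 0.004),
  stringsAsFactors = FALSE
)

padj_cutoff <- 0.05  # assumed cutoff (my first question)

# Keep only gene sets that pass the adjusted p-value cutoff
sig <- gsea_tbl[gsea_tbl$p.adjust < padj_cutoff, ]

# Split by the sign of NES (assumption: positive = up in "exp", negative = down)
up   <- sig[sig$NES > 0, ]
down <- sig[sig$NES < 0, ]

# Order the significant sets by absolute NES, strongest first (assumed ranking rule)
sig_ranked <- sig[order(abs(sig$NES), decreasing = TRUE), ]
print(sig_ranked)
```

With the real results, gsea_tbl would be replaced by as.data.frame(gse_bp), which has the same NES and p.adjust columns.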
Thank you very much for shedding light on these matters.