Hello,
I have some questions about GO and KEGG GSEA in RStudio.
What should the input gene list be for the gse function? At the moment I am using this:
geneList <- geneList[abs(dataGSEA$log2FoldChange) > 1 & (dataGSEA$padj) < 0.05]
And then I sort the list:
geneList = sort(geneList, decreasing = TRUE)
Is this correct? Should I use only the up-regulated genes?
Moreover, I'm still confused about the difference between GSEA and ORA (from a practical point of view):
- When should I use ORA instead of GSEA? Should I use both?
- For the enrich function, should I filter the genes (by abs log2FoldChange and padj) as I do for the gse analysis?
And one last question:
- What should the input list be for the fgsea function, and should it be sorted?
Thank you very much in advance
I read the guide before posting and again after your comment, but I swear it is still not clear to me whether I have to apply any filtering steps to my gene list before performing ORA and/or GSEA.
For ORA, yes: you need to select a subset of genes of interest. Examples include the statistically significant up-regulated genes or a cluster of co-expressed genes.
For GSEA, you do not filter your ranked gene list, though lowly expressed genes may be pre-filtered before an upstream step such as a differential expression test.
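To make the difference concrete, here is a minimal base R sketch of how the two inputs could be built from the same results table. The toy values, the `gene` column, and the names `rankedList`, `oraGenes` and `universe` are assumptions for illustration only; ranking by log2FoldChange is one common choice, not the only one.

```r
# Hypothetical DESeq2-style results table; all values are made up.
dataGSEA <- data.frame(
  gene = c("G1", "G2", "G3", "G4", "G5", "G6"),
  log2FoldChange = c(2.5, -1.8, 0.3, 1.2, -0.1, NA),
  padj = c(0.001, 0.01, 0.60, 0.20, 0.90, NA),
  stringsAsFactors = FALSE
)

# Drop genes that have no usable ranking statistic.
res <- dataGSEA[!is.na(dataGSEA$log2FoldChange), ]

# GSEA input: every tested gene, named by gene ID and sorted in
# decreasing order. No log2FoldChange or padj cutoff is applied.
rankedList <- setNames(res$log2FoldChange, res$gene)
rankedList <- sort(rankedList, decreasing = TRUE)

# ORA input: only the genes of interest, selected with the cutoffs
# from the question, plus the background of all tested genes.
isSig <- !is.na(res$padj) & res$padj < 0.05 & abs(res$log2FoldChange) > 1
oraGenes <- res$gene[isSig]
universe <- res$gene
```

A named, sorted vector like `rankedList` is the kind of object the gse functions and fgsea expect as the ranked statistics, while `oraGenes` (ideally together with `universe` as the background) is what you would pass to the enrich functions.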
Thank you very much!!
Could you explain the pre-filtering step in more detail?
https://bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#pre-filtering-the-dataset
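As a rough sketch of what such a pre-filtering step can look like, the base R snippet below keeps only genes with a minimum read count in a minimum number of samples, applied to the raw count matrix before the differential expression test. The matrix, the threshold of 10 reads, and the group size of 3 are assumptions for illustration; choose values that fit your own design.

```r
# Hypothetical raw count matrix: 4 genes x 6 samples (3 per group).
counts <- matrix(
  c(120,  98, 143, 210, 187, 230,
      0,   1,   0,   2,   0,   1,
     15,  22,  11,   3,   0,   4,
      0,   0,   0,   0,   0,   0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("G", 1:4), paste0("S", 1:6))
)

smallestGroupSize <- 3  # assumed size of the smallest experimental group
minCount <- 10          # assumed minimum read count per sample

# Keep genes with at least minCount reads in at least
# smallestGroupSize samples; everything else is dropped.
keep <- rowSums(counts >= minCount) >= smallestGroupSize
filtered <- counts[keep, , drop = FALSE]
```

Note that this filter uses expression level only, not fold change or p-values, so it does not conflict with the advice to leave the ranked GSEA list unfiltered.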
If my answer was helpful, please upvote and accept it.