I would like to do functional gene set enrichment analysis in clusterProfiler via gseGO and gseKEGG. These functions need a list of genes, which I'm planning to rank by log fold change. Should the gene list contain all genes, or should it contain only genes below a significance cutoff (e.g. padj < 0.05)?
My previous answer was based on a misunderstanding, so I deleted it to avoid confusion. It is best to use all of the genes in the dataset as the input for GSEA. GSEA does not need a prior significance cutoff: it works on the full ranked list and tests whether the genes of each set are concentrated at the top or bottom of that ranking.
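As a rough illustration of the "all genes" approach, here is a minimal sketch in base R that builds the ranked input from a differential expression table. The table `res`, its column names (`gene`, `log2FC`, `padj`) and the values are made up for illustration; real column names depend on the DE tool used.

```r
# Toy DE results table; gene IDs and values are hypothetical.
res <- data.frame(
  gene   = c("g1", "g2", "g3", "g4", "g5"),
  log2FC = c(1.8, -2.3, 0.1, NA, -0.4),
  padj   = c(0.001, 0.0004, 0.9, NA, 0.3),
  stringsAsFactors = FALSE
)

# Keep every gene that has a usable ranking statistic.
# Note: no filtering on padj here.
res <- res[!is.na(res$log2FC) & !duplicated(res$gene), ]

# Named numeric vector sorted in decreasing order,
# which is the input format gseGO and gseKEGG expect.
gene_list <- sort(setNames(res$log2FC, res$gene), decreasing = TRUE)

# The vector would then be passed on, for example (assuming human
# Entrez IDs as names and the org.Hs.eg.db annotation package):
# clusterProfiler::gseGO(geneList = gene_list, OrgDb = org.Hs.eg.db,
#                        ont = "BP", keyType = "ENTREZID")
```

Only genes without a fold change estimate are dropped; non-significant genes stay in the ranking.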
It's common to keep only the significant genes (padj < 0.05) and split them into upregulated and downregulated genes (logFC > 1.0 or logFC < -1.0).
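A minimal sketch of that filtering step in base R, again using a made-up table `res` with hypothetical column names (`gene`, `log2FC`, `padj`). Gene vectors like these are the kind of input that over-representation functions such as enrichGO take, rather than the ranked vector used by gseGO.

```r
# Toy DE results table; gene IDs and values are hypothetical.
res <- data.frame(
  gene   = c("g1", "g2", "g3", "g4", "g5"),
  log2FC = c(1.8, -2.3, 0.1, NA, -0.4),
  padj   = c(0.001, 0.0004, 0.9, NA, 0.3),
  stringsAsFactors = FALSE
)

# Keep only genes passing the significance cutoff.
sig <- res[!is.na(res$padj) & res$padj < 0.05, ]

# Split by direction using the fold change thresholds.
up   <- sig$gene[sig$log2FC >  1.0]
down <- sig$gene[sig$log2FC < -1.0]
```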