Cut-off for genesets in GSEA
7.0 years ago

Hi, I'm running GSEA on a pre-ranked list of 260 differentially expressed genes, following the GSEA instructions, but it is unclear to me how to set the Max Size and Min Size parameters for excluding gene sets. The GSEA user guide notes:

"defaults that are appropriate for datasets with 10,000 to 20,000 features. To change these default values, use the Max Size and Min Size parameters on the Run GSEA Page..."

Now, I don't know how these values should be set for my case with 260 genes.

Thanks in advance.

GSEA • 6.2k views

Hi Shamim, the preranked file should contain differential expression scores for all detected genes. This is normally 10 to 20 thousand genes. Do not exclude genes with FDR > 0.05.
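A minimal sketch of building such a ranked list from a full differential expression table, without any FDR filtering. The column names (`gene`, `log2fc`, `pvalue`) and the scoring metric, sign(log2FC) × −log10(p-value), are assumptions for illustration and are not prescribed in this thread; any score that orders all detected genes could be used instead.

```python
import math

def rank_all_genes(de_rows):
    """Return (gene, score) pairs for every detected gene, sorted high to low.

    The score sign(log2FC) * -log10(p-value) is an assumed metric,
    chosen only for illustration.
    """
    ranked = []
    for row in de_rows:
        p = max(float(row["pvalue"]), 1e-300)  # guard against log10(0)
        lfc = float(row["log2fc"])
        sign = 1.0 if lfc >= 0 else -1.0
        ranked.append((row["gene"], sign * -math.log10(p)))
    ranked.sort(key=lambda pair: pair[1], reverse=True)
    return ranked

def to_rnk_text(ranked):
    """Format the ranking as two tab-separated columns: gene, score."""
    return "".join(f"{gene}\t{score:.6g}\n" for gene, score in ranked)

# Toy table with hypothetical genes; note that the non-significant
# GENE_C is kept in the ranking rather than filtered out.
de_rows = [
    {"gene": "GENE_A", "log2fc": 2.1, "pvalue": 1e-8},
    {"gene": "GENE_B", "log2fc": -1.4, "pvalue": 1e-4},
    {"gene": "GENE_C", "log2fc": 0.1, "pvalue": 0.8},
]
ranked = rank_all_genes(de_rows)
rnk_text = to_rnk_text(ranked)
```

The resulting text can be saved with a `.rnk` extension; the point is only that every detected gene appears in it, significant or not.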


Hi Mark, thank you for your answer. I know about the cut-off for the DEGs, but my question is specifically about the cut-offs for the gene sets that GSEA includes or excludes. In the "Basic field setting" of GSEA there are two options for the size of the gene sets considered in the analysis, and I don't know how to set them for my gene list. The defaults are Max size: exclude larger sets = 500 and Min size: exclude smaller sets = 15.

As the GSEA user guide notes, these values are appropriate for datasets with 10,000 to 20,000 features. I don't have a full dataset; instead I have a list of significant genes, and I really don't know how these options should be set.


Oh OK, so you want to exclude gene sets with fewer than 10 or 15 members. They are just unreliable.

You shouldn't exclude gene sets because they have more than X members. This GSEA feature is puzzling to me. I set the larger threshold to something really large like 5000. You can count the number of gene set members with a script to see what might be excluded from analysis.
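Such a counting script could look roughly like the sketch below. It assumes the gene sets are in GMT format (tab-separated: set name, description, then member genes) and that the size limits are applied to the members that are actually present in the ranked list, which is how I understand the GSEA documentation. The function names `parse_gmt` and `size_report` and the toy gene sets are hypothetical.

```python
def parse_gmt(lines):
    """Parse GMT lines into {set_name: set_of_genes}."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        gene_sets[fields[0]] = {g for g in fields[2:] if g}
    return gene_sets

def size_report(gene_sets, ranked_genes, min_size=15, max_size=500):
    """Count members per set, in total and within the ranked list,
    and flag whether the set would pass the size limits."""
    universe = set(ranked_genes)
    report = {}
    for name, genes in gene_sets.items():
        overlap = len(genes & universe)
        report[name] = {
            "total": len(genes),
            "in_list": overlap,
            "kept": min_size <= overlap <= max_size,
        }
    return report

# Toy example with tiny limits so the effect is visible.
gmt_lines = [
    "SET_SMALL\tna\tG1\tG9\n",
    "SET_OK\tna\tG1\tG2\tG3\tG8\n",
    "SET_BIG\tna\tG1\tG2\tG3\tG4\tG5\n",
]
ranked_genes = ["G1", "G2", "G3", "G4", "G5"]
report = size_report(parse_gmt(gmt_lines), ranked_genes, min_size=2, max_size=4)
for name, row in report.items():
    print(name, row["total"], row["in_list"], "kept" if row["kept"] else "excluded")
```

With a ranked list of only 260 genes, the `in_list` count of most sets would likely fall below the minimum, which is another reason to rank all detected genes.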

Does that make sense?


Excluding gene sets with fewer than 15 members is the GSEA default, not my cut-off! Honestly, it's still puzzling to me. I think I should read the GSEA user guide again and consider your notes. Thank you, Mark.

7.0 years ago
alserg ▴ 980

The cut-offs are more like performance parameters. By limiting the min and max set sizes you limit the time required for the analysis. For large gene sets, estimation is computationally harder than for smaller gene sets.
