Hypergeometric test - Defining the gene universe
7.2 years ago
RossCampbell ▴ 140

When doing a hypergeometric test for pathway enrichment, is there a generally accepted way of defining the total "gene universe"? I am debating two possible numbers: 1) the number of probes on the microarray that was used to generate the data in the first place, and 2) the total number of genes in the model organism used. Any thoughts on the most appropriate approach?

Gene set enrichment hypergeometric test • 2.7k views

I would recommend using as the universe all genes that are detected in at least one of your samples (microarray/RNA-seq).
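As a rough illustration of that suggestion, here is a minimal Python sketch that builds the universe as the union of genes detected in any sample and then restricts a pathway to it. The sample names, gene symbols, and the `detected` mapping are made up for the example; how you call a gene "detected" (intensity or count threshold) is up to you and is not shown here.

```python
# Hypothetical detection calls: sample name -> set of genes detected in that sample.
detected = {
    "sample_A": {"TP53", "BRCA1", "MYC"},
    "sample_B": {"TP53", "EGFR"},
}

# Universe = every gene detected in at least one sample.
universe = set().union(*detected.values())

# Hypothetical pathway annotation; KRAS was never detected, so it drops out.
pathway = {"TP53", "EGFR", "KRAS"}
pathway_in_universe = pathway & universe

print(len(universe), sorted(pathway_in_universe))
```

The same intersection should be applied to your list of significant genes, so that every count going into the test refers to the same universe.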

7.2 years ago
Renesh ★ 2.2k

In enrichment analysis, choosing the right background gene set is critical. Differences in the background directly affect the statistical significance (P-values) and, ultimately, the biological inference.

If you use all genes from the genome, you will get highly significant P-values. If you instead use only the genes that make up your pathway categories, you will get more robust and reliable results.

So if you are using a microarray for your analysis, use only the genes that are represented on the chip as your background. Using all genes from the whole genome as the reference background is not recommended: genes that could never have been measured inflate the background and make the P-values look more significant than they should.
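To see the effect of the background size, here is a small Python sketch of the one-sided hypergeometric test using only the standard library. All the numbers (20000 genes in the genome, 8000 on the array, a pathway of 100 genes, 200 significant genes, 8 of them in the pathway) are invented for illustration, and the function name `hypergeom_enrichment_p` is my own. For simplicity the pathway size is kept the same in both cases; in practice you would also restrict the pathway to the genes in the chosen universe.

```python
from math import comb


def hypergeom_enrichment_p(universe_size, pathway_size, selected_size, overlap):
    """Upper-tail P(X >= overlap) when selected_size genes are drawn without
    replacement from a universe containing pathway_size pathway genes."""
    max_overlap = min(pathway_size, selected_size)
    total = comb(universe_size, selected_size)
    tail = sum(
        comb(pathway_size, i) * comb(universe_size - pathway_size, selected_size - i)
        for i in range(overlap, max_overlap + 1)
    )
    return tail / total


# Illustrative numbers only (assumptions, not from any real data set).
pathway_size, selected_size, overlap = 100, 200, 8

p_genome = hypergeom_enrichment_p(20000, pathway_size, selected_size, overlap)
p_array = hypergeom_enrichment_p(8000, pathway_size, selected_size, overlap)

print(f"whole-genome background: p = {p_genome:.3g}")
print(f"array background:        p = {p_array:.3g}")
```

With the same overlap, the larger whole-genome background yields the smaller P-value, which is the inflation described above.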
