clusterProfiler: What are GeneRatio and BgRatio?
8.1 years ago
ZheFrench ▴ 590

Question is in the title.

GeneRatio is like M/N, where M is the number of genes from your input list that match the GO term. But I don't see what N is.

BgRatio is like A/B, where B is all genes in the database, but I'm not sure what A corresponds to. Is it the number of genes in the database that are annotated to this GO term?

Tell me if I'm wrong. Thanks.

clusterProfiler • 51k views
6.8 years ago
molla.linda ▴ 270

Here is an example that helped me understand this. I was also looking for the answer, and Guangchuang's link helped.

Let's suppose I have a collection of genesets called HALLMARK, and that it contains a specific geneset called E2F_targets.

BgRatio, M/N.

M = size of the geneset (e.g. the size of E2F_targets), i.e. the number of genes in the background distribution that are annotated (either directly or indirectly) to the node of interest.

N = number of unique genes across the whole collection of genesets (e.g. the HALLMARK collection), i.e. the total number of genes in the background distribution (universe).

GeneRatio is k/n.

k = size of the overlap between the vector of gene IDs you input and the specific geneset (e.g. E2F_targets), counting unique genes only; i.e. the number of genes in your list of n that are annotated to the node.

n = size of the overlap between the vector of gene IDs you input and all members of the collection of genesets (e.g. the HALLMARK collection), counting unique genes only; i.e. the size of the list of genes of interest.
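
The four quantities above can be reproduced by hand with base R set operations. This is only a sketch on made-up data: the gene letters and the second geneset (other_set) are hypothetical toy values, not real HALLMARK content, and the code mimics the definitions in this answer rather than calling clusterProfiler itself.

```r
# Toy collection of genesets (hypothetical data, not real HALLMARK content)
collection <- list(
  E2F_targets = c("a", "b", "c", "d", "e"),
  other_set   = c("d", "e", "f", "g", "h", "i", "j")
)

# Input gene list: "a" is duplicated, "x" and "y" are in no geneset
input <- c("a", "b", "d", "f", "x", "y", "a")

# Universe = all unique genes across the whole collection
universe <- unique(unlist(collection))

M <- length(unique(collection$E2F_targets))            # size of the geneset
N <- length(universe)                                  # size of the universe
k <- length(intersect(input, collection$E2F_targets))  # unique input genes in the geneset
n <- length(intersect(input, universe))                # unique input genes in the universe

cat(sprintf("GeneRatio = %d/%d, BgRatio = %d/%d\n", k, n, M, N))
```

Note that n is 4 here, not 7: the duplicate "a" is counted once, and "x" and "y" are dropped because they do not appear anywhere in the collection.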


I'm a little confused about these terms.

When I've used the same gene set, why do my values of n and N change when doing gene ontology for different categories?

For example, for the same gene list in an overrepresentation test, the Biological Process term taxis has a GeneRatio of 209/3770 and a BgRatio of 440/12553, but the Cellular Component term extracellular matrix has a GeneRatio of 162/3963 and a BgRatio of 339/13183. Shouldn't the n and N values stay the same across GO categories?

Cheers


Yeah I have the same problem. I don't really understand why the small n is changing then?


I am also struggling with the same problem (i.e. n and N are changing). Have you figured it out?
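
One likely explanation, offered as an assumption about how clusterProfiler handles this rather than something confirmed in this thread: genes with no annotation in the chosen ontology are dropped from both the input list and the background before the ratios are computed, so n and N depend on the ontology. The base R sketch below uses two hypothetical toy annotation tables (bp and cc, with invented gene IDs) to show the effect; it does not call clusterProfiler.

```r
# Hypothetical annotations for two ontologies (invented gene IDs and terms)
bp <- list(taxis = c("g1", "g2", "g3"), other_bp = c("g3", "g4", "g5", "g6"))
cc <- list(ecm = c("g2", "g7"), other_cc = c("g7", "g8", "g9"))

# The same input gene list is used for both ontologies
genes <- c("g1", "g2", "g3", "g7", "g10")

# n = input genes annotated in the ontology, N = all genes annotated in it
# (assumed behaviour: unannotated genes are excluded from both counts)
ratio_sizes <- function(genes, annotation) {
  universe <- unique(unlist(annotation))
  c(n = length(intersect(genes, universe)), N = length(universe))
}

sizes_bp <- ratio_sizes(genes, bp)
sizes_cc <- ratio_sizes(genes, cc)
print(sizes_bp)
print(sizes_cc)
```

With the same five input genes, the two toy ontologies give different n and N simply because different genes happen to be annotated in each.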

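library(clusterProfiler)
genes <- letters[1:15]  # 15 genes of interest: "a" to "o"
# Toy collection: genesetX = a..j (10 genes), genesetY = b..z (25 genes)
gs_df <- data.frame("gs_name"=c(rep("genesetX", 10), rep("genesetY", 25)),
                    "entrez_gene"=c(letters[1:10], letters[2:26]))
# minGSSize=1 lowers the minimum geneset size so small toy sets are kept
enricher(gene = genes, TERM2GENE = gs_df, minGSSize=1)@result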

               ID Description GeneRatio BgRatio      pvalue    p.adjust       qvalue                      geneID Count
genesetX genesetX    genesetX     10/15   10/26 0.000565352 0.001130704 0.0005951074         a/b/c/d/e/f/g/h/i/j    10
genesetY genesetY    genesetY     14/15   25/26 1.000000000 1.000000000 0.5263157895 b/c/d/e/f/g/h/i/j/k/l/m/n/o    14

GeneRatio = k/n

  • k is the overlap between your genes-of-interest and the geneset
  • n is the number of unique genes-of-interest that appear in at least one geneset (all 15 in this example)

BgRatio=M/N

  • M is the number of genes within each geneset
  • N is the number of all unique genes across all genesets (universe)
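
The p-values in the result table above are consistent with a one-sided hypergeometric test on k, n, M and N. As a sketch, assuming that this is the test enricher runs (the thread does not state it), base R's phyper reproduces the pvalue column from the two ratios alone:

```r
# Values read off the result table: GeneRatio = k/n, BgRatio = M/N
n <- 15  # genes-of-interest
N <- 26  # universe size

# phyper(q, m, n, k): m = genes in the geneset, n = genes outside it,
# k = number of genes drawn; lower.tail = FALSE gives P(X > q),
# so q = k - 1 gives the probability of an overlap of at least k.

# genesetX: k = 10, M = 10
p_x <- phyper(10 - 1, 10, N - 10, n, lower.tail = FALSE)

# genesetY: k = 14, M = 25
p_y <- phyper(14 - 1, 25, N - 25, n, lower.tail = FALSE)

print(c(genesetX = p_x, genesetY = p_y))
```

genesetY gets a p-value of 1 because only one gene in the universe lies outside it, so any 15 genes drawn from the universe must overlap it by at least 14.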
2.7 years ago
sarahhp ▴ 40

Or perhaps in simpler terms GeneRatio = genes of interest in the gene set / total genes of interest. Most often I use it on lists of differentially expressed genes and so GeneRatio is also the fraction of differentially expressed genes found in the gene set.

I have struggled to find the right words to explain this to others, so I hope this helps!


What about BgRatio?
