Question

Significance of pathways

0

7.5 years ago

cerulean • 0

I have just been assigned to a project in bioinformatics, a field that is new to me. It involves the following:

There is a list of genes. These genes have been associated with pathways derived from the KEGG database. I also have the KEGG genes and their associated pathways. I have to calculate the significant pathways that are present in my dataset. For that, I have to do a hypergeometric test. After that, I have to select the pathways that have p-values less than 0.005.

What does it mean to select pathways using this cut-off? Since I already know that the genes in my dataset belong to certain pathways, why do I need to do a hypergeometric test? Why would it not be enough to detect the pathways present in my dataset by intersecting my gene set with KEGG's?

gene • 1.9k views

ADD COMMENT • link 7.5 years ago by cerulean • 0

3

Here is a simple example:

1. Take one KEGG pathway and list all the genes participating in that pathway.
2. Count how many of the genes in your list overlap with the genes of that pathway.
3. Now randomly select a gene set of the same size as your list from the full KEGG gene list.
4. Count how many of the randomly chosen genes overlap with the pathway, and compare that number with the one you got in step 2.

If the numbers are close, a randomly selected gene set overlaps the pathway about as much as your list does, so the overlap alone tells you very little. The hypergeometric test is there to confirm that the genes in your list represent a pathway not by chance but because of the condition you are testing.
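To make the comparison concrete, here is a minimal Python sketch of both the random-draw experiment and the exact hypergeometric tail probability it approximates. The function names and all the numbers (a background of 1000 genes, a pathway of 50, a list of 40 with 8 hits) are made up for illustration and do not come from KEGG or from your data. It also assumes that the background is the full set of KEGG genes you were given.

```python
import math
import random


def hypergeom_pvalue(overlap, list_size, pathway_size, background_size):
    """P(X >= overlap) when list_size genes are drawn without replacement
    from background_size genes, pathway_size of which are in the pathway."""
    total = math.comb(background_size, list_size)
    upper = min(list_size, pathway_size)
    tail = sum(
        math.comb(pathway_size, k)
        * math.comb(background_size - pathway_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def simulated_pvalue(overlap, list_size, pathway_size, background_size,
                     n_draws=20000, seed=0):
    """Estimate the same probability by drawing random gene lists."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_draws):
        # Genes are numbered 0..background_size-1; by convention here,
        # the first pathway_size of them stand in for the pathway members.
        sample = rng.sample(range(background_size), list_size)
        random_overlap = sum(1 for gene in sample if gene < pathway_size)
        if random_overlap >= overlap:
            hits += 1
    return hits / n_draws


# Hypothetical numbers, for illustration only.
background_size = 1000   # all KEGG genes considered
pathway_size = 50        # genes in the pathway of interest
list_size = 40           # genes in your list
overlap = 8              # genes from your list found in the pathway

exact_p = hypergeom_pvalue(overlap, list_size, pathway_size, background_size)
approx_p = simulated_pvalue(overlap, list_size, pathway_size, background_size)
print(f"exact p-value:     {exact_p:.5f}")
print(f"simulated p-value: {approx_p:.5f}")
print("below 0.005 cut-off:", exact_p < 0.005)
```

With these made-up numbers a random list of 40 genes would be expected to hit the pathway about twice (40 × 50 / 1000), so 8 hits is unusual and the p-value is small. An overlap of 2 would give a large p-value even though the pathway is still "present" in the list, which is why the plain intersection is not enough.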

This is a very helpful guide.

ADD REPLY • link 7.5 years ago by venu 7.1k

1

I have just been assigned to a project

Was this not something you wanted to do? In addition, you are being told in detail what you need to do. So what is the purpose of this exercise? Are you expected to learn something in the process, or just to complete the task at hand?

Take a look at some helpful GO enrichment analysis materials here. The same principles apply in your case as well. Some useful tools are listed in this Wikipedia link.

ADD REPLY • link 7.5 years ago by GenoMax 148k