I have two gene lists, A and B, each with 13 members. List A contains the main genes, whose functions I am confident about, whereas list B contains the genes I want to study.
I would like to run a pathway analysis on list B to find out how similar it is to list A. I have come up with two approaches, but I am not sure whether they are correct and meaningful, so I am writing them up here to get your opinions and help.
First Approach:
1) Run a pathway enrichment analysis on each gene list separately, as follows:
For list X and pathway Y, we determine:
a = the number of genes shared by list X and pathway Y
b = the number of genes in pathway Y
c = the total number of genes across all pathways, minus b
d = the number of genes in list X
p-value = dhyper(a, b, c, d)
2) For each of lists A and B, find the pathways with p-value < 0.05, called pathway_A and pathway_B.
3) Find the common pathways between pathway_A and pathway_B.
Finally, list B is considered functionally related to list A if step (3) yields at least one common pathway.
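The first approach can be sketched as follows. This is only a minimal illustration in Python (standard library only), not a definitive implementation: the function names, the gene IDs `g0`..`g99`, and the pathways `P1`..`P3` are all made up for the example, and every list gene is assumed to belong to at least one pathway. `dhyper` here mirrors the argument order of R's `dhyper(a, b, c, d)`. Note that `dhyper` returns the probability of observing exactly `a` shared genes, while an over-representation p-value is usually taken as the upper tail, the probability of `a` or more, so the sketch computes both and uses the tail for the 0.05 cutoff.

```python
from math import comb


def dhyper(a, b, c, d):
    """P(exactly a shared genes); same argument order as R's dhyper."""
    return comb(b, a) * comb(c, d - a) / comb(b + c, d)


def enrichment_p(a, b, c, d):
    """Upper tail P(a or more shared genes), summed from the point probabilities."""
    return sum(dhyper(x, b, c, d) for x in range(a, min(b, d) + 1))


def significant_pathways(gene_list, pathways, alpha=0.05):
    """Steps 1 and 2: names of pathways with enrichment p-value below alpha."""
    universe = set().union(*pathways.values())  # all genes in all pathways
    genes = set(gene_list)
    hits = set()
    for name, members in pathways.items():
        a = len(genes & members)      # genes shared by the list and the pathway
        b = len(members)              # genes in the pathway
        c = len(universe) - b         # genes in all pathways, minus b
        d = len(genes)                # genes in the list
        if enrichment_p(a, b, c, d) < alpha:
            hits.add(name)
    return hits


# Hypothetical toy data: 100 genes split into three non-overlapping pathways.
all_genes = [f"g{i}" for i in range(100)]
pathways = {
    "P1": set(all_genes[:10]),
    "P2": set(all_genes[10:30]),
    "P3": set(all_genes[30:]),
}
list_a = ["g0", "g1", "g2", "g3", "g40"]
list_b = ["g4", "g5", "g6", "g50", "g60"]

pathway_a = significant_pathways(list_a, pathways)
pathway_b = significant_pathways(list_b, pathways)
common = pathway_a & pathway_b  # step 3
print(common)  # {'P1'} for this toy data
```

With real data, the p-values would normally also be corrected for testing many pathways at once; the sketch leaves that out to stay close to the steps above.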
Second Approach:
1) Find the pathways that include at least one gene from list A, called pathway_A.
2) Find the pathways that include at least one gene from list B, called pathway_B.
3) Find the common pathways between pathway_A and pathway_B, called pathway_AB.
4) For each pathway in pathway_AB:
4-1) Repeat the following steps 100 times:
4-1-1) Randomly select 13 genes from all genes in all pathways, called randGenes.
4-1-2) Find the pathways that include at least one gene from randGenes, called randPathway.
4-2) Count how many of the 100 iterations produce a randPathway that includes the pathway of interest selected in step (4); call this count n_rand.
4-3) p-value = n_rand/100.
We do steps (4-1) to (4-3) for every pathway in pathway_AB, so we get a p-value for each of them. Finally, list B is considered functionally related to list A if there is a pathway with p-value < 0.05.
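The second approach can be sketched in the same spirit. Again, this is only a minimal Python illustration under stated assumptions: the function names, the gene IDs, and the two pathways `small` and `large` are hypothetical, and a fixed random seed is used only so that the run is repeatable.

```python
import random


def pathways_hit(genes, pathways):
    """Names of pathways that contain at least one of the given genes."""
    genes = set(genes)
    return {name for name, members in pathways.items() if genes & members}


def permutation_p(pathway_name, pathways, list_size=13, n_iter=100, seed=0):
    """Steps 4-1 to 4-3: fraction of random gene sets that hit the pathway."""
    rng = random.Random(seed)
    universe = sorted(set().union(*pathways.values()))  # all genes in all pathways
    n_rand = 0
    for _ in range(n_iter):
        rand_genes = rng.sample(universe, list_size)       # step 4-1-1
        rand_pathway = pathways_hit(rand_genes, pathways)  # step 4-1-2
        if pathway_name in rand_pathway:                   # step 4-2
            n_rand += 1
    return n_rand / n_iter                                 # step 4-3


# Hypothetical toy data: a 5-gene pathway and a 95-gene pathway.
all_genes = [f"g{i}" for i in range(100)]
pathways = {"small": set(all_genes[:5]), "large": set(all_genes[5:])}
list_a = ["g0", "g10"]
list_b = ["g1", "g20"]

# Steps 1 to 3: pathways touched by both lists.
pathway_ab = pathways_hit(list_a, pathways) & pathways_hit(list_b, pathways)

# Step 4: one p-value per shared pathway.
p_values = {name: permutation_p(name, pathways) for name in sorted(pathway_ab)}
print(p_values)
```

Writing it out makes two properties of the procedure visible. The p-value computed in step (4) depends only on the pathway and on the list size, not on which genes are in A or B, so a large pathway gets a p-value near 1 whatever the lists contain (exactly 1.0 for `large` here, since 13 genes cannot all fall inside a 5-gene pathway). Also, with 100 iterations the smallest nonzero p-value is 0.01.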
Thank you!
Why do you have only 13 genes to start with in each of A and B? Secondly, you don't seem to account for genes that take part in more than one pathway, i.e. overlap between pathways.
Just to follow up, here is a good paper that addresses crosstalk, i.e. the overlap I mentioned above: http://m.genome.cshlp.org/content/23/11/1885.short