Question

Advice for gene overlap with direction between two sets.

5.6 years ago

simplitia ▴ 130

Hi, I have a scenario with 200 samples, each tested for 10,000 genes. There are two events, call them A and B, and for each sample an event can flag x genes, each as gene.up or gene.down. I want to test whether events A and B are similar, that is, whether their intersection is significant. To visualize this I draw a Venn diagram. Normally I would use a Fisher exact test or a hypergeometric test, but this case is unusual because a match has to account for sample, direction and gene. Each entry is coded like Sample1.gene.up, and only an identical key in both events is considered a match. My question is: what is the total population? For example, if the total number of genes were 1000, would the total population be 1000 * 2 * n samples? The factor of two is there because a gene can be up or down. The calculation would look something like this. I'm using R.

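# n1, n2: vectors of keys such as "Sample1.gene.up" for events A and B
total.sample = 200
q = length(intersect(n1, n2))    # size of the overlap
m = length(n1)
k = length(n2)
n = 1000 * 2 * total.sample - m  # population minus set A

# q - 1 because lower.tail = FALSE gives P(X > q); this yields P(X >= q)
phyper(q - 1, m, n, k, lower.tail = FALSE)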
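
To make the sample.gene.direction coding concrete, here is a minimal sketch on made-up data. The data frames `event_a` and `event_b`, the helper `make_keys`, and the variables `total.gene` and `universe` are hypothetical names introduced only for illustration, and the population size simply follows the 1000 * 2 * n samples assumption from the question; whether that is the right universe is the open point.

```r
# Hypothetical toy data: one row per significant (sample, gene, direction) call.
event_a <- data.frame(
  sample    = c("Sample1", "Sample1", "Sample2", "Sample3"),
  gene      = c("geneA",   "geneB",   "geneA",   "geneC"),
  direction = c("up",      "down",    "up",      "down")
)
event_b <- data.frame(
  sample    = c("Sample1", "Sample1", "Sample2", "Sample3"),
  gene      = c("geneA",   "geneB",   "geneA",   "geneC"),
  direction = c("up",      "up",      "up",      "down")
)

# Collapse each call into a single key such as "Sample1.geneA.up".
make_keys <- function(df) {
  unique(paste(df$sample, df$gene, df$direction, sep = "."))
}
n1 <- make_keys(event_a)
n2 <- make_keys(event_b)

total.gene   <- 1000
total.sample <- 200
universe     <- total.gene * 2 * total.sample  # assumed population size

q <- length(intersect(n1, n2))  # keys shared by A and B
m <- length(n1)                 # keys in A
k <- length(n2)                 # keys in B
n <- universe - m               # keys in the population that are not in A

# Upper tail P(X >= q); q - 1 because lower.tail = FALSE is strictly greater.
p <- phyper(q - 1, m, n, k, lower.tail = FALSE)
```

Here Sample1.geneB is down in A but up in B, so it does not count as a match and the overlap is three keys out of four.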

For a Fisher test it would look something like this:

total.sample = 200
m = matrix(c(
    1000 * 2 * total.sample,
    400,
    500,
    700
), nrow = 2)

fisher.test(m, alternative = "greater")
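
For comparison, this is a sketch of how a 2x2 table is usually laid out so that the one-sided Fisher test and the hypergeometric upper tail give the same p-value. The counts `q`, `m_a` and `k_b` below are invented for illustration, and `universe` again assumes the 1000 * 2 * n samples population from the question rather than a confirmed answer. In this layout the four cells sum to the population, so the population itself is not one of the cells.

```r
total.sample <- 200
universe <- 1000 * 2 * total.sample  # assumed population size

# Hypothetical counts of sample.gene.direction keys.
q   <- 5    # keys in both A and B
m_a <- 700  # keys in A
k_b <- 800  # keys in B

# Cells: in both, A only, B only, in neither (filled column by column).
tab <- matrix(
  c(q,       m_a - q,
    k_b - q, universe - m_a - k_b + q),
  nrow = 2,
  dimnames = list(A = c("in A", "not in A"), B = c("in B", "not in B"))
)

p_fisher <- fisher.test(tab, alternative = "greater")$p.value
p_hyper  <- phyper(q - 1, m_a, universe - m_a, k_b, lower.tail = FALSE)

# Both are P(X >= q) under the same hypergeometric null.
print(c(fisher = p_fisher, phyper = p_hyper))
```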

Am I doing this correctly? In particular, is the total population calculated correctly? Thanks!

R chisquare statistics • 993 views
