Gene lists from RNA-Seq and ChIP-Seq
10.2 years ago

I have a list of genes that are up- or downregulated after treatment, and another list of genes that are bound by a transcription factor. I want to know whether the fraction of up/downregulated genes that also appear in the bound gene list is statistically significant. I've been told that Fisher's exact test is appropriate here, but I'm not sure how to carry it out.

So, I have 132 upregulated genes and 1557 bound genes. Of those 132 upregulated genes, 24 appear on both lists. I guess another way of asking this is: how much of that overlap would be expected by chance?

I guess this is really a basic statistics question, so I'd love an answer and explanation that didn't include a lot of code. I want to know what to do and why that's appropriate.

RNA-Seq Fishers-exact-test ChIP-Seq • 2.8k views

How many genes are downregulated? Do you want to test whether TF-bound genes are enriched in the upregulated gene set compared with the downregulated gene set?


I'd like to split it up: what are the chances that those 24 upregulated genes happen to appear on the bound list, and that 11 downregulated genes (there are 109 downregulated genes in total) happen to appear on the bound list?

10.2 years ago
komal.rathi ★ 4.1k

If you want to run the test separately for upregulated and downregulated genes, this is how you can do it for the upregulated genes:

Upreg.Genes = 132
Upreg.TF.Bound.Genes = 24
Upreg.TF.UnBound.Genes = 108 # 132 - 24

Total.DiffExpr.Genes = 241 # 132 + 109
Total.DiffExpr.TF.Bound.Genes = 35 # 24 + 11
Total.DiffExpr.TF.UnBound.Genes = 206 # 241 - 35

mat = matrix(c(Upreg.TF.Bound.Genes, Total.DiffExpr.TF.Bound.Genes,
               Upreg.TF.UnBound.Genes, Total.DiffExpr.TF.UnBound.Genes),
             nrow = 2,
             dimnames = list(c("Upreg.Genes", "Total.DiffExpr.Genes"),
                             c("TF.Bound", "TF.Unbound")))

# this is what your matrix will look like:
#
#                      TF.Bound TF.Unbound
# Upreg.Genes                24        108
# Total.DiffExpr.Genes       35        206

ftest = fisher.test(mat, alternative = "greater") # one-sided test: are TF-bound genes enriched in the upregulated set compared to the entire differentially expressed gene set?

pvalue = ftest$p.value # p-value. If it is less than 0.05, TF-bound genes are significantly enriched in your upregulated gene set; otherwise the enrichment is not significant

estimate = ftest$estimate # estimated odds ratio

Similarly, you can do the test for Downregulated genes.
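The table above compares the upregulated set with a total that already contains it. As a possible variant, here is a minimal sketch that compares upregulated against downregulated genes directly, which is the comparison raised in the first comment, so that the two rows are disjoint. The counts are the ones quoted in this thread; the variable names (`up.bound`, `mat.updown`, `ftest.updown`, and so on) are made up for illustration.

```r
# Counts quoted in the thread
up.bound     <- 24
up.unbound   <- 132 - 24   # 108
down.bound   <- 11
down.unbound <- 109 - 11   # 98

# Rows are disjoint gene sets, so every differentially expressed gene
# is counted in exactly one cell (matrix() fills column by column)
mat.updown <- matrix(c(up.bound, down.bound, up.unbound, down.unbound),
                     nrow = 2,
                     dimnames = list(c("Upreg.Genes", "Downreg.Genes"),
                                     c("TF.Bound", "TF.Unbound")))

# One-sided test: are bound genes over-represented among upregulated
# genes relative to downregulated genes?
ftest.updown <- fisher.test(mat.updown, alternative = "greater")

ftest.updown$p.value   # p-value of the one-sided test
ftest.updown$estimate  # conditional estimate of the odds ratio
```

Note that this only asks whether binding is more common among upregulated than among downregulated genes. It does not say whether either set is enriched relative to all genes, which needs a background gene count.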


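You might be looking for a hypergeometric test. This will tell you whether the overlap between upregulated genes and TF-bound genes is significant, or whether it is what you would expect at random. Here totalNumberGenes is the total number of genes considered and TF.Bound.Genes is the number of bound genes (1557).

phyper(Upreg.TF.Bound.Genes - 1, Upreg.Genes, totalNumberGenes - Upreg.Genes, TF.Bound.Genes, lower.tail=FALSE) # P(overlap >= observed); the -1 is needed because lower.tail=FALSE returns P(X > q)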
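To make the "how much would appear by chance" part concrete, here is a rough sketch under one big assumption: the thread never gives the total number of genes, so `totalNumberGenes <- 20000` below is a placeholder you would replace with the number of genes that could have shown up on either list. It computes the overlap expected by chance, the hypergeometric tail probability, and the same p-value from a one-sided Fisher's exact test on the corresponding 2x2 table (the names `expected.overlap`, `p.hyper`, `mat.bg` and `p.fisher` are illustrative).

```r
# ASSUMPTION: placeholder background size, NOT a value from the thread
totalNumberGenes     <- 20000
Upreg.Genes          <- 132
TF.Bound.Genes       <- 1557
Upreg.TF.Bound.Genes <- 24

# Overlap expected if the two lists were unrelated
expected.overlap <- Upreg.Genes * TF.Bound.Genes / totalNumberGenes

# P(overlap >= 24): lower.tail = FALSE returns P(X > q), hence q = 24 - 1
p.hyper <- phyper(Upreg.TF.Bound.Genes - 1,
                  TF.Bound.Genes,
                  totalNumberGenes - TF.Bound.Genes,
                  Upreg.Genes,
                  lower.tail = FALSE)

# The same question as a 2x2 table with disjoint cells
# (matrix() fills column by column)
up.unbound <- Upreg.Genes - Upreg.TF.Bound.Genes
mat.bg <- matrix(c(Upreg.TF.Bound.Genes,
                   TF.Bound.Genes - Upreg.TF.Bound.Genes,
                   up.unbound,
                   totalNumberGenes - TF.Bound.Genes - up.unbound),
                 nrow = 2,
                 dimnames = list(c("Upreg", "Not.Upreg"),
                                 c("TF.Bound", "TF.Unbound")))
p.fisher <- fisher.test(mat.bg, alternative = "greater")$p.value

expected.overlap             # about 10.3 with this placeholder background
all.equal(p.hyper, p.fisher) # the two approaches give the same p-value
```

The one-sided Fisher's exact test and the hypergeometric tail are the same calculation, so which one you use is a matter of convenience. The result does depend on the background you choose, so the placeholder should not be taken as a recommendation.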