Question

Gene lists from RNA-Seq and ChIP-Seq

10.5 years ago

will.schachterle ▴ 30

I have a list of genes that are up or downregulated after treatment, and another list of genes that are bound by a transcription factor. I want to know if the percentage of genes in the list of up/downregulated genes that also appears in the bound gene list is significant. I'm told that a Fisher's exact test is appropriate here, but I'm not sure how to do this.

So, I have 132 genes that are upregulated and 1557 genes that are bound. Of those 132 upregulated genes, 24 appear on both lists. I guess another way of asking this is: how much of that overlap would appear by chance?

I guess this is really a basic statistics question, so I'd love an answer and explanation that didn't include a lot of code. I want to know what to do and why that's appropriate.

RNA-Seq Fishers-exact-test ChIP-Seq • 3.0k views

Updated 3.2 years ago by Ram 45k • written 10.5 years ago by will.schachterle ▴ 30

How many genes are downregulated? Do you want to test whether TF-bound genes are enriched in the upregulated gene set versus the downregulated gene set?

Reply • 10.5 years ago by komal.rathi ★ 4.1k

I'd like to split it up: what are the chances that those 24 upregulated genes happen to appear on the bound list, and that 11 downregulated genes (there are 109 downregulated genes in total) happen to appear on the bound list?

Reply • 10.5 years ago by will.schachterle ▴ 30

Ram · Answer 1 · 2014-09-30

If you want to do the test separately for upregulated and downregulated genes, this is how you can do it for the upregulated genes:

Upreg.Genes = 132
Upreg.TF.Bound.Genes = 24
Upreg.TF.UnBound.Genes = 108

Total.DiffExpr.Genes = 132 + 109 # 241 (upregulated + downregulated)
Total.DiffExpr.TF.Bound.Genes = 24 + 11 # 35
Total.DiffExpr.TF.UnBound.Genes = Total.DiffExpr.Genes - Total.DiffExpr.TF.Bound.Genes # 206

# matrix() fills column by column: first the TF.Bound column, then TF.Unbound
mat = matrix(c(Upreg.TF.Bound.Genes, Total.DiffExpr.TF.Bound.Genes,
               Upreg.TF.UnBound.Genes, Total.DiffExpr.TF.UnBound.Genes),
             nrow = 2,
             dimnames = list(c("Upreg.Genes", "Total.DiffExpr.Genes"),
                             c("TF.Bound", "TF.Unbound")))

# this is what your matrix will look like

                     TF.Bound TF.Unbound
Upreg.Genes                24        108
Total.DiffExpr.Genes       35        206

# one-sided test: are TF-bound genes enriched in the upregulated set compared to the entire differentially expressed gene set?
ftest = fisher.test(mat, alternative = "greater")

pvalue = ftest$p.value # p-value; if it is below 0.05, TF-bound genes are significantly enriched in your upregulated gene set, otherwise the enrichment is not significant

estimate = ftest$estimate # estimate/odds ratio
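One caveat about the table above: its second row (all differentially expressed genes) also contains the upregulated genes from the first row, whereas Fisher's exact test assumes each gene is counted in exactly one cell. A minimal sketch of a variant with non-overlapping rows (upregulated versus downregulated) follows. It uses only the counts given in the thread; the names `mat.disjoint` and `ftest.disjoint` are illustrative and not part of the original answer.

```r
# Counts taken from the thread: 132 up (24 bound), 109 down (11 bound)
up.bound = 24
up.unbound = 132 - up.bound       # 108
down.bound = 11
down.unbound = 109 - down.bound   # 98

# Each differentially expressed gene falls into exactly one cell.
# matrix() fills column by column: TF.Bound first, then TF.Unbound.
mat.disjoint = matrix(c(up.bound, down.bound, up.unbound, down.unbound),
                      nrow = 2,
                      dimnames = list(c("Upreg.Genes", "Downreg.Genes"),
                                      c("TF.Bound", "TF.Unbound")))

# One-sided test: is the bound fraction higher among upregulated genes
# than among downregulated genes?
ftest.disjoint = fisher.test(mat.disjoint, alternative = "greater")

ftest.disjoint$p.value    # one-sided p-value
ftest.disjoint$estimate   # conditional estimate of the odds ratio
```

With this layout, passing `alternative = "less"` to `fisher.test` asks the opposite question, namely whether the bound fraction is higher among the downregulated genes.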

Similarly, you can do the test for Downregulated genes.
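The original question also asks how much overlap would appear by chance. Answering that needs the size of the gene universe (all genes that could have been called bound or differentially expressed), which the thread never states. The sketch below therefore uses a hypothetical universe of 20000 genes purely for illustration; `n.universe` and the other variable names are assumptions, and the result will change with the real background size.

```r
# ASSUMPTION: 20000 is a hypothetical gene-universe size, not a number
# from the thread. Replace it with the number of genes actually tested.
n.universe = 20000
n.bound = 1557     # TF-bound genes (from the question)
n.up = 132         # upregulated genes (from the question)
n.overlap = 24     # upregulated AND bound (from the question)

# Overlap expected if the two lists were unrelated
expected.overlap = n.up * n.bound / n.universe

# 2x2 table in which every gene in the universe is counted exactly once
bg = matrix(c(n.overlap, n.bound - n.overlap,
              n.up - n.overlap, n.universe - n.bound - n.up + n.overlap),
            nrow = 2,
            dimnames = list(c("Upreg", "Not.Upreg"),
                            c("TF.Bound", "TF.Unbound")))

# One-sided Fisher's exact test for over-representation
p.fisher = fisher.test(bg, alternative = "greater")$p.value

# Same probability from the hypergeometric distribution:
# chance of drawing n.overlap or more bound genes in n.up draws
p.hyper = phyper(n.overlap - 1, n.bound, n.universe - n.bound, n.up,
                 lower.tail = FALSE)
```

Under the assumed universe, `expected.overlap` is about 10.3 genes, against the 24 observed. The one-sided Fisher p-value and the hypergeometric tail probability are the same quantity, which is why the test is a natural fit for asking whether two gene lists overlap more than chance would predict.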