How To Establish A Correlation Between Gene Expression Time-Series And Chip-Chip Data For The Same Time Points?
13.7 years ago
Rayna ▴ 280

Hi,

As the title suggests, I'd like to correlate two different types of datasets: gene expression and ChIP-chip. Both are time series from the same organism (E. coli). I don't have a clear idea of how to do this, so any suggestion is warmly welcome!

Thanks a lot.

chip-seq gene microarray correlation • 5.0k views
13.7 years ago

Start with simple things first.

For example, look for correlations between the binding of certain factors and the expression of the genes they regulate. Count all the binding events in a promoter region (x), then compute the average expression of the regulated genes (y). Do this for each time step. Now you have two vectors, x and y, with an equal number of values. What is their functional form? Do they correlate at all?

(there will be a fair amount of data shuffling/filtering involved)

With this you can quickly check that your data works at all and that you do indeed have everything you need. You can then expand from there.
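As a rough illustration of the x/y idea above, here is a minimal Python sketch. The data layout (dictionaries keyed by time point and gene), the gene names, and the function names `binding_vs_expression` and `pearson` are all made up for the example; your real data will need its own parsing and filtering first.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length vectors with at least 2 points")
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return float("nan")  # a constant vector has no defined correlation
    return cov / (sx * sy)


def binding_vs_expression(binding_counts, expression, targets):
    """Return (x, y), one value per time point.

    x[t] = total binding events in the promoters of the target genes
    y[t] = mean expression of the same target genes
    """
    timepoints = sorted(binding_counts)
    x = [sum(binding_counts[t].get(g, 0) for g in targets) for t in timepoints]
    y = [sum(expression[t][g] for g in targets) / len(targets) for t in timepoints]
    return x, y


# Hypothetical toy data: time point -> gene -> value
binding_counts = {
    0: {"geneA": 1, "geneB": 0},
    10: {"geneA": 2, "geneB": 1},
    20: {"geneA": 3, "geneB": 3},
    30: {"geneA": 4, "geneB": 4},
}
expression = {
    0: {"geneA": 1.0, "geneB": 2.0},
    10: {"geneA": 2.0, "geneB": 3.0},
    20: {"geneA": 4.0, "geneB": 4.0},
    30: {"geneA": 5.0, "geneB": 5.0},
}

x, y = binding_vs_expression(binding_counts, expression, ["geneA", "geneB"])
r = pearson(x, y)
print(x, y, r)
```

Pearson is only one choice here. With a handful of time points a rank correlation may be more robust, and plotting x against y is probably the quickest way to see the functional form.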

13.7 years ago
Ian 6.1k

I have always found that the biggest problem in comparing ChIP-chip/seq binding regions with expression data is the use of gene symbols, i.e. whether the genes associated with binding regions are also represented in the gene expression data (and vice versa).

So you could simply try intersecting the ChIP-chip binding region coordinates, extended on both sides (+/-) by a threshold of your choice (say 50 or 100 kb), with the probeset coordinates from the gene expression data.
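A minimal sketch of that coordinate intersection in Python, assuming a single chromosome (as in E. coli), closed intervals, and no wrap-around at the origin of the circular genome. The function names, probeset names, and coordinates are hypothetical, and `flank` is the threshold you would choose yourself.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if closed intervals [a_start, a_end] and [b_start, b_end] share a position."""
    return a_start <= b_end and b_start <= a_end


def probesets_near_regions(regions, probesets, flank):
    """Map each binding region index to the probesets within `flank` bp of it.

    regions   : list of (start, end) binding region coordinates
    probesets : dict of probeset name -> (start, end)
    flank     : number of bp added on both sides of each region
    """
    hits = {}
    for i, (r_start, r_end) in enumerate(regions):
        lo = max(0, r_start - flank)  # do not run off the start of the sequence
        hi = r_end + flank
        hits[i] = sorted(
            name
            for name, (p_start, p_end) in probesets.items()
            if overlaps(lo, hi, p_start, p_end)
        )
    return hits


# Hypothetical toy coordinates
regions = [(1000, 1200), (5000, 5100)]
probesets = {"p1": (1500, 1600), "p2": (3000, 3100), "p3": (4400, 4600)}

hits = probesets_near_regions(regions, probesets, flank=500)
print(hits)
```

This is a simple all-against-all scan, which is fine for a few thousand regions and probesets. For larger sets a dedicated interval tool would be a better fit.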

For a more in-depth and modelled approach I would follow Casey's suggestion.


Ian brings up an important point that is necessary for the analysis you are interested in doing: how to assign ChIP hits to target genes. See: How To Assign A Chip-Chip/Chip-Seq Peak To A Target Gene?


Thanks a lot for your ideas and very useful links! I'll check these approaches and give you some feedback :)

Regarding the peak definition, I've come up with a way to do it. I'll answer in the discussion pointed out by Casey.

13.7 years ago

Rattray, Lawrence, and Sanguinetti have been doing interesting work in this area. You may need to look around to see which of the various methods they have developed suit your needs, but you can try TFinfer for a start, since it is designed for E. coli.
