Confused With Peaks And Motifs
11.2 years ago
ChIP ▴ 600

Hi!

I am really confused and need some expert advice, with reasoning, from people who work in NGS data analysis.

Ok, the PROBLEM is:

I performed peak calling using MACS and got nice peaks (the antibody targets an ETS factor). I then annotated each peak to the single nearest gene within a 25 kb window. This was followed by de novo motif discovery, and I also scanned these regions for motifs of interest using PWMs. De novo discovery gives me a motif for the ETS factor, and this motif also pops up when scanning the regions with the PWM (it is present in 85% of peaks). Lastly, if I look at the annotated genes and query for the genes having this motif, I get a list of 220 genes.
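For readers who want to reproduce the annotation step, here is a minimal sketch of "single nearest gene within 25 kb". It assumes peak summits and transcription start sites (TSSs) are already loaded into plain Python structures; the function name `nearest_gene`, the `tss_by_chrom` layout, and the toy coordinates are hypothetical and not taken from MACS or any annotation package.

```python
from bisect import bisect_left

def nearest_gene(peak_chrom, peak_summit, tss_by_chrom, window=25_000):
    """Return (gene, distance) for the closest TSS within `window` bp, else None.

    tss_by_chrom maps chromosome -> list of (tss_position, gene_name),
    sorted by position (hypothetical layout for this sketch).
    """
    tss_list = tss_by_chrom.get(peak_chrom, [])
    if not tss_list:
        return None
    # Index of the first TSS at or after the summit.
    i = bisect_left(tss_list, (peak_summit, ""))
    # Only the TSS just before and just after the summit can be the nearest.
    candidates = tss_list[max(i - 1, 0): i + 1]
    pos, gene = min(candidates, key=lambda t: abs(t[0] - peak_summit))
    dist = abs(pos - peak_summit)
    return (gene, dist) if dist <= window else None

# Toy data, purely illustrative.
tss = {"chr1": [(1_000, "A"), (50_000, "B"), (120_000, "C")]}
hit = nearest_gene("chr1", 30_000, tss)    # closest TSS is B, 20 kb away
miss = nearest_gene("chr1", 90_000, tss)   # closest TSS is 30 kb away, outside the window
```

Peaks that return `None` have no gene within the window and would be dropped from the gene list.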

My QUESTION:

Are these 220 genes definitely being regulated, directly or indirectly, by these peaks compared to other genes, and might they be more interesting than the others? If yes, how would you back it up from a bioinformatic and a biological angle? What more would you do in this scenario?

Thank you

chip-seq • 3.8k views
11.2 years ago

No, you cannot be sure that the genes are being regulated by the TFs that gave rise to these peaks. Binding does not equal functional binding. However, it could still be significant. You need some sort of background set to compare to, as ETS-like motifs are typically quite common, especially in promoter-proximal regions. It depends a bit on how you did the de novo motif discovery, I suppose. If you did that using a suitable background set, the results would be more reliable. Then you could claim overrepresentation of this motif compared to that background (perhaps a set of regions with a distance distribution to the closest gene similar to yours, but near different genes).
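One way to put a number on "overrepresentation compared to a background set" is a one-sided Fisher's exact (hypergeometric) test on motif counts. The sketch below uses only the standard library. The 85 out of 100 peaks mirrors the 85% mentioned in the question, but the background count of 40 out of 100 is made up for illustration, and the function name `enrichment_pvalue` is hypothetical.

```python
from math import comb

def enrichment_pvalue(peaks_with, peaks_total, bg_with, bg_total):
    """One-sided Fisher's exact test: P(at least `peaks_with` motif hits in peaks).

    Under the null, motif-containing regions are distributed at random
    between the peak set and the background set.
    """
    N = peaks_total + bg_total      # all regions
    K = peaks_with + bg_with        # all regions containing the motif
    n = peaks_total                 # regions that are peaks
    upper = min(K, n)
    tail = sum(comb(K, x) * comb(N - K, n - x)
               for x in range(peaks_with, upper + 1))
    return tail / comb(N, n)

# Illustrative counts only: 85/100 peaks vs. an assumed 40/100 background regions.
p_enriched = enrichment_pvalue(85, 100, 40, 100)
# Same motif frequency in both sets: no evidence of enrichment.
p_null = enrichment_pvalue(40, 100, 40, 100)
```

The test is only as good as the background: if the background regions are not matched for distance to the nearest gene (and ideally GC content), a small p-value may just reflect promoter proximity.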

Background sets were used in both cases of motif discovery (de novo as well as PWM scanning). Please give your take now.

As Ian wrote in a different answer, it would be helpful if you had access to relevant expression data, microarray or RNA-seq. There is a tool, Rcade (http://www.bioconductor.org/packages/2.11/bioc/html/Rcade.html), that attempts to connect ChIP peaks to gene expression using a probabilistic model. (Of course, it is also possible to do more straightforward versions of this analysis yourself.) You could use a tool like GREAT, mentioned by Ido, or the ChIP-seq significance tool (http://encodeqt.stanford.edu/hyper/) to check for correlations with ENCODE data. Perhaps you could also search for other motifs that are colocated with your ETS peaks; these could be binding sites for potential interacting TFs.
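As a rough idea of what a "more straightforward version" could look like (this is not how Rcade works), one could compare expression changes of the motif-containing target genes against the remaining genes with a permutation test. The sketch assumes you already have one log2 fold change per gene from a relevant experiment, such as a knockdown of the ETS factor; the function name and the values below are invented for illustration.

```python
import random

def permutation_test(target_vals, other_vals, n_perm=2_000, seed=0):
    """Test whether targets have a higher mean value than other genes.

    Returns (observed_difference, one_sided_p). Group labels are shuffled
    n_perm times; the p-value uses the usual +1 correction.
    """
    rng = random.Random(seed)
    k = len(target_vals)
    m = len(other_vals)
    observed = sum(target_vals) / k - sum(other_vals) / m
    pooled = list(target_vals) + list(other_vals)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = sum(pooled[:k]) / k - sum(pooled[k:]) / m
        if diff >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)

# Invented log2 fold changes, for illustration only.
targets = [2.1, 1.8, 2.5, 1.9, 2.2]          # genes with a motif-containing peak
others = [0.1, 0.3, -0.2, 0.0, 0.4, 0.2]     # remaining genes
obs_diff, p_value = permutation_test(targets, others)
```

A significant shift would support, but not prove, that the bound genes respond to the factor; indirect effects look the same in this kind of comparison.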

11.2 years ago
Ido Tamir 5.2k

A starting point for hypothesis generation about common function of the genes could be the GREAT tool http://bejerano.stanford.edu/great/public/html/.

11.2 years ago
Ian 6.1k

If you had accompanying microarray or RNA-seq expression data, you could highlight genes that are potentially regulated.

You may find this Biostar link helpful in this regard: How to assign a ChIP-chip/ChIP-seq peak to a target gene?
