Hello
How can I determine genes regulated by a protein?
I tried intersecting the peaks (from MACS) with gene coordinates, but this gives me several hundred genes, while I expected only a few.
This ends up not being a bioinformatics question, but I'm guessing you don't know that (otherwise, you wouldn't have asked). Presuming that this is a transcription factor, you need to over- or under-express the protein (better yet, do both to avoid ceiling/floor effects) and then do RNA-seq. Your genes of interest are only those differentially expressed (DE) genes with peaks in their promoters, as any other DE genes are likely due to secondary effects.
Edit: Of course, if someone has already done the aforementioned experiment and released the dataset then just download and analyze it.
Edit2: You can also whittle down the candidate list by only looking at genes meaningfully expressed in your tissue of interest. If you're lucky, perhaps that will limit the list to the size you're expecting (I'll not bother to question why you have a certain target number in mind).
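As a rough illustration of the filtering described above, here is a minimal Python sketch. It assumes you already have gene coordinates, MACS peak coordinates, a set of DE gene names from the RNA-seq experiment, and a set of genes expressed in your tissue. The promoter window (1 kb upstream, 200 bp downstream of the TSS), the function names, and the toy data are all assumptions made for the example, not fixed conventions. For real data sets, an interval tool such as bedtools would be a more practical choice than this pairwise loop.

```python
from collections import namedtuple

Gene = namedtuple("Gene", "name chrom start end strand")
Peak = namedtuple("Peak", "chrom start end")


def promoter(gene, upstream=1000, downstream=200):
    """Return (chrom, start, end) of a strand-aware window around the TSS.

    The window sizes are illustrative assumptions; adjust them to your organism.
    """
    if gene.strand == "+":
        tss = gene.start
        lo, hi = tss - upstream, tss + downstream
    else:
        # On the minus strand the TSS is at the gene's end coordinate.
        tss = gene.end
        lo, hi = tss - downstream, tss + upstream
    return gene.chrom, max(0, lo), hi


def overlaps(a_start, a_end, b_start, b_end):
    """Half-open interval overlap test."""
    return a_start < b_end and b_start < a_end


def candidate_targets(genes, peaks, de_genes, expressed_genes):
    """Keep genes that are DE, expressed, and have a peak in their promoter."""
    targets = []
    for g in genes:
        if g.name not in de_genes or g.name not in expressed_genes:
            continue
        chrom, lo, hi = promoter(g)
        if any(p.chrom == chrom and overlaps(p.start, p.end, lo, hi) for p in peaks):
            targets.append(g.name)
    return targets


# Toy data (hypothetical): geneB has a peak in its gene body only,
# geneC has a promoter peak but is not DE.
genes = [
    Gene("geneA", "chr1", 5000, 9000, "+"),
    Gene("geneB", "chr1", 20000, 26000, "-"),
    Gene("geneC", "chr2", 1000, 4000, "+"),
]
peaks = [Peak("chr1", 4800, 5100), Peak("chr1", 22000, 22300), Peak("chr2", 900, 1100)]
de_genes = {"geneA", "geneB"}
expressed_genes = {"geneA", "geneB", "geneC"}

print(candidate_targets(genes, peaks, de_genes, expressed_genes))
```

Note how a plain peak/gene intersection would return all three toy genes, while requiring a promoter peak plus differential expression leaves only one.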
Binding does not imply functionality.
Several papers have shown that changes in nearby TF binding correlate poorly with changes in gene expression:
On average, 14.7% of genes bound by a factor were differentially expressed following the knockdown of that factor, suggesting that most interactions between TF and chromatin do not result in measurable changes in gene expression levels of putative target genes.
To assign functional binding sites to their target genes, you may consider using BETA, developed in Shirley Liu's lab. It integrates ChIP-seq data and gene expression data to infer the TF target genes.
You may also be interested in this paper:
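To give a feel for how binding and expression data can be combined, here is a small Python sketch of a distance-weighted binding score per gene, where peaks closer to the TSS contribute more. This is only an illustration of the general idea and not BETA's implementation: the function name, the 100 kb window, and the decay constants are assumptions chosen for the example.

```python
import math


def regulatory_score(tss, peak_centers, max_dist=100_000):
    """Sum an exponentially decaying weight over peaks within max_dist of the TSS.

    The constants 0.5 and 4 and the 100 kb window are illustrative assumptions.
    """
    score = 0.0
    for center in peak_centers:
        dist = abs(center - tss)
        if dist <= max_dist:
            delta = dist / max_dist  # distance scaled to the range 0..1
            score += math.exp(-(0.5 + 4 * delta))
    return score


# Hypothetical example: one peak near the TSS, one far away, one out of range.
near = regulatory_score(50_000, [50_500])
far = regulatory_score(50_000, [140_000])
out_of_range = regulatory_score(50_000, [500_000])
print(near > far > out_of_range)
```

Genes could then be ranked by this score and by their differential expression, with the top of both rankings giving the most plausible direct targets.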
Thank you very much! This is a very helpful answer. As you guessed, I didn't know that this is not a bioinformatics question :) I will contact biologists.
Taken from a slightly different angle, I think this is very much a bioinformatics question. Consider re-evaluating the question as: Given a MACS data set, what can be done bioinformatically to assess a gene-regulatory relationship? Devon has already suggested various answers involving bioinformatics: (1) find and download a related experiment (bioinformatics), (2) analyze it (potentially lots and lots of bioinformatics), (3) filter your current data set based on some criteria (bioinformatics), (4) search for motifs (bioinformatics), etc., etc. As for myself, I'm exceedingly curious as to why you expect "only a few" sites given that many gene regulatory proteins have hundreds and hundreds of binding sites. What organism? What conditions? What kind of protein?