4.0 years ago
Pappu
I am wondering what would be the ideal way to carry out motif enrichment analysis on ChIP-seq data, in order to see whether the called peaks are significantly enriched for a TF motif compared to other regions in the genome.
Without more context it is difficult to answer in detail, but many people use HOMER for motif finding in ChIP-seq data.
In other words, how can one check if the TF is binding to the right places in the genome based on a known TF binding motif from TRANSFAC?
That is not quite how it works: the binding, not the motif, is the ground truth. But I see your point. I would simply run HOMER (findMotifs.pl) on your peaks; the canonical motif should come out as highly enriched. If you really want to know whether the peaks are "correct" in terms of specificity, you would need to perform ChIP with the same antibody on a similar cell type that does not express the TF or has it knocked out.

So without using HOMER, you could simply use bedtools getfasta to obtain the genomic sequence around your ChIP-seq peaks, grep for your known motif, and see how many of your peaks contain it.
Thanks. I was thinking about MEME-ChIP.
That is also possible. People use either HOMER or MEME, much like they use either bwa or bowtie2; the two are similar.
I got that. How do you derive a p-value from that if, say, 30% of the sequences have that motif?
Depends. If the TF is known to bind promoters, then you could run a Fisher's exact test:
Number of promoters...
then in R:
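
For readers without R at hand, the same kind of test can be sketched in Python using only the standard library. This is a minimal one-sided Fisher's exact test on a 2x2 table, assuming you compare motif-containing peaks against motif-containing background regions; the function name and all counts below are hypothetical (300 of 1000 peaks mirrors the "30%" in the question, the background numbers are invented), and they are not the promoter counts from the reply above.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table

                    motif   no motif
        peaks         a        b
        background    c        d

    Returns P(X >= a) under the hypergeometric null, i.e. the probability
    of seeing at least `a` motif-containing peaks by chance.
    """
    n_peaks = a + b              # row total: peaks
    n_motif = a + c              # column total: sequences with the motif
    n_total = a + b + c + d
    upper = min(n_peaks, n_motif)
    # Sum exact integer counts first, then divide once to limit rounding error.
    favourable = sum(
        comb(n_motif, k) * comb(n_total - n_motif, n_peaks - k)
        for k in range(a, upper + 1)
    )
    return favourable / comb(n_total, n_peaks)


# Hypothetical counts: 300 of 1000 peaks vs 100 of 1000 background regions.
p_value = fisher_exact_greater(300, 700, 100, 900)
print(f"one-sided p-value: {p_value:.3g}")
```

The choice of background matters more than the test itself: regions matched to the peaks in length and GC content usually give a fairer comparison than random genomic sequence.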