In many ChIP-seq transcription factor binding sites (from ENCODE data), I found both the forward and the reverse motif within a single binding site.
For example, for the GATA1 transcription factor, ENCODE identified the following 100 bp sequence as a positive binding site. In the CIS-BP database, the forward-strand motif is (roughly) GATA and the reverse-strand motif is TATC. The sequence below is the forward strand, and it contains both GATA and TATC.
GGGGTCGGGGGAGGTGCAATCTCCATTCTTGTGAAGGCCAGATAAAGACTTCTGCTGCAAGCCACTATCTTTTTGTGCTGAAGCTGTAGATAACACAGTT
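To make the observation concrete, here is a minimal sketch that scans the sequence above for the motif and for its reverse complement. It assumes the simplified consensus GATA rather than the full CIS-BP position weight matrix, and the helper names (`reverse_complement`, `find_all`, `scan_both_strands`) are made up for this illustration. Hits on the "-" strand are reported in forward-strand coordinates.

```python
# The 100 bp GATA1 site quoted in the question (forward strand).
SEQ = "GGGGTCGGGGGAGGTGCAATCTCCATTCTTGTGAAGGCCAGATAAAGACTTCTGCTGCAAGCCACTATCTTTTTGTGCTGAAGCTGTAGATAACACAGTT"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_all(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    hits = []
    start = seq.find(motif)
    while start != -1:
        hits.append(start)
        start = seq.find(motif, start + 1)
    return hits


def scan_both_strands(seq, motif):
    """Look for the motif on the forward strand and, via its reverse
    complement, on the reverse strand."""
    return {
        "+": find_all(seq, motif),
        "-": find_all(seq, reverse_complement(motif)),
    }


hits = scan_both_strands(SEQ, "GATA")
print("reverse complement of GATA:", reverse_complement("GATA"))
print("forward-strand hits:", hits["+"])
print("reverse-strand hits:", hits["-"])
```

An exact string match is only a rough stand-in for a real motif scan; a proper analysis would score every position against the position weight matrix on both strands.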
I have noticed this in many positive sequences for many different TFs. In addition, across all of the positive sites for many TFs, I found roughly equal numbers of forward and reverse motifs. Does anyone have any insight into why that might be?
Thanks
More generally, some TFs have palindromic binding sites, so the factor can bind on either strand, perhaps located near some regulatory element of the target gene.
Could it be that the zinc fingers of the GATA-1 factor bind to opposite strands of its promoter? http://www.ncbi.nlm.nih.gov/pmc/articles/PMC231211/pdf/162238.pdf
Many TFs bind DNA as homodimers, which in general explains this observation. However, this requires the binding motif to be palindromic. That is not the case for your example here, but you could check the others.
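A quick way to check the others: a motif is palindromic in the DNA sense when it equals its own reverse complement. The sketch below assumes a plain consensus string rather than a position weight matrix, and `is_palindromic` is a hypothetical helper name. GAATTC is included only as a well-known palindromic sequence for contrast; it is not a TF motif from this thread.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(motif):
    """Return the reverse complement of an uppercase DNA string."""
    return motif.translate(_COMPLEMENT)[::-1]


def is_palindromic(motif):
    """True if the motif reads the same on both strands (5' to 3')."""
    return motif == reverse_complement(motif)


for motif in ("GATA", "GAATTC"):
    label = "palindromic" if is_palindromic(motif) else "not palindromic"
    print(motif, "->", reverse_complement(motif), ":", label)
```

For real motifs with degenerate positions, an exact string comparison is too strict; comparing the weight matrix against its reverse complement would be the more faithful test.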