Hi everyone,
I'm working on DNA motif discovery, especially for TF binding sites (mostly those of sigma factors, which recognize a two-part motif: the -10 and -35 sites).
In our case the sigma factors have known protein sequences, with known conserved domains etc.
I was wondering if there is some way - or some software - to use this protein information to reinforce the motif search, or what sort of knowledge would be necessary to do something like this. I can imagine combining a database of interacting amino acid - nucleotide pairs, a 3D structure / model of the protein (to extract the interacting amino acids) and a number of nucleotide sequences (to extract potential motif sites), and that this could improve motif discovery / binding site prediction.
Taking the sigma factors into account, one could then cluster the motifs found in promoter sequences based on how well they fit each protein.
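To make the clustering idea concrete, a rough sketch could look like this (Python, toy data). Each sigma factor is represented by a pair of log-odds weight matrices for the -35 and -10 sites, and each promoter is assigned to the factor whose dual motif scores best. Everything here is an assumption for illustration: the function names, the example sites, the hypothetical "sigmaX" factor and the 15-19 bp spacer window. The part I am really asking about, deriving or weighting the matrices from the protein structure, is not covered; the matrices below are simply built from a few aligned example sites.

```python
import math

BASES = "ACGT"

def build_log_odds(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds weight matrix from aligned, equal-length sites."""
    matrix = []
    for i in range(len(sites[0])):
        counts = {b: pseudocount for b in BASES}
        for site in sites:
            counts[site[i]] += 1
        total = sum(counts.values())
        matrix.append({b: math.log2(counts[b] / total / background) for b in BASES})
    return matrix

def window_score(seq, start, matrix):
    """Score the window of seq that begins at start against one matrix."""
    return sum(col[seq[start + j]] for j, col in enumerate(matrix))

def dual_motif_score(seq, m35, m10, spacer=(15, 19)):
    """Best combined -35 / -10 score over all positions and allowed spacers.

    Returns (score, start_of_-35, start_of_-10); score is -inf if seq is too short.
    """
    best = (float("-inf"), -1, -1)
    for i in range(len(seq) - len(m35) + 1):
        s35 = window_score(seq, i, m35)
        for gap in range(spacer[0], spacer[1] + 1):
            k = i + len(m35) + gap
            if k + len(m10) > len(seq):
                break
            total = s35 + window_score(seq, k, m10)
            if total > best[0]:
                best = (total, i, k)
    return best

def assign_promoters(promoters, models):
    """Map each promoter name to the model (sigma factor) with the best score."""
    assignment = {}
    for name, seq in promoters.items():
        scores = {f: dual_motif_score(seq, m35, m10)[0]
                  for f, (m35, m10) in models.items()}
        assignment[name] = max(scores, key=scores.get)
    return assignment

# Toy models: the sites are illustrative, "sigmaX" is a made-up second factor.
TOY_MODELS = {
    "sigma70": (build_log_odds(["TTGACA", "TTGACA", "TTGACT", "TTTACA"]),
                build_log_odds(["TATAAT", "TATAAT", "TATACT", "TAAAAT"])),
    "sigmaX": (build_log_odds(["GGAACT", "GGAACT", "GGAACC", "TGAACT"]),
               build_log_odds(["GTCTAA", "GTCTAA", "GTCTGA", "ATCTAA"])),
}

if __name__ == "__main__":
    promoters = {
        "p1": "ACGT" + "TTGACA" + "C" * 17 + "TATAAT" + "GG",
        "p2": "GG" + "GGAACT" + "A" * 16 + "GTCTAA" + "CC",
    }
    print(assign_promoters(promoters, TOY_MODELS))
```

A protein-derived term (for example a score from predicted amino acid - base contacts) could in principle be added to the combined score in dual_motif_score, but how to compute such a term is exactly the open question.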
Does anyone know if something like this has been done anywhere? What approach would you use?
Best regards, Marek