Hello everyone,
First, thank you for any help. I am not a bioinformatician, so please pardon my amateur question.
I read a paper (cited below) that identified a ~28 nt motif whose sequence varies but which contains a conserved 5'-GTGG-3' core and an adjacent 5'-T-A-3' region. The motif is interesting because it forms a stem-loop that causes rapid depurination of the G residue.
I have a set of ~1,100 related genes (the file is 1.2 MB), and I want to test whether this motif is enriched in this gene family compared to the rest of the genome.
So far I've explored a couple of options, including the MEME Suite, but my data set is too large for it. Does anyone have any other ideas? This doesn't seem to be a common analysis.
I should add that several programs I have found can do a differential analysis for known motifs (e.g. GO, PANTHER), but I have been unable to find this motif in their classification systems of known motifs. So I think I will need to work directly from a .fa or .txt file of the sequences.
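To make the question concrete, here is a rough Python sketch of the kind of comparison I have in mind: count how many sequences in the family contain the motif, do the same for a background set, and compare the two proportions with a one-sided Fisher's exact test. Everything in it is an assumption on my part. In particular, the pattern below is only a placeholder built from the conserved GTGG core; on its own it is far too short to be meaningful, and the real pattern would have to encode the full ~28 nt motif from the paper. It also searches the given strand only.

```python
import re
from math import comb

# Placeholder pattern (assumption): only the conserved GTGG core.
# Replace with a regex describing the full motif from the paper.
MOTIF = re.compile(r"GTGG")


def read_fasta(path):
    """Yield (header, sequence) pairs from a FASTA file."""
    header, chunks = None, []
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def count_hits(seqs, pattern=MOTIF):
    """Return (number of sequences with at least one match, total sequences)."""
    hits = total = 0
    for seq in seqs:
        total += 1
        if pattern.search(seq):
            hits += 1
    return hits, total


def fisher_enrichment_p(fam_hits, fam_total, bg_hits, bg_total):
    """One-sided Fisher's exact p-value: P(X >= fam_hits), hypergeometric."""
    n_all = fam_total + bg_total   # all sequences
    k_all = fam_hits + bg_hits     # all sequences containing the motif
    upper = min(k_all, fam_total)
    tail = sum(
        comb(k_all, k) * comb(n_all - k_all, fam_total - k)
        for k in range(fam_hits, upper + 1)
    )
    return tail / comb(n_all, fam_total)
```

With two hypothetical files, family.fa for the ~1,100 genes and background.fa for the rest of the genome, the idea would be to call count_hits on the sequences from read_fasta for each file and pass the four counts to fisher_enrichment_p. I am not sure whether per-gene presence/absence is the right statistic, or whether sequence length and base composition need to be corrected for.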
Thanks!