2.6 years ago
ilovesuperheroes1993
I would like to run a transcription factor (TF) enrichment analysis using SEA from the MEME Suite. I have around 2000 query sequences in which I would like to test for motif enrichment. I have two questions regarding this process:
- Should I use shuffled query sequences as the control, or is it better to provide random DNA sequences? To select random sequences, I generated random genomic coordinates with lengths identical to those of my query sequences. Is this method correct?
- As for the TF motifs, I have used the mononucleotide models (full) from the HOCOMOCO database. I wanted to use the dinucleotide models, which are supposed to be more accurate, but only the mononucleotide models are available in MEME format.
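For reference, the random-coordinate approach from the first question can be sketched roughly as below. This is only an illustration under assumptions: the function name `sample_matched_intervals` and the toy chromosome sizes are hypothetical, coordinates are 0-based half-open (BED-style), and the sketch does not account for assembly gaps or sequence composition such as GC content.

```python
import random


def sample_matched_intervals(lengths, chrom_sizes, seed=0):
    """Draw one random genomic interval per query length.

    lengths     -- list of query sequence lengths (bp)
    chrom_sizes -- dict mapping chromosome name to its size (bp)
    Returns a list of (chrom, start, end) tuples, 0-based half-open.
    """
    rng = random.Random(seed)
    intervals = []
    for length in lengths:
        # Keep only chromosomes that can hold the interval, weighting each
        # by its number of valid start positions so that every possible
        # placement in the genome is equally likely.
        names, weights = [], []
        for chrom, size in chrom_sizes.items():
            if size >= length:
                names.append(chrom)
                weights.append(size - length + 1)
        if not names:
            raise ValueError(f"no chromosome can hold an interval of {length} bp")
        chrom = rng.choices(names, weights=weights, k=1)[0]
        start = rng.randrange(chrom_sizes[chrom] - length + 1)
        intervals.append((chrom, start, start + length))
    return intervals


if __name__ == "__main__":
    # Hypothetical toy genome and query lengths, for illustration only.
    toy_sizes = {"chr1": 1000, "chr2": 500}
    for chrom, start, end in sample_matched_intervals([100, 200, 50], toy_sizes):
        print(chrom, start, end, sep="\t")
```

The output lines are tab-separated and could be saved as a BED file and converted to FASTA with whatever tool is already used for the query peaks.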
Any inputs would be appreciated. Thanks
What process produced the 2000 sequences in the first place?
The 2000 query sequences are ChIP-seq peaks. I didn't mention this in my original post because it did not seem relevant to what I was asking.
with more info about the upstream experiment and analysis that generated these peaks one might suggest that you