Hello,
I would like to predict transcription factor binding sites in the 1000 bp upstream sequences of ~20,000 genes. I want to run this search using the matrices obtained from JASPAR (the Archive.zip file).
For example:
>MA0001.1 AGL3
A [ 0 3 79 40 66 48 65 11 65 0 ]
C [94 75 4 3 1 2 5 2 3 3 ]
G [ 1 0 3 4 1 0 5 3 28 88 ]
T [ 2 19 11 50 29 47 22 81 1 6 ]
>MA0003.1 TFAP2A
A [ 0 0 0 22 19 55 53 19 9 ]
C [ 0 185 185 71 57 44 30 16 78 ]
G [185 0 0 46 61 67 91 137 79 ]
T [ 0 0 0 46 48 19 11 13 19 ]
>MA0004.1 Arnt
A [ 4 19 0 0 0 0 ]
C [16 0 20 0 0 0 ]
G [ 0 1 0 20 0 20 ]
T [ 0 0 0 0 20 0 ]
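To make clear what kind of scan I mean, here is a minimal sketch in plain Python (no third-party packages). It reads matrices in the format above, converts the counts to log-odds scores, and slides each matrix along a sequence. The function names (parse_jaspar, log_odds, scan), the pseudocount of 1, the uniform 0.25 background, and the cutoff of 80% of the maximum score are all my own assumptions for illustration, not anything defined by JASPAR. It only scans the forward strand.

```python
import math
import re

def parse_jaspar(text):
    """Parse JASPAR-style text into {id: {"name": str, "counts": {base: [int]}}}."""
    motifs = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            parts = line[1:].split(None, 1)
            current = {"name": parts[1] if len(parts) > 1 else "", "counts": {}}
            motifs[parts[0]] = current
        elif current is not None and line[0] in "ACGT":
            # Everything after the base letter is a bracketed list of counts.
            current["counts"][line[0]] = [int(n) for n in re.findall(r"\d+", line[1:])]
    return motifs

def log_odds(counts, pseudocount=1.0, background=0.25):
    """Turn a count matrix into per-position log2-odds scores (assumed scheme)."""
    width = len(counts["A"])
    pwm = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + pseudocount
        column = {}
        for b in "ACGT":
            freq = (counts[b][i] + pseudocount * background) / total
            column[b] = math.log(freq / background, 2)
        pwm.append(column)
    return pwm

def max_score(pwm):
    """Best score any sequence window could reach for this matrix."""
    return sum(max(col.values()) for col in pwm)

def scan(seq, pwm, threshold):
    """Return (start, window, score) for forward-strand windows scoring >= threshold."""
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(b not in "ACGT" for b in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(col[b] for col, b in zip(pwm, window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

EXAMPLE = """>MA0004.1 Arnt
A [ 4 19 0 0 0 0 ]
C [16 0 20 0 0 0 ]
G [ 0 1 0 20 0 20 ]
T [ 0 0 0 0 20 0 ]
"""

motifs = parse_jaspar(EXAMPLE)
pwm = log_odds(motifs["MA0004.1"]["counts"])
# Hypothetical upstream fragment, used only to exercise the functions.
for start, window, score in scan("TTTCACGTGAAA", pwm, 0.8 * max_score(pwm)):
    print(motifs["MA0004.1"]["name"], start, window, round(score, 2))
```

For the real job I would need to loop this over every matrix and all ~20,000 sequences (and the reverse strand), which is why I am looking for an existing tool rather than writing my own.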
Thanks for the information at this link: http://biostar.stackexchange.com/questions/6436/transcription-factor-binding-site-prediction
However, I still have not found a suitable tool to scan this set of sequences, because I am working on a PC running Windows XP.
Can you recommend any tools, or other methods, for this task?
Thanks!
I found Cluster-Buster, a third-generation program for finding clusters of pre-specified motifs in nucleotide sequences. However, it may not give me exactly the output I want, because it reports clusters of motifs.