Transcription factor binding site prediction from a FASTA sequence
2.3 years ago
newbio • 0

Hi all, I have a 2 kb sequence, which is the promoter region of a gene. I would like to pinpoint SMAD3 transcription factor binding sites in that sequence. How can I do that, and is it possible with JASPAR?

Thanks

transcription site jaspar • 1.0k views

Hi, you can use Python to find all the SMAD3 binding sites in your query sequence. For example, if the SMAD3 binding site is 5'-GTCTAGAC-3', you can use the following quick-and-dirty Python code to list every exact match:

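#!/usr/bin/env python3
from re import finditer

# Paste the promoter sequence here (bases only, no FASTA header line).
query = "your 2 kb DNA sequence".upper()
smad3_binding_site = "GTCTAGAC"

# Print the 0-based (start, end) span and matched text of each non-overlapping exact match.
for match in finditer(smad3_binding_site, query):
    print(match.span(), match.group())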

Hope it helps.
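
If the site you search for is not palindromic, or hits can overlap, the plain finditer loop will miss some of them. Here is a slightly more general sketch of the same exact-matching idea. The helper names (reverse_complement, find_sites) are made up for illustration, and it assumes an unambiguous A/C/G/T site string:

```python
import re

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]

def find_sites(query, site):
    """Return sorted (start, end, strand, matched_text) tuples for every exact hit.

    Coordinates are 0-based on the forward strand. A '-' hit means the
    reverse complement of the site was found in the query.
    """
    query = query.upper()
    site = site.upper()
    patterns = {"+": site}
    rc = reverse_complement(site)
    if rc != site:  # palindromic sites look identical on both strands
        patterns["-"] = rc
    hits = []
    for strand, pattern in patterns.items():
        # The lookahead (?=...) consumes no characters, so overlapping hits are reported.
        for m in re.finditer(f"(?=({re.escape(pattern)}))", query):
            start = m.start(1)
            hits.append((start, start + len(pattern), strand, m.group(1)))
    return sorted(hits)

if __name__ == "__main__":
    for hit in find_sites("your 2 kb DNA sequence", "GTCTAGAC"):
        print(hit)
```

GTCTAGAC is its own reverse complement, so for that particular site only the forward pattern is searched.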

2.3 years ago

JASPAR lets you export motifs in MEME format, which you can use with FIMO from the MEME Suite to look for occurrences of a specific motif in your sequence(s). Unlike exact string matching, FIMO takes into account the probability of each base occurring at each position of the motif, which matters because chromatin-binding proteins can be promiscuous in the sites they bind.
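
To illustrate how per-position scoring differs from exact matching, here is a minimal sketch of position weight matrix scanning. It is not FIMO itself: FIMO also converts scores to p-values and scans both strands, which this does not. The count matrix below is invented for illustration and is not the JASPAR SMAD3 profile. The pseudocount, the uniform background, and the threshold are likewise assumptions.

```python
import math

# Hypothetical 4-position count matrix, one list per base. NOT a real SMAD3 profile.
COUNTS = {
    "A": [0, 0, 1, 0],
    "C": [0, 0, 17, 0],
    "G": [20, 0, 1, 0],
    "T": [0, 20, 1, 20],
}

def log_odds_matrix(counts, pseudocount=0.5, background=0.25):
    """Convert per-position base counts into log2 odds against a uniform background."""
    width = len(counts["A"])
    pwm = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        pwm.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pwm

def scan(seq, pwm, threshold):
    """Return (start, window, score) for each forward-strand window scoring >= threshold."""
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue  # skip windows containing N or other ambiguity codes
        score = sum(column[base] for column, base in zip(pwm, window))
        if score >= threshold:
            hits.append((start, window, round(score, 2)))
    return hits

if __name__ == "__main__":
    pwm = log_odds_matrix(COUNTS)
    for hit in scan("AAGTCTAAGTATCC", pwm, threshold=3.0):
        print(hit)
```

With this toy matrix, the window GTAT is still reported, with a lower score than GTCT, whereas an exact search for GTCT would miss it.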


There is now also a wrapper package for the MEME Suite (including FIMO) in R/Bioconductor: https://bioconductor.org/packages/release/bioc/html/memes.html
You could also use the HOCOMOCO motif collection, which offers motif downloads directly in MEME format.
