Question

Transcription factor binding site prediction

0

5.1 years ago

shwetamgr1 ▴ 10

How can I predict transcription factor binding sites for a crop when I have the gene sequence, genome sequence, and promoter sequence? And how can I find the distribution of a particular transcription factor's binding sites across the whole genome using that information?

gene R genome alignment sequence • 1.7k views

ADD COMMENT • link updated 5.1 years ago by Alex Reynolds 35k • written 5.1 years ago by shwetamgr1 ▴ 10

2

What have you tried so far? If you look around (Google), you should be able to find some decent starting points for this.

(Additionally, why did you add R as a tag to your question? Do you want an R-based solution?)

ADD REPLY • link 5.1 years ago by lieven.sterck 15k

0

Yes, I was using the "BiocManager" package in R, just to see whether it would be helpful. Other than R, I have looked at JASPAR and the MEME Suite, but I was not able to get them to work. I just want to know whether there is any other option.

ADD REPLY • link 5.1 years ago by shwetamgr1 ▴ 10

0

OK, can you elaborate on why JASPAR and/or MEME are not working for you?

ADD REPLY • link 5.1 years ago by lieven.sterck 15k

0

My crop is Vigna spp., which I have not found in JASPAR.

ADD REPLY • link 5.1 years ago by shwetamgr1 ▴ 10

0

Could you review approaches here: http://planttfdb.cbi.pku.edu.cn/index.php?sp=Vun and here: https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-017-4306-1 for TF discovery? You might even contact authors for advice and code, to set up a collaboration to run things for your particular species.

ADD REPLY • link 5.1 years ago by Alex Reynolds 35k

0

Thank you so much, I think this is working.

ADD REPLY • link 5.1 years ago by shwetamgr1 ▴ 10

score 4 · Answer 1 · 2019-10-31

4

5.1 years ago

Alex Reynolds 35k

You could run FIMO on your entire genome for TFs (transcription factors; DNA-binding proteins) of interest, which gives you binding sites: genomic intervals where those TFs bind. Here's a post where I wrote up step-by-step instructions:

https://bioinformatics.stackexchange.com/a/2491/776
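For readers who want to see what a motif scan does under the hood, here is a minimal Python sketch of log-odds scanning with a position weight matrix. It is not FIMO and does not reproduce its statistics (FIMO reports p-values, whereas this uses a raw score cutoff). The count matrix, the uniform background, the pseudocount, and the threshold are all made-up assumptions for illustration, and only the forward strand is scanned.

```python
import math

# Hypothetical position frequency matrix (PFM) for a 4-bp motif:
# one dict of base counts per motif position. Real matrices would
# come from a motif database or from your own data.
PFM = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]


def pfm_to_pwm(pfm, background=0.25, pseudocount=1.0):
    """Convert counts to log2-odds scores against a uniform background."""
    pwm = []
    for column in pfm:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(seq, pwm, threshold):
    """Slide the motif along the forward strand of seq.

    Returns (start, end, score) tuples with 0-based, half-open
    coordinates (the BED convention). Windows containing anything
    other than A, C, G, T are skipped.
    """
    seq = seq.upper()
    width = len(pwm)
    hits = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        if any(base not in "ACGT" for base in window):
            continue
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, start + width, score))
    return hits


pwm = pfm_to_pwm(PFM)
hits = scan("TTACGTGGNACGTA", pwm, threshold=4.0)  # arbitrary cutoff
for start, end, score in hits:
    print(start, end, round(score, 2))
```

A real scan would also score the reverse complement and use a background model estimated from the genome being scanned.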

Once you have TF binding sites, you can easily intersect them with gene promoters or other annotations using BEDOPS tools such as bedops.

ADD COMMENT • link 5.1 years ago by Alex Reynolds 35k
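The intersection step mentioned in the answer above can be sketched in plain Python. This is only a stand-in to show the logic of keeping binding sites that overlap a promoter; it is not how BEDOPS is implemented, and the coordinates and names below are hypothetical. Intervals are assumed to be 0-based and half-open, as in BED files.

```python
from collections import defaultdict


def overlapping_sites(sites, promoters):
    """Return binding sites that overlap at least one promoter by >= 1 bp.

    sites:     iterable of (chrom, start, end, name)
    promoters: iterable of (chrom, start, end)
    """
    promoters_by_chrom = defaultdict(list)
    for chrom, start, end in promoters:
        promoters_by_chrom[chrom].append((start, end))

    kept = []
    for chrom, start, end, name in sites:
        # Two half-open intervals overlap when each one starts
        # before the other one ends.
        if any(start < p_end and p_start < end
               for p_start, p_end in promoters_by_chrom[chrom]):
            kept.append((chrom, start, end, name))
    return kept


# Hypothetical example data
sites = [
    ("chr1", 100, 110, "TF_A"),
    ("chr1", 5000, 5010, "TF_A"),
    ("chr2", 40, 50, "TF_B"),   # ends exactly where the promoter starts
]
promoters = [("chr1", 0, 1000), ("chr2", 50, 1050)]

result = overlapping_sites(sites, promoters)
print(result)
```

This naive version compares every site against every promoter on the same chromosome, which is fine for a toy example but slow at genome scale; dedicated tools work on sorted input for that reason.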

0

I suggest you first select the genomic targets you want and then run FIMO. Doing so reduces the genomic regions to be scanned from 100% (= entire genome) to something like < 1% (= all promoters), which will massively decrease the multiple testing burden.

ADD REPLY • link 5.1 years ago by ATpoint 85k
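To put rough numbers on the multiple testing point in the comment above, here is a small back-of-the-envelope sketch. The genome size and the 1% promoter fraction are hypothetical, and Bonferroni correction is used only because it is the simplest way to show how the significance cutoff scales with the number of positions tested; it is not a claim about which correction any particular tool applies.

```python
def bonferroni_threshold(n_positions, n_motifs=1, strands=2, alpha=0.05):
    """Per-test p-value cutoff when every position, motif, and strand
    counts as one test."""
    n_tests = n_positions * n_motifs * strands
    return alpha / n_tests


GENOME_BP = 500_000_000          # hypothetical genome size
PROMOTER_BP = GENOME_BP // 100   # hypothetical: promoters cover 1%

genome_cutoff = bonferroni_threshold(GENOME_BP)
promoter_cutoff = bonferroni_threshold(PROMOTER_BP)

print(f"whole genome : p < {genome_cutoff:.1e}")
print(f"promoters    : p < {promoter_cutoff:.1e}")
print(f"cutoff relaxed by a factor of {promoter_cutoff / genome_cutoff:.0f}")
```

Scanning 1% of the sequence relaxes the per-site cutoff a hundredfold under these assumptions, so weaker but real matches in promoters are less likely to be lost to correction.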

0

Depends on your experiment, I guess. A reference genome rarely changes, but targets could easily do so. You only have to scan a reference genome once to build a "database" of binding sites. If your targets change, all you have to do is scan with bedops etc. against the reference binding sites, which can take seconds, while re-running FIMO could take considerably more time.

ADD REPLY • link 5.1 years ago by Alex Reynolds 35k
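The "scan once, query many times" idea from the comment above can be illustrated with a small in-memory index. This is a sketch under assumptions: the class name `SiteIndex` and the example coordinates are hypothetical, and the binary search here just stands in for querying a precomputed, sorted binding-site file with a tool of your choice.

```python
import bisect
from collections import defaultdict


class SiteIndex:
    """Precomputed binding sites, sorted per chromosome for fast lookup.

    Intervals are 0-based and half-open.
    """

    def __init__(self, sites):
        self._intervals = defaultdict(list)
        for chrom, start, end in sites:
            self._intervals[chrom].append((start, end))
        self._starts = {}
        self._max_len = {}
        for chrom, intervals in self._intervals.items():
            intervals.sort()
            self._starts[chrom] = [s for s, _ in intervals]
            self._max_len[chrom] = max(e - s for s, e in intervals)

    def query(self, chrom, q_start, q_end):
        """Return all stored sites overlapping [q_start, q_end) on chrom."""
        if chrom not in self._starts:
            return []
        starts = self._starts[chrom]
        # A site can only reach past q_start if it starts no earlier
        # than q_start - (longest site) + 1.
        lo = bisect.bisect_left(starts, q_start - self._max_len[chrom] + 1)
        # Sites starting at or after q_end cannot overlap.
        hi = bisect.bisect_left(starts, q_end)
        return [(s, e) for s, e in self._intervals[chrom][lo:hi] if e > q_start]


# Build the "database" once (hypothetical sites) ...
index = SiteIndex([
    ("chr1", 100, 110), ("chr1", 105, 115), ("chr1", 900, 910),
    ("chr2", 10, 20),
])

# ... then query it with whatever targets are of interest today.
print(index.query("chr1", 108, 500))
print(index.query("chr1", 110, 200))
```

Each query costs two binary searches plus the size of the answer, so changing the target set never requires rescanning the genome.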

0

But without TF models from JASPAR or other sources, this is kind of a useless answer. Hopefully the OP can get in touch with some plant-specific TF specialists who can help him or her get the TF models needed to do this (on whatever scale).

ADD REPLY • link 5.1 years ago by Alex Reynolds 35k

0

I would probably give the plant PFMs from JASPAR a try: http://jaspar.genereg.net/downloads/. ReMap now also contains Arabidopsis data (http://remap.univ-amu.fr); maybe the TF of interest is in there, and one might extract useful information or at least build PFMs from it.

ADD REPLY • link 5.1 years ago by ATpoint 85k
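On building PFMs, as suggested in the comment above: a position frequency matrix is just a per-position count of bases across aligned binding-site sequences. Here is a minimal sketch, assuming you already have equal-length, aligned site sequences (the four sequences below are invented). How to extract and align real sites from ChIP-seq peaks is a separate problem that motif discovery tools address.

```python
def build_pfm(aligned_sites):
    """Count A/C/G/T at each position of equal-length aligned sequences.

    Returns a list with one {base: count} dict per motif position.
    Ambiguous bases (e.g. N) are ignored.
    """
    if not aligned_sites:
        raise ValueError("need at least one sequence")
    width = len(aligned_sites[0])
    if any(len(seq) != width for seq in aligned_sites):
        raise ValueError("all sequences must have the same length")

    pfm = [{base: 0 for base in "ACGT"} for _ in range(width)]
    for seq in aligned_sites:
        for position, base in enumerate(seq.upper()):
            if base in "ACGT":
                pfm[position][base] += 1
    return pfm


# Hypothetical aligned binding sites
example_sites = ["CACGTG", "CACGTG", "TACGTG", "CACGTT"]
pfm = build_pfm(example_sites)

# Print one row per base, one column per motif position.
for base in "ACGT":
    print(base, [column[base] for column in pfm])
```

With only a handful of sites the counts are noisy, so a matrix like this should be treated as a starting point rather than a validated model.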