21 months ago
sunyeping
Hello all,
Do you know of any database or software that can help define the positions and sequences of the promoters that a given transcription factor binds for a given gene?
Best regards
Why is it so difficult to provide proper information? Presumably you know that finding promoters in prokaryotes and finding them in eukaryotes are two completely different problems. The former is relatively well understood and there are databases for it, while the latter faces significant challenges.
It is easy to find a good deal of info on this subject by Googling.
Thank you for your response. I would like to identify promoters of eukaryotic genes, for example, the promoters of the Itga1 and Itgae genes. What is the accepted software or database for this, and what is the procedure? I have tried Googling but still have not found a satisfying answer.
regards.
Please describe in more detail what kind of input you have and which organism you are working with.
As a rule of thumb, everything between -500 bp and +100 bp around a transcription start site (TSS) is typically considered the promoter region. For organisms with transcript annotation, you can retrieve the sequences with BioMart or extract them from the reference genome FASTA with bedtools or SeqKit.
To define the TSS, you can look for published CAGE-seq data (e.g. that of the FANTOM consortium) if your organism is eukaryotic, since the method depends on capped RNAs. ChIP-seq data for your transcription factor can be used to run a motif search with HOMER. The lab of Aviv Regev has published corresponding ML models, etc.
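For illustration, the -500/+100 rule of thumb can be turned into coordinates roughly as follows. This is only a sketch: the function names are made up for this example, the TSS is assumed to be given as a 0-based position, and the output is a BED-style 0-based, half-open interval. On the minus strand "upstream" lies at higher genomic coordinates, so the window is mirrored and the extracted sequence is reverse-complemented.

```python
# Hypothetical helpers for illustration; not part of BioMart, bedtools or SeqKit.

_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def promoter_interval(tss0, strand, upstream=500, downstream=100):
    """Return a 0-based, half-open (start, end) window around a TSS.

    tss0 is the 0-based coordinate of the TSS base. The window spans
    `upstream` bases before the TSS and `downstream` bases starting at it.
    """
    if strand == "+":
        start, end = tss0 - upstream, tss0 + downstream
    elif strand == "-":
        # Mirror the window: upstream is towards higher coordinates.
        start, end = tss0 - downstream + 1, tss0 + upstream + 1
    else:
        raise ValueError("strand must be '+' or '-'")
    # Clamp at the chromosome start so the interval stays valid.
    return max(start, 0), end


def reverse_complement(seq):
    """Reverse-complement a DNA string (A/C/G/T/N, either case)."""
    return seq.translate(_COMPLEMENT)[::-1]


def promoter_sequence(chrom_seq, tss0, strand, upstream=500, downstream=100):
    """Slice the promoter window out of a chromosome string, 5'->3' on the gene's strand."""
    start, end = promoter_interval(tss0, strand, upstream, downstream)
    seq = chrom_seq[start:end]
    return reverse_complement(seq) if strand == "-" else seq
```

The intervals from `promoter_interval` could be written to a BED file and passed to a tool such as bedtools to pull the sequences from the real reference genome; the in-memory `promoter_sequence` variant is only practical for small test sequences.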
The organism is mouse.
Short answer: you won't find a good 'black box' you can just run for this (at least with any confidence in the output).
Long answer: do an RNA-seq experiment, obtain the transcriptional data for your genome, and look for reads mapping to intergenic regions upstream of genes. This will still be a 'woolly' answer though, as promoter boundaries are not always concrete.
Hello Jeo,
We found that Jun/Fos are highly expressed in our RNA-seq data. But when you say to look for reads mapping to regions upstream of genes, do you mean ATAC-seq?
I don't need exact boundaries. I just need to know whether Jun/Fos binds upstream of a given gene and which sequences are necessary for the binding. Is there no bioinformatic tool to do this?
As I understand it, RNA polymerases also bind to promoters for transcription. Are the promoter sequences for RNA polymerase binding different from those for transcription factors? How can I define the promoter sequences for RNA polymerase binding upstream of a given gene?
Thank you.
You can use the UCSC Genome Browser to get the sequence of X bases upstream of a gene, then use tools in the MEME Suite/JASPAR to look for Jun/Fos motifs in that region.
I think what's being conveyed by the other user is that there's the proximal promoter (which the above analysis will capture) but also distal promoters/enhancers that could be thousands and thousands of bases away that will be missed doing this quick type of analysis. You might be better off doing a ChIP-seq experiment or taking advantage of publicly available ChIP-seq data.
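As a rough illustration of the quick upstream motif scan mentioned above, the sketch below searches a sequence for the AP-1 (Jun/Fos) consensus site TGA(C/G)TCA. It is a simplification: real tools such as those in the MEME Suite score a position weight matrix (for example one taken from JASPAR) instead of requiring an exact consensus match, and the function name here is made up for this example. Because the two consensus variants are reverse complements of each other, one pass over the forward strand covers both strands.

```python
import re

# AP-1 consensus TGA(C/G)TCA; the lookahead lets overlapping hits be reported.
AP1_CONSENSUS = re.compile(r"(?=(TGA[CG]TCA))", re.IGNORECASE)


def find_ap1_sites(seq):
    """Return (0-based position, matched site) for each exact consensus hit."""
    return [(m.start(), m.group(1).upper()) for m in AP1_CONSENSUS.finditer(seq)]


if __name__ == "__main__":
    # Toy sequence, not a real promoter.
    for pos, site in find_ap1_sites("GGGTGACTCAGGGaaTGAGTCAtt"):
        print(pos, site)
```

A hit only shows that the sequence could be bound. Whether Jun/Fos actually occupies the site in a given cell type still needs ChIP-seq or similar evidence.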