Question

Identifying transcription factors for a list of genes

23 months ago

lbombini • 0

Is there a way to find transcription factors for mouse genes if the Ensembl IDs are known?

In an article I'm trying to recreate, transcription factor features were retrieved from GTRD database, which is no longer in service. Here is the study and corresponding citation:

"For transcription factors, we mapped the Gene Transcription Regulatory Database (v18_06)14 to ±200 nucleotides to transcriptional start sites supplied by BioMart for the human reference genome build GRCh38.p12 and the mouse reference genome build GRCm38.p6."

Unfortunately, I wasn't able to find a substitute for GTRD. Any suggestions?

*I have found a list of all known mouse TFs in AnimalTFDB v4.0, but I haven't worked out how to link them to my list of Ensembl gene IDs.

transcription-factor • 2.5k views

ADD COMMENT • link updated 22 months ago by Ram 45k • written 23 months ago by lbombini • 0

from GTRD database, which is no longer in service

Are you sure? I see it here.

https://gtrd.biouml.org/

ADD REPLY • link 23 months ago by GenoMax 153k

Yep, it doesn't work from Ukraine. I tried a VPN and that didn't help either.

ADD REPLY • link 23 months ago by lbombini • 0

It also doesn't load for me, so you're not going crazy.

ADD REPLY • link 23 months ago by Trivas ★ 1.9k

Ram · Answer 1 · 2023-08-28

23 months ago

Yogi ▴ 70

Funnily enough, I had to do something similar recently. After a good amount of background research, it turns out that this is the best and most widely cited resource for this.

Let me know what you think!

AnimalTFDB 4.0

ADD COMMENT • link 23 months ago by Yogi ▴ 70

Thank you very much for the response! I've come across this resource as well, but the only thing I've found there is the list of TFs for mouse. Is there a way to "connect" TFs to the genes they regulate?

My final goal is a matrix with genes as rows and TFs as columns, with 1 and 0 describing the interaction

1 - TF regulates the expression of the gene
0 - TF does not interact with a gene

gene Ensembl ID / TF symbol   Tbx2  Dmtf1  Irx4  Irf3
ENSMUSG00000000001            1     0      1     1
ENSMUSG00000000003            0     0      0     0
ENSMUSG00000000028            0     1      0     0
ENSMUSG00000000031            0     1      0     0
ENSMUSG00000000037            0     0      0     1
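
Once TF-to-target pairs are available from some resource, the matrix described above can be built from a two-column edge list. The following is a minimal sketch in base R, assuming a hypothetical `edges` data frame with columns `tf` and `target`; the pairs are toy values copied from the example table, not real regulatory data.

```r
# Toy TF-to-target edge list (illustrative only; real pairs would come
# from a TF-target resource). Column names are assumptions.
edges <- data.frame(
  tf     = c("Tbx2", "Irx4", "Irf3", "Dmtf1", "Dmtf1", "Irf3"),
  target = c("ENSMUSG00000000001", "ENSMUSG00000000001", "ENSMUSG00000000001",
             "ENSMUSG00000000028", "ENSMUSG00000000031", "ENSMUSG00000000037"),
  stringsAsFactors = FALSE
)

# Genes and TFs of interest, including a gene with no known regulator.
genes <- c("ENSMUSG00000000001", "ENSMUSG00000000003", "ENSMUSG00000000028",
           "ENSMUSG00000000031", "ENSMUSG00000000037")
tfs <- c("Tbx2", "Dmtf1", "Irx4", "Irf3")

# Fixing the factor levels keeps genes and TFs without any edge
# as all-zero rows and columns.
counts <- table(
  factor(edges$target, levels = genes),
  factor(edges$tf, levels = tfs)
)

# Collapse counts to presence (1) / absence (0).
tf_matrix <- (unclass(counts) > 0) * 1L
print(tf_matrix)
```

Pairs whose gene or TF is not in `genes` or `tfs` are silently dropped by the factor conversion, which is usually the desired behaviour when restricting to a gene list.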

ADD REPLY • link updated 22 months ago by Ram 45k • written 23 months ago by lbombini • 0

Funnily enough, I was also trying to do this. It seems like you want what I also recently needed: a mouse-specific network linking all known TFs to their targets.

JASPAR was the only place where I found genome-wide scans for TONS of different TFs.

HOCOMOCO v11 is another great resource, but there's a tradeoff compared to JASPAR: HOCOMOCO is based on experimental evidence, which is biased by the tissue it comes from, as opposed to straight-up scanning the entire genome sequence.

ADD REPLY • link 23 months ago by Yogi ▴ 70

Ram · Answer 2 · 2023-08-29

You can use an R package called dorothea to look up transcription factors and the genes they regulate. It provides a comprehensive, curated collection of TFs and their target genes.

library(dorothea)
library(OmnipathR)
library(decoupleR)

# Available only for human and mouse; human is shown here, and organism = "mouse" should be the value to pass for mouse genes
net <- decoupleR::get_dorothea(levels = c('A', 'B', 'C', 'D'), organism = "human")
[2023-08-29 09:44:20] [SUCCESS] [OmnipathR] Loaded 278830 interactions from cache.
head(net)
# A tibble: 6 × 4
  source confidence target   mor
  <chr>  <chr>      <chr>  <dbl>
1 MYC    A          TERT    1   
2 JUN    D          SMAD3   0.25
3 SMAD3  A          JUN     1   
4 JUN    D          SMAD4   0.25
5 SMAD4  A          JUN     1   
6 RELA   D          FAS     0.25
# Confidence levels go from A to D, A being the most confident and D the least.
# mor: mode of regulation (-1 or 1) divided by a weight for each confidence level; larger weights give values closer to zero (e.g. 0.25 for level D above)

#subsetting table for MYC gene (a TF)
myc <- net[net$source %in% c("MYC"),]
head(myc)
# A tibble: 6 × 4
  source confidence target   mor
  <chr>  <chr>      <chr>  <dbl>
1 MYC    A          TERT       1
2 MYC    A          ENO1       1
3 MYC    A          CDC25A     1
4 MYC    A          CXCR4      1
5 MYC    A          CDKN1A    -1
6 MYC    A          TP53       1
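
The subsetting above starts from a TF. For the original question, the lookup goes the other way: from a target gene to its TFs. Here is a minimal sketch of that direction, assuming a table with the same `source`, `confidence`, `target` and `mor` columns as `net`; `net_toy` is a hypothetical stand-in that reuses the six rows printed by `head(net)` so that it runs without downloading anything.

```r
# Stand-in for `net`, rebuilt from the head(net) output shown above.
net_toy <- data.frame(
  source     = c("MYC", "JUN", "SMAD3", "JUN", "SMAD4", "RELA"),
  confidence = c("A", "D", "A", "D", "A", "D"),
  target     = c("TERT", "SMAD3", "JUN", "SMAD4", "JUN", "FAS"),
  mor        = c(1, 0.25, 1, 0.25, 1, 0.25),
  stringsAsFactors = FALSE
)

# Keep only the higher-confidence interactions (the A/B cutoff is an
# arbitrary choice for illustration).
high_conf <- net_toy[net_toy$confidence %in% c("A", "B"), ]

# Group TFs by the gene they regulate: one character vector per target.
regulators <- split(high_conf$source, high_conf$target)

# TFs reported to regulate JUN in this toy table.
print(regulators[["JUN"]])
```

The same `split()` call on the full `net` table would give a named list that can be indexed with any gene symbol from a list of interest.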