Retrieve transcription factors from gene set using biomaRt
2.3 years ago

I have a data frame meth with genes (HGNC symbols) as row names and samples as column names. I want to find which of the genes in the row names are transcription factors, using biomaRt in R. The result should be returned as a character vector.

Example:

> rownames(meth)
   [1] "A1BG"            "A1CF"            "A2BP1"           "A2LD1"          
   [5] "A2M"             "A2ML1"           "A4GALT"          "AAAS"   

If A1BG, A2BP1, and A2LD1 are transcription factors, return them as a vector:

[1] "A1BG"  "A2BP1" "A2LD1"

On the BioMart website I can choose, for example, Database: Ensembl Regulation 107 and Dataset: Human Regulatory Features.

But I want to find the TFs using R code.

My preliminary attempt did not filter for transcription factors.

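# Biomart query
library(biomaRt)

if (interactive()) {
  mart <- useEnsembl(biomart = "ensembl",
                     dataset = "hsapiens_gene_ensembl")
  # filters tells getBM which field the supplied values refer to;
  # here the query is restricted to the gene symbols in meth
  getBM(attributes = c("ensembl_gene_id", "p_value", "hgnc_symbol", "entrezgene_id"),
        filters = "hgnc_symbol",
        values = rownames(meth),
        mart = mart)
}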

Hi, you could download the list of human TFs from this website: http://humantfs.ccbr.utoronto.ca/download.php

(This TF list is part of this Cell review: https://www.sciencedirect.com/science/article/pii/S0092867418301065?via%3Dihub)

Then you can check which of these TFs match the row names of your data frame using an R function such as inner_join from the dplyr package.
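A minimal sketch of that matching step in base R, assuming the TF symbols have already been loaded into a character vector. The names `tf_symbols` and `tf_in_meth` are illustrative, the toy `meth` data frame stands in for the real one, and the three TF symbols are placeholders for the full downloaded list.

```r
# Toy stand-in for the methylation data frame: gene symbols as row names
meth <- data.frame(
  sample1 = c(0.1, 0.5, 0.3, 0.8, 0.2, 0.9),
  sample2 = c(0.2, 0.4, 0.6, 0.7, 0.1, 0.5),
  row.names = c("A1BG", "A1CF", "A2M", "AAAS", "MYC", "TP53")
)

# Placeholder TF symbols (assumption); in practice this vector would come
# from the symbol column of the downloaded TF list
tf_symbols <- c("TP53", "MYC", "SOX2")

# Keep only the row names that appear in the TF list, preserving the
# order in which they occur in meth
tf_in_meth <- rownames(meth)[rownames(meth) %in% tf_symbols]
print(tf_in_meth)
```

Because the gene symbols are row names rather than a column, a plain `%in%` lookup avoids having to convert them into a column first, which a join such as inner_join would require.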

2.3 years ago
ATpoint 86k

Use a dedicated database rather than reinventing the wheel. We usually use http://bioinfo.life.hust.edu.cn/AnimalTFDB/#!/download
