Question

ChEMBL ID to Ensembl ID


3.8 years ago

Shicheng Guo ★ 9.6k

I am looking for an R or bash script, or an API, to convert a ChEMBL ID to an Ensembl ID, for example:

ChEMBL ID (CHEMBL3705229) to Ensembl ID (ENSG00000105397)

https://www.ebi.ac.uk/chembl/assay_report_card/CHEMBL3705229/

https://useast.ensembl.org/Homo_sapiens/Gene/Summary?db=core;g=ENSG00000105397;r=19:10350529-10380572

Thanks for sharing.

ChEMBL Ensembl • 1.5k views

updated 3.8 years ago by Pratik ★ 1.1k • written 3.8 years ago by Shicheng Guo ★ 9.6k

score 1 · Answer 1 · 2021-07-18

Okay, I just worked something out. The script uses a directory named Biostars on your Desktop for the downloads. It builds a data frame of matched ChEMBL, UniProt, and Ensembl gene IDs, and you can do the rest of what you need from that data frame.

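# Working directory for the downloads (created if it does not exist yet)
dir.create("~/Desktop/Biostars", recursive = TRUE, showWarnings = FALSE)

# Download the ChEMBL-to-UniProt mapping and keep the first two columns
curl::curl_download(url = "ftp.ebi.ac.uk/pub/databases/chembl/ChEMBLdb/latest/chembl_uniprot_mapping.txt", "~/Desktop/Biostars/chembl_uniprot_mapping.txt", quiet = FALSE)
# quote = "" stops apostrophes in the name column from being read as quote characters
chembl <- read.table("~/Desktop/Biostars/chembl_uniprot_mapping.txt", sep = "\t", quote = "", stringsAsFactors = FALSE)
chembl <- chembl[, 1:2]
colnames(chembl) <- c("uniprot", "chembl")

# Download and unpack the human UniProt ID mapping table
curl::curl_download(url = "ftp.uniprot.org/pub/databases/uniprot/current_release/knowledgebase/idmapping/by_organism/HUMAN_9606_idmapping_selected.tab.gz", "~/Desktop/Biostars/HUMAN_9606_idmapping_selected.tab.gz", quiet = FALSE)
# -f overwrites a previously unpacked copy instead of stopping to ask
system("gunzip -f ~/Desktop/Biostars/HUMAN_9606_idmapping_selected.tab.gz")
uniprot.db <- read.table("~/Desktop/Biostars/HUMAN_9606_idmapping_selected.tab", sep = "\t", quote = "", comment.char = "", stringsAsFactors = FALSE)
# Column 1 is the UniProt accession, column 19 the Ensembl gene ID
ensembl.uniprot <- data.frame(ensembl = uniprot.db$V19, uniprot = uniprot.db$V1, stringsAsFactors = FALSE)

# Join the two tables on the shared UniProt accession
ensembl.uniprot.chembl <- merge(chembl, ensembl.uniprot, by = "uniprot")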

Here is a sample of the data frame: (image not preserved)
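Once the merged data frame exists, looking up one ID is a simple filter. Here is a minimal sketch of that step, not part of the original script: `chembl_to_ensembl` is a hypothetical helper name, the `toy` data frame and every ID in it are made up for illustration, and the code assumes that a cell holding several Ensembl IDs separates them with "; ".

```r
# Toy stand-in for ensembl.uniprot.chembl; all IDs below are made up.
toy <- data.frame(
  uniprot = c("P00001", "P00002", "P00003"),
  chembl  = c("CHEMBL0000001", "CHEMBL0000002", "CHEMBL0000002"),
  ensembl = c("ENSG00000000001", "ENSG00000000002; ENSG00000000003", ""),
  stringsAsFactors = FALSE
)

# Hypothetical helper: unique Ensembl gene IDs matched to one ChEMBL ID.
chembl_to_ensembl <- function(mapping, chembl_id) {
  hits <- mapping$ensembl[mapping$chembl == chembl_id]
  # Assumption: several Ensembl IDs in one cell are separated by "; ".
  ids <- as.character(unlist(strsplit(as.character(hits), ";", fixed = TRUE)))
  ids <- trimws(ids)
  # Drop empty cells and missing values, then de-duplicate.
  unique(ids[!is.na(ids) & nzchar(ids)])
}

result <- chembl_to_ensembl(toy, "CHEMBL0000002")
print(result)
```

With the real data you would pass `ensembl.uniprot.chembl` instead of `toy`. As far as I can tell, the ChEMBL mapping file is keyed by target IDs, so an assay ID such as the one in the question may first need to be resolved to its target before the lookup returns anything.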