Map Ensembl gene IDs from hg38 to hg19
8.5 years ago

I want to map all Ensembl gene IDs from hg38 to hg19. Any help will be appreciated. Thanks!

Example :

Loading library

library(biomaRt)

List of miRNA from hg38

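# Connect to the current Ensembl mart (GRCh38)
grch38     <- useMart("ensembl", dataset="hsapiens_gene_ensembl")
# Fetch gene IDs for all transcripts with biotype miRNA (one filter, one value)
miRNA38    <- getBM(attributes=c("ensembl_gene_id","transcript_biotype"),
                    filters="transcript_biotype", values="miRNA", mart=grch38)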

Result: 4555 Ensembl gene IDs in total

List of miRNA from hg19

# Connect to the archived GRCh37 (hg19) mart
grch37     <- useMart(biomart="ENSEMBL_MART_ENSEMBL", host="grch37.ensembl.org",
                      path="/biomart/martservice", dataset="hsapiens_gene_ensembl")
# Same miRNA query against GRCh37 (one filter, one value)
miRNA37    <- getBM(attributes=c("ensembl_gene_id","transcript_biotype"),
                    filters="transcript_biotype", values="miRNA", mart=grch37)

Result: 3411 Ensembl gene IDs in total

Extract hg38/GRCh38 ensembl_gene_id values from hg19/GRCh37

# Look up the GRCh38 miRNA gene IDs in the GRCh37 mart
en_id_hg38  <- miRNA38$ensembl_gene_id
miRNA38_19  <- getBM(attributes=c("ensembl_gene_id","transcript_biotype"),
                     filters="ensembl_gene_id", values=en_id_hg38, mart=grch37)

Result: 2802 Ensembl gene IDs in total. But the remaining 1753 (4555 - 2802) hg38 Ensembl gene IDs are not mapped in hg19.
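
To see which IDs are missing, not just how many, the two ID vectors can be compared with base R set operations. This is only a minimal sketch: the short vectors below are made-up placeholders standing in for `miRNA38$ensembl_gene_id` and `miRNA38_19$ensembl_gene_id`, not real query results.

```r
# Placeholder vectors standing in for miRNA38$ensembl_gene_id and
# miRNA38_19$ensembl_gene_id (illustrative IDs only, not real results)
ids_hg38       <- c("ENSG00000000001", "ENSG00000000002",
                    "ENSG00000000003", "ENSG00000000004")
ids_found_hg19 <- c("ENSG00000000001", "ENSG00000000003")

# IDs present in the GRCh38 list but not returned by the GRCh37 mart
unmapped <- setdiff(ids_hg38, ids_found_hg19)

# Sanity check: mapped + unmapped should equal the number of unique input IDs
n_total    <- length(unique(ids_hg38))
n_mapped   <- length(intersect(ids_hg38, ids_found_hg19))
n_unmapped <- length(unmapped)
stopifnot(n_mapped + n_unmapped == n_total)

print(unmapped)
```

Because `getBM()` returns one row per attribute combination, counting unique IDs this way may give slightly different totals than counting rows.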

Now, how can I map these 1753 hg38 Ensembl IDs to hg19?

sequencing hg38 hg19 • 8.9k views

Unless there have been changes in the gene structure, the Ensembl gene ID should be the same across releases and assemblies, e.g. the Ensembl gene ID for BRCA2 is ENSG00000139618 in both GRCh38 and GRCh37. However, even minor changes, in the UTR for example, will result in a different ID being assigned. Have you got a gene (or list of genes), and do you know what changed between the assemblies?

Example added to the main question!

As far as I understand, the OP would like to convert IDs, not coordinates, though.

That's true, my fault.

There could be two things going on here. First, some of the IDs found in GRCh38 but not in GRCh37 may simply belong to loci that were not annotated in GRCh37 at all, only in GRCh38. The other possibility is that the loci are in GRCh37 but got a different ENSG ID in GRCh38 because the gene models changed. Perhaps you could get the latest GTF files from the Ensembl FTP sites and compare them (GRCh38 and GRCh37).
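
A rough sketch of such a GTF comparison in base R, separating the two cases. Everything here is illustrative: the inline GTF lines, the `ENSG_TOY_*` and `MIR_TOY_*` identifiers, and the helper names `get_attr` and `read_gtf_genes` are made up for the example, and matching on `gene_name` is an assumption about how renamed loci could be found, not an official Ensembl mapping.

```r
# Pull one key's value out of the GTF attribute column (column 9)
get_attr <- function(attr, key) {
  pattern <- paste0(key, ' "([^"]*)"')
  m <- regmatches(attr, regexec(pattern, attr))
  vapply(m, function(x) if (length(x) >= 2) x[2] else NA_character_,
         character(1))
}

# Read GTF text and keep one row per gene feature
# (for real files, pass file = "path/to/file.gtf" instead of text = lines)
read_gtf_genes <- function(lines) {
  gtf <- read.delim(text = lines, header = FALSE, sep = "\t", quote = "",
                    comment.char = "#", stringsAsFactors = FALSE)
  gtf <- gtf[gtf$V3 == "gene", ]
  data.frame(gene_id   = get_attr(gtf$V9, "gene_id"),
             gene_name = get_attr(gtf$V9, "gene_name"),
             biotype   = get_attr(gtf$V9, "gene_biotype"),
             stringsAsFactors = FALSE)
}

# Toy GTF lines (hypothetical IDs and names, for illustration only)
gtf38_txt <- c(
  '1\tensembl\tgene\t100\t200\t.\t+\t.\tgene_id "ENSG_TOY_A"; gene_name "MIR_TOY_X"; gene_biotype "miRNA";',
  '1\tensembl\tgene\t300\t400\t.\t+\t.\tgene_id "ENSG_TOY_B2"; gene_name "MIR_TOY_Y"; gene_biotype "miRNA";',
  '1\tensembl\tgene\t500\t600\t.\t-\t.\tgene_id "ENSG_TOY_C"; gene_name "MIR_TOY_Z"; gene_biotype "miRNA";'
)
gtf37_txt <- c(
  '1\tensembl\tgene\t110\t210\t.\t+\t.\tgene_id "ENSG_TOY_A"; gene_name "MIR_TOY_X"; gene_biotype "miRNA";',
  '1\tensembl\tgene\t310\t410\t.\t+\t.\tgene_id "ENSG_TOY_B1"; gene_name "MIR_TOY_Y"; gene_biotype "miRNA";'
)

genes38 <- read_gtf_genes(gtf38_txt)
genes37 <- read_gtf_genes(gtf37_txt)
mir38   <- genes38[genes38$biotype == "miRNA", ]

# IDs shared by both annotations
same_id <- intersect(mir38$gene_id, genes37$gene_id)

# GRCh38 miRNA genes whose ID is absent from the GRCh37 annotation
missing <- mir38[!mir38$gene_id %in% genes37$gene_id, ]

# Case 2: same gene name but a different ENSG ID in GRCh37
by_name <- merge(missing, genes37, by = "gene_name",
                 suffixes = c("_hg38", "_hg19"))

# Case 1: no GRCh37 gene with that name either (possibly not annotated)
unannotated <- setdiff(missing$gene_name, genes37$gene_name)

print(by_name[, c("gene_name", "gene_id_hg38", "gene_id_hg19")])
print(unannotated)
```

Gene names are neither guaranteed unique nor stable between annotations, so a name match is best treated as a candidate to check by hand rather than a confirmed mapping.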

Thanks for your time and reply.

I have also tried the GTF files from the Ensembl FTP site for both GRCh37 and GRCh38, and tried to map the miRNA (biotype) ENSG IDs from GRCh38 to GRCh37, but could not figure out a working solution. :(

Post a few examples of IDs that do not map so @Denise can figure out what is going on.

Or better still, email the Ensembl helpdesk as they can find out if this is a feature or a bug, if you know what I mean.
