biomaRt function getBM cannot get the chromosome location of some genes
5.0 years ago
Feng Zhang ▴ 10

I am retrieving the chromosome locations of a list of human genes using getBM. However, for some genes, such as "LILRA3", getBM does not return a location on a standard chromosome.

getBM(attributes=c("chromosome_name",'hgnc_symbol',"start_position","end_position"), filters ="hgnc_symbol", values = "LILRA3",  mart = ensembl)

             chromosome_name hgnc_symbol start_position end_position
1 CHR_HSCHR19LRC_COX2_CTG3_1      LILRA3       54294043     54306352
2 CHR_HSCHR19LRC_COX1_CTG3_1      LILRA3       54294163     54306461
3 CHR_HSCHR19LRC_PGF2_CTG3_1      LILRA3       54294418     54306726
4       CHR_HSCHR19_4_CTG3_1      LILRA3       54294391     54301485

However, a manual search on NCBI shows that LILRA3 is on chromosome 19.

Is this because the Ensembl data is out of date, or is there some other reason? How can I solve this?

Thanks!

---------------update-----------------------

I think I have an answer. The gene list I have may come from an earlier version of the human genome annotation, and some of its gene symbols are not in the current version. That would explain why I cannot find them using biomaRt or in the current GFF3 annotation file.

However, I still don't know what changed between versions of the human genome annotation, or why some gene symbols are no longer mapped to a chromosome.

gene genome R biomart • 1.7k views

How did you define ensembl? Is it ensembl <- useMart("ensembl", dataset="hsapiens_gene_ensembl")?


Yes, and I have added a possible answer as an update to the question.

5.0 years ago

This is just one of those 'problematic' regions of the genome; this specific one is outlined here: Human Genome Issue HG-1207

Curiously enough, it has a chr19 entry in the GRCh37 version of Ensembl:

require('biomaRt')

hg38

mart <- useMart("ENSEMBL_MART_ENSEMBL")
mart <- useDataset("hsapiens_gene_ensembl", mart)
getBM(
  attributes=c('chromosome_name',
    'hgnc_symbol','start_position','end_position'),
  filters = 'hgnc_symbol',
  values = 'LILRA3',
  mart = mart)

             chromosome_name hgnc_symbol start_position end_position
1 CHR_HSCHR19LRC_COX2_CTG3_1      LILRA3       54294043     54306352
2 CHR_HSCHR19LRC_COX1_CTG3_1      LILRA3       54294163     54306461
3 CHR_HSCHR19LRC_PGF2_CTG3_1      LILRA3       54294418     54306726
4       CHR_HSCHR19_4_CTG3_1      LILRA3       54294391     54301485

hg19

# GRCh37 archive; recent biomaRt versions expect the host to include https://
mart <- useMart('ENSEMBL_MART_ENSEMBL', host = 'https://grch37.ensembl.org')
mart <- useDataset("hsapiens_gene_ensembl", mart)
getBM(
  attributes=c('chromosome_name',
    'hgnc_symbol','start_position','end_position'),
  filters = 'hgnc_symbol',
  values = 'LILRA3',
  mart = mart)

       chromosome_name hgnc_symbol start_position end_position
1                   19      LILRA3       54799854     54809952
2 HSCHR19LRC_COX1_CTG1      LILRA3       54799626     54809724
3 HSCHR19LRC_COX2_CTG1      LILRA3       54799506     54809605
4 HSCHR19LRC_PGF2_CTG1      LILRA3       54799881     54809979
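
To make the difference between the two outputs explicit, here is a minimal sketch that works on the tables above without querying Ensembl. It assumes that the long names (CHR_HSCHR19LRC_COX2_CTG3_1 and so on) are patch or alternate-locus scaffolds rather than primary chromosomes, and that the parent chromosome can be read from the HSCHR<n> part of the name; that naming pattern is an assumption on my part, not something biomaRt guarantees. The helpers keep_primary and parent_chromosome are hypothetical names made up for this example.

```r
# Results copied by hand from the two getBM() outputs above (no network needed).
hg38 <- data.frame(
  chromosome_name = c("CHR_HSCHR19LRC_COX2_CTG3_1", "CHR_HSCHR19LRC_COX1_CTG3_1",
                      "CHR_HSCHR19LRC_PGF2_CTG3_1", "CHR_HSCHR19_4_CTG3_1"),
  hgnc_symbol     = "LILRA3",
  start_position  = c(54294043, 54294163, 54294418, 54294391),
  end_position    = c(54306352, 54306461, 54306726, 54301485),
  stringsAsFactors = FALSE
)
hg19 <- data.frame(
  chromosome_name = c("19", "HSCHR19LRC_COX1_CTG1",
                      "HSCHR19LRC_COX2_CTG1", "HSCHR19LRC_PGF2_CTG1"),
  hgnc_symbol     = "LILRA3",
  start_position  = c(54799854, 54799626, 54799506, 54799881),
  end_position    = c(54809952, 54809724, 54809605, 54809979),
  stringsAsFactors = FALSE
)

# Names of the primary-assembly chromosomes as they appear in chromosome_name.
primary_chromosomes <- c(as.character(1:22), "X", "Y", "MT")

# Hypothetical helper: keep only rows located on a primary chromosome.
keep_primary <- function(res) {
  res[res$chromosome_name %in% primary_chromosomes, , drop = FALSE]
}

# Hypothetical helper: guess the parent chromosome from a scaffold name.
# Assumes names look like [CHR_]HSCHR<chromosome>...; other names pass through unchanged.
parent_chromosome <- function(name) {
  sub("^(CHR_)?HSCHR([0-9]+|X|Y).*$", "\\2", name)
}

keep_primary(hg38)                        # no rows: only scaffold entries in hg38
keep_primary(hg19)                        # one row, on chromosome 19
parent_chromosome(hg38$chromosome_name)   # chromosome guessed for each scaffold
```

With the hg38 table the filter leaves nothing, which matches what the question describes, while the hg19 table keeps its single chromosome 19 row. If you only need "which chromosome" rather than exact primary-assembly coordinates, the guessed parent chromosome may be enough, but coordinates on a scaffold should not be treated as chromosome 19 coordinates.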