Missing gene symbols in biomart
10.2 years ago
oganm ▴ 60

I have a set of Swiss-Prot IDs that I want to convert to HGNC symbols. I am using biomaRt for this, and it works for most genes. For a small minority, though, it cannot find a symbol for a given ID, even though the BioMart web interface handles the conversion successfully.

I added an example for a single gene below.

library(biomaRt)

# connect to the human gene dataset in Ensembl BioMart
humanMart = useMart("ensembl", dataset="hsapiens_gene_ensembl")
# look up the HGNC symbol and Ensembl gene ID for one Swiss-Prot accession
humanTrans = getBM(attributes = c('uniprot_swissprot','hgnc_symbol','ensembl_gene_id'),
                   filters = 'uniprot_swissprot',
                   values = 'Q9Y2R4',
                   mart = humanMart)

humanTrans[humanTrans$uniprot_swissprot %in% 'Q9Y2R4',]
      uniprot_swissprot hgnc_symbol ensembl_gene_id
15191            Q9Y2R4             ENSG00000277594
R biomart • 4.8k views
10.2 years ago
Neilfws 49k

R/biomaRt connects to the exact same data source as the Ensembl web interface and should yield equivalent results if used correctly.

The example UniProt accession that you give does not map to an HGNC symbol in the web interface either (you may need to click "Results" to see this).
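To see which accessions come back without a symbol, the rows of a getBM() result can be split on the hgnc_symbol column. The sketch below is only illustrative: it builds a small data frame by hand so that it runs without a network connection, it assumes an unmapped symbol appears as an empty string (as in the output in the question) or as NA, and the P04637/TP53 row is a well-known mapping added for contrast rather than part of the original query.

```r
# Hand-built stand-in for a getBM() result so the snippet runs offline.
# The Q9Y2R4 row is copied from the question; the P04637 row is an
# illustrative mapping added for contrast.
humanTrans <- data.frame(
  uniprot_swissprot = c("P04637", "Q9Y2R4"),
  hgnc_symbol       = c("TP53", ""),
  ensembl_gene_id   = c("ENSG00000141510", "ENSG00000277594"),
  stringsAsFactors  = FALSE
)

# Assumption: a missing symbol shows up as "" or NA
isUnmapped <- is.na(humanTrans$hgnc_symbol) | humanTrans$hgnc_symbol == ""

unmapped <- humanTrans[isUnmapped, ]   # accessions with no HGNC symbol
mapped   <- humanTrans[!isUnmapped, ]  # accessions that were converted

print(unmapped$uniprot_swissprot)
```

The accessions in unmapped are the ones worth checking by hand in the web interface.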

Thanks. In that case, do you know where the data for biomart's ID converter web interface comes from?

From Ensembl.

10.2 years ago
cdsouthan ★ 1.9k

Your small minority should be the difference between these results

but if you add in the Ensembl mappings (from the UniProt side), the intersect drops some more

There are a number of reasons for the individual mismatches
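The set comparison described here can be sketched with base R set operations. The vectors below hold hypothetical placeholder accessions, not real query results; in practice they would come from whichever UniProt or Ensembl queries are being compared.

```r
# Hypothetical placeholder accessions, for illustration only
queried     <- c("ACC_1", "ACC_2", "ACC_3", "ACC_4")
withHgnc    <- c("ACC_1", "ACC_2", "ACC_3")  # entries with an HGNC cross-reference
withEnsembl <- c("ACC_1", "ACC_2", "ACC_4")  # entries with an Ensembl mapping

# accessions that were queried but have no HGNC cross-reference
missingSymbol <- setdiff(queried, withHgnc)

# accessions that have both kinds of mapping
both <- intersect(withHgnc, withEnsembl)

print(missingSymbol)
print(both)
```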
