Dear list,
I am trying to map UniProt accessions to gene symbols (official HUGO/HGNC gene symbols). I've tried several R approaches and a MySQL approach posted on this list before.
My R approaches use the org.Hs.eg.db and biomaRt packages. However, several UniProt accessions do not map to a gene symbol. I've tried a MySQL approach posted by @Pierre Lindenbaum, and it solves some cases, but I would like to modify it to return gene symbols rather than Ensembl IDs, although I could map the Ensembl IDs to gene symbols afterwards.
mysql approach
echo -e "Q7TNF6\nQ53XJ8\nP05787\nP0CG48\nQ96CG1\nD3DR86\nQ96FS5" |awk '{printf("select REF.acc,REF.extAcc1,REF.extAcc2,REF.extAcc3 from uniProt.extDbRef as REF, uniProt.extDb as EXT where EXT.val=\"ENSEMBL\" and EXT.id=REF.extDb and REF.acc=\"%s\";\n",$0);}' |mysql --user=genome --host=genome-mysql.cse.ucsc.edu -A -D hg19 -N
R approaches
library('org.Hs.eg.db')
annotation.col1 <- select(org.Hs.eg.db, keys=c('Q7TNF6','Q53XJ8','P05787','P0CG48','Q96CG1','D3DR86','Q96FS5'), columns=c('UNIPROT', 'SYMBOL', 'ENTREZID'), keytype="UNIPROT")
library('biomaRt')
ensembl <- useMart('ensembl', dataset="hsapiens_gene_ensembl")
annotation <- getBM(attributes=c("uniprot_swissprot_accession", "hgnc_symbol", "uniprot_genename"), filters="uniprot_swissprot_accession", values=c('Q7TNF6','Q53XJ8','P05787','P0CG48','Q96CG1','D3DR86','Q96FS5'), mart=ensembl)
I also noticed that the mappings from 'org.Hs.eg.db' and 'biomaRt' differ in which UniProt accessions fail to map.
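One way to make that difference concrete is to list the accessions each approach leaves without a symbol and then compare the two lists. The following is only a minimal sketch: the helper `unmapped` is a hypothetical name, and the two toy data frames (with placeholder symbols, not real query output) stand in for the `annotation.col1` and `annotation` tables produced above.

```r
# Hypothetical helper: return the queried accessions that have no usable
# symbol in a result table (absent from the table, or present only with NA/"").
unmapped <- function(query, result, acc_col, symbol_col) {
  ok <- !is.na(result[[symbol_col]]) & result[[symbol_col]] != ""
  setdiff(query, result[[acc_col]][ok])
}

query <- c("Q7TNF6", "Q53XJ8", "P05787")

# Toy stand-ins for the two result tables; the symbols are placeholders,
# NOT what org.Hs.eg.db or biomaRt actually return for these accessions.
toy_orgdb <- data.frame(UNIPROT = query,
                        SYMBOL = c(NA, "SYMBOL_A", "SYMBOL_B"),
                        stringsAsFactors = FALSE)
toy_biomart <- data.frame(uniprot_swissprot_accession = "P05787",
                          hgnc_symbol = "SYMBOL_B",
                          stringsAsFactors = FALSE)

missing_orgdb <- unmapped(query, toy_orgdb, "UNIPROT", "SYMBOL")
missing_biomart <- unmapped(query, toy_biomart,
                            "uniprot_swissprot_accession", "hgnc_symbol")

# Accessions that fail in biomaRt but not in org.Hs.eg.db, and vice versa
only_biomart_fails <- setdiff(missing_biomart, missing_orgdb)
only_orgdb_fails <- setdiff(missing_orgdb, missing_biomart)
print(only_biomart_fails)
print(only_orgdb_fails)
```

With the real tables, the same calls would be `unmapped(keys, annotation.col1, "UNIPROT", "SYMBOL")` and `unmapped(keys, annotation, "uniprot_swissprot_accession", "hgnc_symbol")`, where `keys` is the vector of accessions queried.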
Thanks a lot