Using Bioconductor Biomart to find gene name for sseqid from blastn results
8.1 years ago

Hi All

I need to find a gene name using the sseqid from blastn results, for example:

gi|19698730|gb|AC079789.7|

I want to use either the gi or the gb ID to find the gene name. I have parsed my data into an R data frame in which sseqid1 contains the gi and sseqid2 contains the gb.
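For illustration, that parsing step might look roughly like the sketch below. It assumes every sseqid follows the pipe-delimited layout shown above; `parse_sseqid` is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: split a pipe-delimited sseqid such as
# "gi|19698730|gb|AC079789.7|" into its gi and gb fields.
parse_sseqid <- function(sseqid) {
  parts <- strsplit(sseqid, "|", fixed = TRUE)
  data.frame(
    sseqid1 = vapply(parts, function(p) p[2], character(1)),  # gi field
    sseqid2 = vapply(parts, function(p) p[4], character(1)),  # gb field
    stringsAsFactors = FALSE
  )
}

ids <- parse_sseqid("gi|19698730|gb|AC079789.7|")
print(ids)
```

Ids that do not have four fields would come back as NA here, so it may be worth checking for NA before using the columns as lookup keys.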

This is my R code:

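library(biomaRt)

# Connect to the Ensembl guinea pig gene dataset
ensembl <- useMart('ENSEMBL_MART_ENSEMBL',
                   dataset = "cporcellus_gene_ensembl")
# gi values parsed from the blastn sseqid column
keys <- as.character(res$sseqid1)
# This is the statement that fails
res$genename <- getBM(
    attributes = c('external_gene_name', 'protein_id',
                   'refseq_mrna_predicted', 'entrezgene'),
    values = keys, mart = ensembl)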

The last statement results in an error because of the biomaRt result set. My guess is that it returns more than one result per gi, but I can't work out what biomaRt is doing, why it is doing it, or what I need to do to fix it.
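One likely part of the problem is that getBM() returns a whole data frame (one column per attribute, and not necessarily one row per key), so it cannot be assigned directly to a single column of res when the row counts differ. A small sketch of one way to attach such a result follows. It uses toy data in place of a real query: `bm` and its `id` column are hypothetical stand-ins, and which BioMart filter (if any) corresponds to the gi or gb value is not settled here.

```r
# Toy stand-ins (hypothetical data): 'res' mimics the parsed BLAST hits,
# 'bm' mimics a lookup table of the kind a getBM() call returns.
res <- data.frame(sseqid1 = c("111", "222", "333"),
                  stringsAsFactors = FALSE)
bm <- data.frame(
  id = c("111", "111", "333"),
  external_gene_name = c("GeneA", "GeneB", "GeneC"),
  stringsAsFactors = FALSE
)

# Collapse multiple gene names per id into one string,
# so that every id occupies exactly one row.
bm_one <- aggregate(external_gene_name ~ id, data = bm,
                    FUN = function(x) paste(unique(x), collapse = ";"))

# Look up each hit by id; ids without a match get NA.
res$genename <- bm_one$external_gene_name[match(res$sseqid1, bm_one$id)]
print(res)
```

Collapsing with ";" is only one choice. Keeping the first match, or using merge() and accepting several rows per hit, would work as well, depending on what the downstream analysis needs.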

What I want is a gene name for each sseqid. I don't care whether the gi or the gb is used. I can't find exact definitions for the abbreviations gi and gb, and I haven't been able to find out which of the biomaRt attributes they relate to. If this information is in the documentation, I have not been able to find it.

Is there anyone who can help me get this done?

Thanks in advance. Jannetta

blastn bioconductor RNA-Seq biomart • 1.5k views