How To Convert RefSeq ID To Gene Symbol For Non-Coding RNAs
13.2 years ago
Hamilton ▴ 290

Hi,

I'm trying to generate a full gene annotation table that maps RefSeq IDs to gene symbols, gene descriptions and other gene IDs (UCSC known gene ID, Entrez ID, Ensembl ID), for ncRNAs as well as protein-coding genes. The kgXref table seems to provide such annotations only for protein-coding transcripts with the NM_ prefix, not for NR_ accessions.

I would like to get rows like this one:

ucsc known id, entrezid, ensembl id, NR_045294 (RefSeq ID), Gm4285 (gene symbol), Mus musculus predicted gene 4285, non-coding RNA (gene description)

Any thoughts?

non rna annotation database • 8.0k views
13.2 years ago

You could use the following XSLT stylesheet:

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="text"/>

<xsl:template match="/">
<xsl:apply-templates select="Bioseq-set/Bioseq-set_seq-set/Seq-entry"/>
</xsl:template>

<xsl:template match="Seq-entry">
<xsl:value-of select="Seq-entry_seq/Bioseq/Bioseq_id/Seq-id/Seq-id_other/Textseq-id/Textseq-id_accession"/>
<xsl:text>  </xsl:text>
<xsl:for-each select="Seq-entry_seq/Bioseq/Bioseq_annot/Seq-annot/Seq-annot_data/Seq-annot_data_ftable/Seq-feat/Seq-feat_dbxref/Dbtag[Dbtag_db='GeneID']">

<xsl:variable name="geneid" select="Dbtag_tag/Object-id/Object-id_id"/>

<xsl:variable name="url" select="concat('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=gene&amp;retmode=xml&amp;id=',$geneid)"/>

<xsl:value-of select="document($url)/Entrezgene-Set/Entrezgene/Entrezgene_gene/Gene-ref/Gene-ref_locus"/>
<xsl:text>
</xsl:text>
</xsl:for-each>
</xsl:template>

</xsl:stylesheet>

with NCBI efetch/nucleotide:

xsltproc --novalid stylesheet.xsl "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&rettype=db&retmode=xml&id=NR_045294" 
NR_045294   Gm4285
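
If xsltproc is not available, the second half of the stylesheet (looking up the symbol for a GeneID) could be sketched in Python along these lines. This is only an illustration: `efetch_gene_url` and `extract_symbol` are hypothetical helper names, the gene ID `12345` is a placeholder, and `SAMPLE` is a hand-trimmed fragment that mimics the element path used in the stylesheet rather than real efetch output. The URL uses https, which is an assumption about what the NCBI servers currently expect.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Same endpoint as in the stylesheet, but over https (assumption).
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_gene_url(gene_id):
    """Build the efetch URL for one Entrez Gene record (hypothetical helper)."""
    query = urlencode({"db": "gene", "retmode": "xml", "id": gene_id})
    return EFETCH + "?" + query


def extract_symbol(xml_text):
    """Return the text of Gene-ref_locus, or None if the element is absent."""
    root = ET.fromstring(xml_text)  # root element is Entrezgene-Set
    node = root.find("Entrezgene/Entrezgene_gene/Gene-ref/Gene-ref_locus")
    return node.text if node is not None else None


# Hand-trimmed illustration of the XML shape, not real efetch output.
SAMPLE = """<Entrezgene-Set>
  <Entrezgene>
    <Entrezgene_gene>
      <Gene-ref>
        <Gene-ref_locus>Gm4285</Gene-ref_locus>
      </Gene-ref>
    </Entrezgene_gene>
  </Entrezgene>
</Entrezgene-Set>"""

if __name__ == "__main__":
    print(efetch_gene_url("12345"))  # placeholder gene ID
    print(extract_symbol(SAMPLE))
```

To run it against the live service you would download the document at `efetch_gene_url(...)` (for example with `urllib.request.urlopen`) and pass the response text to `extract_symbol`.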
13.2 years ago
Curiosity ▴ 130

You can download all these annotations from Ensembl BioMart: http://asia.ensembl.org/biomart/martview/67bf1defc1d2b3e205f0fd5f4506f849

Then keep only the lines that contain an NR_ accession (quote the pattern so the shell does not expand it as a glob):

grep 'NR_' filename
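
As an alternative to grep, the filtering could be sketched in Python so that the prefix is matched only in the accession column rather than anywhere on the line. This assumes a tab-separated BioMart export with a header row. The column name `RefSeq ncRNA ID`, the helper name `ncrna_rows` and the Ensembl-style IDs in `SAMPLE` are all placeholders for illustration, so adjust them to whatever your export actually contains.

```python
import csv
import io


def ncrna_rows(handle, column="RefSeq ncRNA ID"):
    """Yield rows whose accession column starts with the NR_ prefix.

    `column` is an assumed header name; change it to match your export.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for row in reader:
        if (row.get(column) or "").startswith("NR_"):
            yield row


# Made-up example table; the gene IDs are placeholders, not real Ensembl IDs.
SAMPLE = (
    "Gene stable ID\tGene name\tRefSeq ncRNA ID\n"
    "EXAMPLE_GENE_1\tGm4285\tNR_045294\n"
    "EXAMPLE_GENE_2\tSomeCodingGene\t\n"
)

if __name__ == "__main__":
    for row in ncrna_rows(io.StringIO(SAMPLE)):
        print(row["RefSeq ncRNA ID"], row["Gene name"], sep="\t")
```

For a real file, replace `io.StringIO(SAMPLE)` with `open("filename", newline="")`.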