Getting Protein Information From mRNA RefSeq ID
12.2 years ago

Hello,

I would like to create an SVG similar to the figure below, which is from COSMIC.
I would need the length of the protein and the start and end positions of the related protein domains. My input would be the RefSeq mRNA ID.
So I would like to know whether it is feasible to get the protein-related information through the NCBI E-utilities or any other source such as UCSC. Thanks in advance for any advice.

KRAS protein representation from COSMIC website

12.2 years ago

I don't know what kind of details you need, but last year I wrote the following XSLT stylesheet:

https://github.com/lindenb/xslt-sandbox/blob/master/stylesheets/bio/ncbi/gb2svg.xsl

It converts a GenBank XML (INSDSet) file to SVG:

xsltproc gb2svg.xsl protein.xml > protein.svg

Pierre


Hi Pierre, thanks for your answer. Actually, I am planning to build the SVG with the Highcharts library, but what I am really looking for is the protein-related information starting from the mRNA RefSeq ID, maybe through the E-utilities. So any example URL would be helpful, even though I am looking into it myself in the meantime.


mRNA -> protein? Use NCBI ELink.

12.2 years ago

So after looking into the NCBI E-utilities, I was able to get the protein information I was looking for using 3 of the 8 E-utilities:

ESearch (to get the mRNA uid from the mRNA accession):

this url : http://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi?db=nucleotide&term=NM_004985

gave me an xml output that contains the mRNA uid 34485723
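
For anyone scripting this step, here is a minimal Python sketch of the ESearch call. The helper names (`esearch_url`, `parse_esearch_ids`) are made up for illustration, and it assumes the response has the usual `eSearchResult/IdList/Id` layout and that the `https` form of the E-utilities base URL is used.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumption: same endpoint as in the post, but over https.
EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(accession, db="nucleotide"):
    """Build the ESearch URL for an accession (hypothetical helper)."""
    query = urllib.parse.urlencode({"db": db, "term": accession})
    return f"{EUTILS}/esearch.fcgi?{query}"


def parse_esearch_ids(xml_text):
    """Return the uids listed under <IdList> in an ESearch XML response."""
    root = ET.fromstring(xml_text)
    return [id_elem.text for id_elem in root.findall("./IdList/Id")]


# Usage (needs network access, so it is left commented out):
# import urllib.request
# with urllib.request.urlopen(esearch_url("NM_004985")) as response:
#     print(parse_esearch_ids(response.read()))
```

A search term can match more than one record, so the function returns a list rather than a single uid.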

ELink (to get the protein uid from the mRNA uid):

this url : http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=nucleotide&db=protein&id=34485723

gave me an xml output that contains the protein uid 15718761
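
The ELink response can be parsed the same way. This is only a sketch: `parse_elink_ids` is a hypothetical helper, and it assumes the usual `LinkSet/LinkSetDb/Link/Id` layout. An ELink response may contain several `LinkSetDb` blocks, so the sketch keeps only those whose `DbTo` matches the target database and drops duplicate uids.

```python
import xml.etree.ElementTree as ET


def parse_elink_ids(xml_text, db_to="protein"):
    """Collect linked uids from an ELink XML response (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    ids = []
    for linkset_db in root.iter("LinkSetDb"):
        # Skip link sets that point to another database.
        if linkset_db.findtext("DbTo") != db_to:
            continue
        for id_elem in linkset_db.findall("./Link/Id"):
            if id_elem.text not in ids:  # keep order, drop duplicates
                ids.append(id_elem.text)
    return ids
```

Only the `Link/Id` elements are read, so the source uid echoed back in the response's own `IdList` is not mistaken for a protein uid.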

EFetch (to get the protein information I was looking for from the protein uid):

this url : http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=protein&id=15718761&rettype=ft

gave me the following text with the information I was looking for :

>Feature ref|NP_004976.2|
1	188	Protein
			product	GTPase KRas isoform b precursor
			product	Kirsten rat sarcoma-2 viral (v-Ki-ras2) oncogene homolog
			product	v-Ki-ras2 Kirsten rat sarcoma 2 viral oncogene homolog
			product	transforming protein p21
			product	c-Kirsten-ras protein
			product	K-ras p21 protein
			product	oncogene KRAS2
			product	PR310 c-K-ras oncogene
			product	cellular c-Ki-ras2 proto-oncogene
			product	GTPase KRas
			product	K-Ras 2
			product	c-Ki-ras
2	185	mat_peptide
			product	GTPase KRas isoform b
3	164	Region
			region_name	H_N_K_Ras_like
			note	Ras GTPase family containing H-Ras,N-Ras and K-Ras4A/4B
			db_xref	CDD:133338
10	17	Site
			site_type	nitrosylation
			note	G1 box
			db_xref	CDD:133338
11	12	Site
59	60
			order
			site_type	nitrosylation
			note	putative GDI interaction site [polypeptide binding]
			db_xref	CDD:133338
12	18	Site
28	30
32	32
35	35
60	60
116	117
119	120
145	146
			order
			site_type	nitrosylation
			note	GTP/Mg2+ binding site [chemical binding]
			db_xref	CDD:133338
17	18	Site
30	32
34	34
37	37
40	41
54	55
57	57
59	61
63	65
67	67
69	71
73	73
102	103
147	147
			order
			site_type	nitrosylation
			note	GEF interaction site [polypeptide binding]
			db_xref	CDD:133338
25	25	Site
37	41
			order
			site_type	nitrosylation
			note	effector interaction site
			db_xref	CDD:133338
33	40	Site
			site_type	nitrosylation
			note	Switch I region
			db_xref	CDD:133338
35	35	Site
			site_type	nitrosylation
			note	G2 box
			db_xref	CDD:133338
57	60	Site
			site_type	nitrosylation
			note	G3 box
			db_xref	CDD:133338
59	77	Site
			site_type	nitrosylation
			note	Switch II region
			db_xref	CDD:133338
116	119	Site
			site_type	nitrosylation
			note	G4 box
			db_xref	CDD:133338
145	147	Site
			site_type	nitrosylation
			note	G5 box
			db_xref	CDD:133338

After that, it is just a matter of parsing and SVG creation. Thanks, Pierre, for your input.
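
In case it helps someone, here is a rough Python sketch of the parsing step. It assumes the raw EFetch response is tab-delimited, as in NCBI's feature table format: feature lines are `start<TAB>stop<TAB>key`, extra interval lines are `start<TAB>stop`, and qualifier lines start with tabs. The tabs may not show up in the text as pasted above. The function names and the dictionary layout are my own choices, not part of any NCBI library, and taking the end of the `Protein` feature as the protein length is an assumption that holds for this record.

```python
def parse_feature_table(text):
    """Parse a tab-delimited NCBI feature table into a list of dicts."""
    features = []
    for line in text.splitlines():
        if not line.strip() or line.startswith(">"):
            continue  # skip blank lines and the ">Feature ..." header
        cols = line.split("\t")
        if cols[0] == "":
            # Qualifier line: leading tabs, then key and optional value.
            fields = [c for c in cols if c]
            if features and fields:
                value = fields[1] if len(fields) > 1 else ""
                features[-1]["qualifiers"].setdefault(fields[0], []).append(value)
            continue
        # "<" and ">" mark partial locations; strip them before converting.
        start = int(cols[0].lstrip("<>"))
        stop = int(cols[1].lstrip("<>"))
        if len(cols) > 2 and cols[2]:
            features.append(
                {"type": cols[2], "intervals": [(start, stop)], "qualifiers": {}}
            )
        elif features:
            # Extra interval belonging to the previous multi-interval feature.
            features[-1]["intervals"].append((start, stop))
    return features


def summarize(features):
    """Return (protein length, [(region name, start, stop), ...])."""
    length = max(
        (stop for f in features if f["type"] == "Protein"
         for _, stop in f["intervals"]),
        default=None,
    )
    regions = [
        (f["qualifiers"].get("region_name", [""])[0], *f["intervals"][0])
        for f in features
        if f["type"] == "Region"
    ]
    return length, regions
```

The `Site` features are kept with all their intervals, so they can be drawn as separate marks on the protein if needed.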
