Question

How Does One Download An XML-Formatted List Of Cited Articles From PubMed?

1

12.4 years ago

Burke ▴ 290

I would like to analyze some metadata about a publication, and I have a Perl script that parses PubMed XML files. However, I do not see a way to download the "cited by" list as XML. Is there a way to do this?

xml pubmed • 5.4k views

updated 12.4 years ago by Pierre Lindenbaum 164k • written 12.4 years ago by Burke ▴ 290

score 4 · Answer 1 · 2012-08-29

Use NCBI ELink. For example, for pmid:19755503, request http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?retmode=xml&dbfrom=pubmed&id=19755503&cmd=neighbor and look for the LinkSetDb node whose LinkName is 'pubmed_pubmed_citedin':

        </Link>
        <Link>
            <Id>21591145</Id>
        </Link>
    </LinkSetDb>
    <LinkSetDb>
        <DbTo>pubmed</DbTo>
        <LinkName>pubmed_pubmed_citedin</LinkName>
        <Link>
            <Id>22644393</Id>
        </Link>
        <Link>
            <Id>22587672</Id>
        </Link>
        <Link>
            <Id>22541597</Id>
        </Link>
        <Link>
            <Id>22438567</Id>
        </Link>
        <Link>
            <Id>22434829</Id>
        </Link>
        <Link>
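
If you would rather pull the IDs out in a script than read the XML by eye, here is a minimal Python sketch. It assumes the response has the LinkSetDb/LinkName/Link/Id layout shown above; the function name cited_in_ids and the embedded SAMPLE string are mine, for illustration only, and are not part of any NCBI tool.

```python
import xml.etree.ElementTree as ET


def cited_in_ids(elink_xml, link_name="pubmed_pubmed_citedin"):
    """Return the PMIDs listed under the LinkSetDb whose LinkName matches.

    Hypothetical helper: it only assumes the element layout seen in the
    ELink excerpt above.
    """
    root = ET.fromstring(elink_xml)
    ids = []
    for linkset_db in root.iter("LinkSetDb"):
        if linkset_db.findtext("LinkName") == link_name:
            for id_node in linkset_db.findall("Link/Id"):
                ids.append((id_node.text or "").strip())
    return ids


# Trimmed-down sample in the same shape as the ELink output (not a full response).
SAMPLE = """
<eLinkResult>
  <LinkSet>
    <DbFrom>pubmed</DbFrom>
    <LinkSetDb>
      <DbTo>pubmed</DbTo>
      <LinkName>pubmed_pubmed</LinkName>
      <Link><Id>19755503</Id></Link>
      <Link><Id>22075991</Id></Link>
    </LinkSetDb>
    <LinkSetDb>
      <DbTo>pubmed</DbTo>
      <LinkName>pubmed_pubmed_citedin</LinkName>
      <Link><Id>22644393</Id></Link>
      <Link><Id>22587672</Id></Link>
    </LinkSetDb>
  </LinkSet>
</eLinkResult>
"""

print(cited_in_ids(SAMPLE))  # ['22644393', '22587672']
```

Filtering on LinkName matters because the same response also carries other link sets (for example pubmed_pubmed, the related articles), so grabbing every Id would mix citing articles with unrelated ones.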

EDIT: the following XSLT stylesheet downloads the PubMed XML record of each citing article and merges them into one document:


<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    version='1.0'
    >
<xsl:output method="xml" encoding="UTF-8"/>

<xsl:template match="/">
<MERGED>
    <xsl:for-each select="//LinkSetDb[LinkName='pubmed_pubmed_citedin']/Link/Id">
        <xsl:variable name="url" select="concat('http://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=pubmed&amp;retmode=xml&amp;id=',.)"/>
        <xsl:message>downloading <xsl:value-of select="$url"/></xsl:message>
        <xsl:copy-of select="document($url)/PubmedArticleSet[1]/PubmedArticle[1]"/>
    </xsl:for-each>
</MERGED>
</xsl:template>


</xsl:stylesheet>

usage:

 xsltproc --novalid stylesheet.xsl "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?retmode=xml&dbfrom=pubmed&id=19755503&cmd=neighbor"  > pubmed_result.xml
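
For anyone who prefers a script to xsltproc, the same download-and-merge idea can be sketched in Python. This is only a sketch under a few assumptions: efetch_url and merge_articles are hypothetical helper names of mine, the https host is used because NCBI now serves E-utilities over HTTPS, and several PMIDs go into one EFetch request as a comma-separated id list instead of one request per ID. Unlike the stylesheet, which copies only the first PubmedArticle of each response, this copies every PubmedArticle it finds.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(pmids):
    """Build one EFetch URL for several PMIDs (comma-separated id list)."""
    params = {"db": "pubmed", "retmode": "xml", "id": ",".join(pmids)}
    return EFETCH + "?" + urlencode(params, safe=",")


def merge_articles(efetch_responses):
    """Copy every PubmedArticle from each EFetch response under one MERGED root."""
    merged = ET.Element("MERGED")
    for xml_text in efetch_responses:
        article_set = ET.fromstring(xml_text)  # root is PubmedArticleSet
        merged.extend(article_set.findall("PubmedArticle"))
    return merged


if __name__ == "__main__":
    # Network access happens only here; the helpers above are pure functions.
    from urllib.request import urlopen

    url = efetch_url(["22644393", "22587672"])
    with urlopen(url) as response:
        merged = merge_articles([response.read().decode("utf-8")])
    ET.ElementTree(merged).write("pubmed_result.xml", encoding="UTF-8", xml_declaration=True)
```

Batching the IDs keeps the number of requests low, which is kinder to the NCBI servers than one request per citing article.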

score 3 · Answer 2 · 2012-08-29

You can try our R package rentrez here: https://github.com/ropensci/rentrez

To install:

library(devtools)  # provides install_github
install_github('ropensci/rentrez')
library(rentrez)

Then call the function entrez_link and keep the returned XML in a variable:

out <- entrez_link(db='pubmed', dbfrom='pubmed', retmode='xml', id=19755503, cmd='neighbor')$file

Get results

<eLinkResult>
  <LinkSet>
    <DbFrom>pubmed</DbFrom>
    <IdList>
      <Id>19755503</Id>
    </IdList>
    <LinkSetDb>
      <DbTo>pubmed</DbTo>
      <LinkName>pubmed_pubmed</LinkName>
      <Link>
        <Id>19755503</Id>
      </Link>
      <Link>
        <Id>22075991</Id>
      </Link>

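Get the IDs with the XML package (library(XML)), where out is the XML document returned by entrez_link. Note that "//Link" matches the links in every LinkSetDb, not only the cited-in set: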

sapply(xpathApply(out, "//Link", xmlValue), as.numeric)