Getting Tab-Delimited PMIDs And Abstracts From PubMed
13.4 years ago
Ryan D ★ 3.4k

I have a list of 4000+ PMIDs from a meta-analysis query for which I'd like to pull abstracts from PubMed. I can get title, description, details, and other items using Batch Entrez (http://www.ncbi.nlm.nih.gov/sites/batchentrez), but I can't find a way to get PMIDs and abstracts one per line. Is there a way to do this that doesn't require me to break up my query?

pubmed meta • 7.4k views
13.4 years ago
Nakao ▴ 200

If your list of PMIDs is in a file, pmids.txt, with one PMID per line, here's a shell one-liner that uses TogoWS:

while read -r i; do printf '%s\t' "$i"; curl -s "http://togows.dbcls.jp/entry/ncbi-pubmed/$i/ab"; done < pmids.txt
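
For anyone who would rather not do this in the shell, here is a minimal Python sketch of the same idea. It assumes the TogoWS URL pattern from the one-liner and that the service returns the abstract as plain text. The names togows_url, http_get, tab_line and main are made up for this example and are not part of any library. Whitespace in the abstract is collapsed so each record stays on one line.

```python
import sys
import urllib.request

# URL pattern taken from the shell one-liner; "ab" is the abstract field.
TOGOWS = "http://togows.dbcls.jp/entry/ncbi-pubmed/{pmid}/{field}"


def togows_url(pmid, field="ab"):
    """Build the TogoWS entry URL for one PMID (hypothetical helper)."""
    return TOGOWS.format(pmid=pmid.strip(), field=field)


def http_get(url):
    """Fetch a URL and return the body as text."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


def tab_line(pmid, fetch=http_get):
    """Return 'PMID<TAB>abstract' with the abstract collapsed to one line."""
    abstract = " ".join(fetch(togows_url(pmid)).split())
    return f"{pmid.strip()}\t{abstract}"


def main(path):
    """Print one tab-delimited line per non-empty line of the PMID file."""
    with open(path) as handle:
        for line in handle:
            if line.strip():
                print(tab_line(line))


# Only runs when a PMID file is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Saved under a file name of your choice, it would be run with pmids.txt as the only argument. The fetch parameter is there so the formatting can be checked without hitting the network.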

I keep a list of searchable fields for Entrez databases at a public Dropbox: http://dl.dropbox.com/u/1304874/entrez/index.html.


You can see the available entry fields at http://togows.dbcls.jp/entry/ncbi-pubmed?fields , and the available data sources at http://togows.dbcls.jp/entry/ . More details: http://togows.dbcls.jp/site/en/rest.html


Good lord, that is an impressive one-liner.

So, in addition to .ab and .pmid, is there a list of all the suffixes and their corresponding fields that can be pulled from Medline or other NCBI databases in this manner?

13.4 years ago

Once you have all your articles, click the "Send to" button, choose File with XML as the format, and save the articles as XML.

Then apply this XSLT-stylesheet ( xsltproc --novalid stylesheet.xsl pubmed_result.txt )

<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text"/>
<xsl:template match="/">
    <xsl:for-each select="/PubmedArticleSet/PubmedArticle">
        <xsl:value-of select="MedlineCitation/PMID"/>
        <xsl:text>&#9;</xsl:text>
        <xsl:value-of select="normalize-space(MedlineCitation/Article/Abstract)"/>
        <xsl:text>
</xsl:text>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
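
If xsltproc is not available, roughly the same transformation can be sketched in Python with the standard library. This assumes the export has the PubmedArticleSet/PubmedArticle/MedlineCitation layout that the stylesheet relies on. The function names pmid_abstract_rows and main are invented for this example. The whitespace collapsing is meant to mimic normalize-space(), and an article without an abstract gives an empty second column.

```python
import sys
import xml.etree.ElementTree as ET


def pmid_abstract_rows(xml_source):
    """Yield (pmid, abstract) for every PubmedArticle in a PubMed XML export.

    xml_source may be a file name or an open file-like object.
    """
    root = ET.parse(xml_source).getroot()
    for article in root.iter("PubmedArticle"):
        pmid = article.findtext("MedlineCitation/PMID", default="")
        abstract = article.find("MedlineCitation/Article/Abstract")
        if abstract is None:
            text = ""
        else:
            # Join all nested text, then collapse runs of whitespace.
            text = " ".join("".join(abstract.itertext()).split())
        yield pmid, text


def main(path):
    """Print one 'PMID<TAB>abstract' line per article."""
    for pmid, text in pmid_abstract_rows(path):
        print(f"{pmid}\t{text}")


# Only runs when an XML file is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Run it with the saved export (pubmed_result.txt in the command above) as the only argument and redirect the output to a file.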

Thanks, Pierre. This worked on the file I had. I had pulled an XML file but was unaware of this xsltproc command. I'd not worked with XML files before.

13.4 years ago
Neilfws 49k

Some parsing is required in cases like this.

If your list of PMIDs is in a file, pmids.txt with one PMID per line, here's a BioRuby solution:

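#!/usr/bin/ruby

require "rubygems" # only needed on ruby1.8
require "bio"

Bio::NCBI.default_email = "me@me.com"

# File.foreach yields one line at a time; String#each no longer exists in Ruby 1.9+
File.foreach("pmids.txt") do |line|
  pmid = line.strip          # drop the trailing newline
  next if pmid.empty?        # skip blank lines
  article = Bio::PubMed.query(pmid)
  medline = Bio::MEDLINE.new(article)
  puts "#{medline.pmid}\t#{medline.ab}"
end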

Neil, this is a quite elegant solution. The more I learn about ruby, the more I am impressed by it.
