I'm a total newbie to text mining, and I'm interested in the following. Say I'm using something like FlyMine, where I get enrichment results for publications, and I end up with a large number of them (over 1000). I have the publication info (title, authors, journal) and the PubMed IDs for all of them. Is there a text-mining tool I can straightforwardly import them into (e.g. by uploading the PubMed IDs) that will give me an idea of their contents?
What do you mean by "an idea of their contents"? Do you mean retrieving the publication abstracts?
Having said that, you could use the abstracts to carry out an association analysis of ontology mentions within them, i.e. count how often pairs of ontology terms are mentioned in the same abstract. That gives you a network (terms as nodes, co-mentions as weighted edges) which you can import into Cytoscape, Gephi, etc., and then apply a smart layout algorithm to it. You will then immediately see clusters forming that give you an impression of associated term mentions.
Something like your latter suggestion sounds like what I'm looking for. How would you do the association analysis? Also, I'm not sure how you would apply a smart layout algorithm?
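To make the association analysis a bit more concrete, here is a minimal Python sketch, not a definitive implementation. It assumes you already have the abstracts as plain text keyed by PubMed ID, and it uses a tiny made-up term list in place of a real ontology; the IDs, abstracts, term list, and function names are all hypothetical placeholders. It counts how many abstracts mention each pair of terms and writes the result as a tab-separated edge table.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical term list standing in for a real ontology vocabulary.
TERMS = ["apoptosis", "cell cycle", "dna repair"]


def find_terms(abstract, terms):
    """Return the set of terms mentioned in one abstract (case-insensitive, whole words)."""
    text = abstract.lower()
    return {
        t for t in terms
        if re.search(r"\b" + re.escape(t.lower()) + r"\b", text)
    }


def cooccurrence_edges(abstracts, terms):
    """Count, for each pair of terms, how many abstracts mention both."""
    counts = Counter()
    for abstract in abstracts.values():
        # Sort so each pair is always stored in the same order.
        found = sorted(find_terms(abstract, terms))
        counts.update(combinations(found, 2))
    return counts


def to_edge_table(counts, min_count=1):
    """Format the pair counts as a tab-separated edge table with a header row."""
    lines = ["source\ttarget\tweight"]
    for (a, b), n in sorted(counts.items()):
        if n >= min_count:
            lines.append(f"{a}\t{b}\t{n}")
    return "\n".join(lines)


# Made-up abstracts with placeholder IDs, for illustration only.
abstracts = {
    "PMID_1": "Apoptosis is coupled to cell cycle arrest.",
    "PMID_2": "DNA repair defects trigger apoptosis.",
    "PMID_3": "Cell cycle checkpoints and apoptosis after damage.",
}

edges = cooccurrence_edges(abstracts, TERMS)
print(to_edge_table(edges))
```

If you save the printed table to a file, it should be importable as a network table in Cytoscape or Gephi (the column names are my choice, so adjust them to whatever the tool expects). The "smart layout" is then just one of the force-directed layouts those tools offer: terms that are frequently co-mentioned get pulled together, which is what makes the clusters visible.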
A while back, I created Pubmed2Wordle (app and repo) as a crude way to go from a PubMed search to a tag cloud (and also an exercise to learn Google App Engine). Just to emphasize, it's incredibly crude (and rather buggy as well)...
Proteinquest looks awesome - thanks for the link. I had no idea it existed.
I've used iHOP before for looking at individual genes, but I'm not sure how I would extend it to custom lists of publications (there might be a way I don't know about).
Hi, I just discovered a useful service from UK PubMed Central that helps you focus on articles with a specific content/biological relationship: http://labs.ukpmc.ac.uk/evf.
Unfortunately, it works only on the open-access subset of PubMed (2 million+ articles), but it seems to search deep into the full-text database.