Is there a way, using command-line EUtils, to identify the publication(s) associated with a particular BioProject, or with a run deposited in the SRA database?
For example, something like this (but this doesn't work):
esearch -db bioproject -query "PRJEB31886" | elink -target pubmed
As far as I know, I have to highlight the title of this BioProject and search Google, PubMed, etc., for a paper with an exact title match. This is cumbersome and hurts my bioinformatically-inclined brain. I am looking for a streamlined, command line-friendly way to retrieve the PMID associated with a BioProject, if one exists.
Thanks!
This specific BioProject does not seem to be linked to any PubMed article. Do you know if the author has published a paper?
vkkodali : Is it the author's responsibility to link a publication to the data, or can NCBI do this automatically by text mining from PubMed, if the article includes the accession number?
If an identifier from BioProject, BioSample, SRA, GEO, etc. is mentioned in the publication, it gets picked up automatically and the inter-database connections are made. That said, an author or user can (and is highly encouraged to) write to the NCBI Helpdesk to notify them that a publication is now out and that a connection between that publication and the data needs to be made.
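When such a connection does exist, a pipeline along the lines of the one in the question should return the linked PMIDs. The following is only a minimal sketch, assuming NCBI EDirect (`esearch`, `elink`, `efetch`) is installed and on the PATH; the function name `bioproject_pmids` and the file name `accessions.txt` are hypothetical. It prints nothing if no link has been made.

```shell
# Hypothetical helper: print the PMIDs linked to a BioProject accession, one per line.
# Assumes NCBI EDirect (esearch, elink, efetch) is installed and on the PATH.
bioproject_pmids() {
    acc="$1"
    # Redirect stdin from /dev/null so esearch does not swallow the input
    # of an enclosing "while read" loop.
    esearch -db bioproject -query "$acc" < /dev/null |
        elink -target pubmed |
        efetch -format uid
}

# Example use over a list of accessions, one per line (accessions.txt is a placeholder):
# while read -r acc; do
#     printf '%s\t%s\n' "$acc" "$(bioproject_pmids "$acc" | paste -sd, -)"
# done < accessions.txt
```

An empty second column in that loop means no BioProject-to-PubMed link is recorded, not necessarily that no paper exists.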
I also learned that NCBI discourages authors from putting their BioProject accession in their manuscript (but they can include the SRA entries). Should I cite BioProject accession numbers in my manuscript?
NCBI often boggles the mind. What if one produces 500 SRA experiments? Should they now list each number separately in the paper?
I perhaps understand the sentiment: they are trying to discourage people from linking to the BioProject alone as the main entry point.
The bioproject ID this thread is talking about has 4 samples and 12 experiments. There can be more than one publication associated with a bioproject ID.
Can someone else do this on behalf of the authors? There is no link between that BioProject PRJEB31886 and their paper. However, the BioProject ID is mentioned in the paper. EDIT: I just realized that there is no XML output available for that BioProject (it says the ID 31886 is not public; I used Istvan's posted solution which works with his example BioProject accession but not mine). So I guess it's some other issue, unrelated to NCBI linking the BioProject to its associated publication.
The data for this particular paper were submitted via ENA (European Nucleotide Archive), perhaps before the paper was accepted. This might be why it is not properly cross-linked in PubMed.
Going through ENA does show the paper (you have to click Show under the Publication tab):
https://www.ebi.ac.uk/ena/browser/view/PRJEB31886
Perhaps there is an automatable query for that.
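One possibility, offered as a sketch rather than a confirmed solution: the Europe PMC REST search service can be queried with the accession as a free-text term, which may find articles that mention it. This assumes `curl` and `jq` are installed; the function names are hypothetical, and the response layout (`resultList.result[].pmid`) should be checked against the Europe PMC documentation. Whether this returns the same record shown in the ENA browser is also an assumption.

```shell
# Hypothetical helper: build a Europe PMC REST search URL for an accession.
europepmc_url() {
    printf 'https://www.ebi.ac.uk/europepmc/webservices/rest/search?query=%s&format=json&resultType=lite\n' "$1"
}

# Hypothetical helper: print the PMIDs of articles matching the accession.
# Assumes curl and jq are installed; hits without a PMID are skipped.
europepmc_pmids() {
    curl -s "$(europepmc_url "$1")" |
        jq -r '.resultList.result[]? | .pmid // empty'
}

# Example:
# europepmc_pmids PRJEB31886
```

A free-text match only shows that the accession string appears in the indexed article, so the hits still deserve a quick manual check.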
I believe it is this paper, with the same title as the project. I found it by searching for the BioProject title in Google. But this is too cumbersome for hundreds (or even tens...) of these. EDIT: It's definitely this paper. They mention the BioProject ID in the paper itself (and they give GenBank accessions, though not SRA accessions).