Unfortunately NCBI does not contain metadata for this project.
This is not true. You can easily download an XML file containing all of the attributes of all the biosamples from NCBI. Since the procedure may also be useful in other contexts, I will describe it step by step.
First go to the page of the project (the bioproject database in NCBI speak):
https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJEB99111
Next, get a list of all biosamples which are linked to this project. There is a section entitled "Related information" on the right side of the page. To get the list of biosamples, click on the hyperlink "Biosample".
This will open a new page which lists the first 20 biosamples in the project. The URL of that page is:
https://www.ncbi.nlm.nih.gov/biosample?LinkName=bioproject_biosample_all&from_uid=400734
At the top of this page (on the right side) is a pull-down menu entitled "Send to:". Click on this menu, then select "File", then select format "Full XML (text)", and finally click on the button "Create File". Store the XML file on your local disk and parse it with your favorite XML tool.
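As one possible way to do the parsing step, here is a minimal Python sketch using only the standard library. It assumes the downloaded file has a `BioSampleSet` root containing `BioSample` elements, each with an `accession` attribute and `Attribute` children carrying an `attribute_name`; check this against your own file. The file name `biosample_result.xml`, the accessions, and the values are hypothetical.

```python
import xml.etree.ElementTree as ET

# Tiny hypothetical stand-in for the downloaded file; the element and
# attribute names are assumptions about the "Full XML" layout.
EXAMPLE_XML = """<BioSampleSet>
  <BioSample accession="SAMX0000001">
    <Attributes>
      <Attribute attribute_name="host">Homo sapiens</Attribute>
      <Attribute attribute_name="collection_date">2015</Attribute>
    </Attributes>
  </BioSample>
  <BioSample accession="SAMX0000002">
    <Attributes>
      <Attribute attribute_name="host">Mus musculus</Attribute>
    </Attributes>
  </BioSample>
</BioSampleSet>"""


def parse_biosamples(root):
    """Return {accession: {attribute_name: value}} for every BioSample."""
    samples = {}
    for sample in root.iter("BioSample"):
        accession = sample.get("accession")
        attrs = {}
        for attr in sample.iter("Attribute"):
            name = attr.get("attribute_name")
            attrs[name] = (attr.text or "").strip()
        samples[accession] = attrs
    return samples


# For a real download, replace the next line with:
#   root = ET.parse("biosample_result.xml").getroot()
root = ET.fromstring(EXAMPLE_XML)
samples = parse_biosamples(root)
for accession, attrs in samples.items():
    print(accession, attrs)
```

A nested dictionary keeps samples with different attribute sets side by side, which makes it easy to reshape later.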
Exquisite solution. However, it only works for the first 3 samples and then the following error code is repeated many times:
Perhaps the site stopped granting us access, thinking that we were not human. I don't know.
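Whether rate limiting is really the cause is a guess. If it is, one possible workaround is to fetch the records through NCBI's E-utilities in small batches, with a pause between requests. The sketch below assumes the `efetch.fcgi` endpoint with `db=biosample` and `retmode=xml`; the batch size, the delay, the helper names, and the IDs are illustrative choices, not values taken from this thread.

```python
import time
import urllib.parse
import urllib.request

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def batched(ids, size):
    """Split a list of IDs into consecutive chunks of at most `size`."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def efetch_url(ids):
    """Build an efetch URL that requests BioSample records as XML."""
    query = urllib.parse.urlencode(
        {"db": "biosample", "id": ",".join(ids), "retmode": "xml"}
    )
    return f"{EFETCH}?{query}"


def fetch_all(ids, size=20, delay=1.0):
    """Download each batch in turn, sleeping between requests."""
    chunks = []
    for batch in batched(ids, size):
        with urllib.request.urlopen(efetch_url(batch)) as response:
            chunks.append(response.read().decode("utf-8"))
        time.sleep(delay)  # pause so requests are spaced out
    return chunks


if __name__ == "__main__":
    # Hypothetical numeric BioSample UIDs; no request is made here,
    # the loop only prints the URLs that fetch_all() would call.
    example_ids = ["1001", "1002", "1003", "1004", "1005"]
    for batch in batched(example_ids, 2):
        print(efetch_url(batch))
```

Calling `fetch_all(ids)` would return one XML string per batch; each can then be parsed separately.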
works on my machine
https://pastebin.com/sq6dzSKX
Astounding! Much appreciated. I wonder why it didn't fully work on my machine...
How can you write the XSLT stylesheet so that sample names are rows and sample attributes are columns, tab-delimited? Example:
use datamash ? https://www.gnu.org/software/datamash/
or something in sqlite: A: formatting problem (awk/bash)
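The same pivot can also be sketched in plain Python, as an alternative to XSLT, datamash, or sqlite. The example assumes the attributes have already been collected into a dictionary keyed by sample name (the sample names and values below are made up); attributes a sample lacks are left as empty cells.

```python
import csv
import io


def to_tsv(samples):
    """Render {sample: {attribute: value}} as tab-delimited text,
    one row per sample and one column per attribute."""
    # Collect every attribute name, keeping first-seen order.
    columns = []
    for attrs in samples.values():
        for name in attrs:
            if name not in columns:
                columns.append(name)

    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample"] + columns)
    for sample, attrs in samples.items():
        writer.writerow([sample] + [attrs.get(c, "") for c in columns])
    return buffer.getvalue()


# Hypothetical input, e.g. attributes extracted from the BioSample XML.
samples = {
    "sample_A": {"host": "Homo sapiens", "collection_date": "2015"},
    "sample_B": {"host": "Mus musculus", "tissue": "liver"},
}
table = to_tsv(samples)
print(table, end="")
```

Building the column list from all samples first matters because not every biosample carries the same attributes.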
I had the same question and came up with a solution using datamash (note that you may have to install it first, for example with Homebrew if you are on a Mac). Try out this code, which builds on the original solution above.
Hello,
I wanted to ask about the last part of the script you have written, "xsltproc transform.xsl".
If I run your whole script, I get an error that transform.xsl is not found: "warning: failed to load external entity "transform.xsl" cannot parse transform.xsl". If I run it step by step, results are produced and I seem to get the correct XML, but the transformer is not working.
I am new to Unix, but I understand the script up to the xsltproc part. When I run "xsltproc -h" there is no option transform.xsl. How does the module work? Does it need to be installed separately?
Thanks, Martina
Did you download the XML/XSLT script above?
No, I hadn't. That works now. Thank you!
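For anyone hitting the same error: transform.xsl is not an option of xsltproc but a stylesheet file, passed as `xsltproc stylesheet.xsl input.xml`, so it has to exist at the path you give. A small Python sketch that checks for both files before calling the tool; the input file name `biosample_result.xml` and the helper name `run_xslt` are hypothetical.

```python
import subprocess
from pathlib import Path


def run_xslt(stylesheet, xml_file):
    """Apply an XSLT stylesheet with xsltproc and return its output."""
    for path in (stylesheet, xml_file):
        if not Path(path).is_file():
            # Fail early with a clearer message than xsltproc's warning.
            raise FileNotFoundError(f"required file not found: {path}")
    result = subprocess.run(
        ["xsltproc", str(stylesheet), str(xml_file)],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    try:
        print(run_xslt("transform.xsl", "biosample_result.xml"))
    except FileNotFoundError as err:
        # Raised if either file is missing or xsltproc is not installed.
        print(err)
```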