Where and how can I fetch phenotype data for tuberculosis WGS samples identified by SRA accessions? I am developing GWAS-based software and want to fetch phenotype data from the web: all the relevant information available for the FASTQ files I have. Is that information available for a single WGS sample, and in what format?
You can get the public summary as an XML document, though when it comes to clinical data, phenotype information may be limited (not publicly available).
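# Download the summary for every record in the project (this will take a while)
esearch -query PRJNA46337 -db sra | efetch -format summary > summary.xml
# Extract the IDs and titles, keep columns 8 and 10 (here the run accession and sample title), and show the last rows
cat summary.xml | xtract -pattern "EXPERIMENT_PACKAGE" -element PRIMARY_ID, TITLE | cut -f 8,10 | tail
This will produce: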
SRR1614838 Human metagenome DNA sample from stool of a female participant in the dbGaP study "The Neonatal Microbiome and NEC"
SRR1614836 Human metagenome DNA sample from stool of a male participant in the dbGaP study "The Neonatal Microbiome and NEC"
SRR1614835 Human metagenome DNA sample from stool of a female participant in the dbGaP study "The Neonatal Microbiome and NEC"
SRR1614833 Human metagenome DNA sample from stool of a male participant in the dbGaP study "The Neonatal Microbiome and NEC"
SRR1614832 Human metagenome DNA sample from stool of a female participant in the dbGaP study "The Neonatal Microbiome and NEC"
SRR1614830 Human metagenome DNA sample from stool of a female participant in the dbGaP study "The Neonatal Microbiome and NEC"
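Titles only go so far; when submitters do provide phenotype-like metadata, it usually sits in the SAMPLE_ATTRIBUTE elements of the same XML as TAG/VALUE pairs. Below is a minimal sketch for listing those pairs per run with plain awk, so it runs without extra tools. The function name `sample_attributes` and the miniature record are made up for illustration, and the code assumes the usual layout (EXPERIMENT_PACKAGE containing SAMPLE_ATTRIBUTES and a RUN with an `accession` attribute). It is not a real XML parser: entities are not decoded, and if a package holds several runs only the last accession is reported.

```shell
#!/bin/sh
# Sketch: print "run <TAB> tag <TAB> value" for each sample attribute found in
# SRA experiment-package XML read from stdin. Assumes the layout described above.
sample_attributes() {
  awk '
    # Split the stream at every "<", so each record is "NAME attrs>text".
    BEGIN { RS = "<"; FS = ">"; OFS = "\t" }
    $1 == "SAMPLE_ATTRIBUTES"  { in_sample = 1 }
    $1 == "/SAMPLE_ATTRIBUTES" { in_sample = 0 }
    # Remember the tag, then store the pair once its value arrives.
    in_sample && $1 == "TAG"   { tag = $2 }
    in_sample && $1 == "VALUE" { n++; tags[n] = tag; vals[n] = $2 }
    # A RUN element (not RUN_SET): pull out the accession attribute.
    $1 ~ /^RUN[^A-Z_]/ && match($1, /accession="[^"]+"/) {
      run = substr($1, RSTART + 11, RLENGTH - 12)
    }
    # The run appears after the sample, so print at the end of each package.
    $1 == "/EXPERIMENT_PACKAGE" {
      for (i = 1; i <= n; i++) print run, tags[i], vals[i]
      n = 0; run = ""
    }
  '
}

# Made-up miniature record (hypothetical accessions and attribute values).
sample_attributes <<'EOF'
<EXPERIMENT_PACKAGE_SET>
  <EXPERIMENT_PACKAGE>
    <SAMPLE accession="SRS0000001">
      <SAMPLE_ATTRIBUTES>
        <SAMPLE_ATTRIBUTE><TAG>isolation_source</TAG><VALUE>sputum</VALUE></SAMPLE_ATTRIBUTE>
        <SAMPLE_ATTRIBUTE><TAG>geo_loc_name</TAG><VALUE>not provided</VALUE></SAMPLE_ATTRIBUTE>
      </SAMPLE_ATTRIBUTES>
    </SAMPLE>
    <RUN_SET>
      <RUN accession="SRR0000001"/>
    </RUN_SET>
  </EXPERIMENT_PACKAGE>
</EXPERIMENT_PACKAGE_SET>
EOF
```

On real data the call would be `sample_attributes < summary.xml`. Which tags appear, if any, depends entirely on what the submitters filled in, so expect the output to vary between projects.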
I don't think SRA requires phenotypic data with all submissions, correct?
bioinform: This will only work if the original submitters have provided this data.
What kind of phenotypes are you looking for? Why only for one organism?
Hello bioinform/vassialk!
We believe that this post does not fit the main topic of this site.
Not a real question
For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.
If you disagree please tell us why in a reply below, we'll be happy to talk about it.
Cheers!
How is fetching phenotype data not bioinformatics? It is one of the most important (yet annoyingly difficult) tasks that one needs to perform when attempting to reproduce results from papers. Alas, the information (if available) is hidden away inside XML files (as my example shows).
In addition, questions should not be closed once an answer has been submitted.
This is a bot question; I have always flagged them as such. It is most likely the vassialk bot again. Look at the similar questions and the post history of this user. I don't want to deal with automated users or invest more effort.
OK, that makes sense and it warrants the closure, but you should state more clearly that you believe this to be a bot.