Question

Where and how to fetch phenotype data of the relevant SRA-identified tuberculosis WGS samples?

8.0 years ago

bioinform ▴ 30

Where and how can I fetch phenotype data for the relevant SRA-identified tuberculosis WGS samples? I am developing GWAS-based software and want to fetch phenotype data from the web: all the available information relevant to the FASTQ files I have. Is that information available for a single WGS sample, and in what format?

sra gwas association software • 2.4k views

ADD COMMENT • link updated 8.0 years ago by Istvan Albert 103k • written 8.0 years ago by bioinform ▴ 30

What kind of phenotypes are you looking for? Why only for one organism?

ADD REPLY • link 8.0 years ago by Michael 56k

Hello bioinform/vassialk!

We believe that this post does not fit the main topic of this site.

Not a real question

For this reason we have closed your question. This allows us to keep the site focused on the topics that the community can help with.

If you disagree please tell us why in a reply below, we'll be happy to talk about it.

Cheers!

ADD REPLY • link 8.0 years ago by Michael 56k

How is fetching phenotype data not bioinformatics? It is one of the most important (yet annoyingly difficult) tasks that one needs to perform when attempting to reproduce results from papers. Alas, the information (if available) is hidden away inside XML files (as my example shows).

In addition, questions should not be closed once an answer has been submitted.

ADD REPLY • link 8.0 years ago by Istvan Albert 103k

This is a bot question, I have always flagged them like that. It is most likely the vassialk bot again. Look at the similar questions, and the post history of this user. I don't want to deal with automatic users or invest more effort.

ADD REPLY • link 8.0 years ago by Michael 56k

OK, that makes sense and it warrants the closure, but you should state more clearly that you believe this to be a bot.

ADD REPLY • link 8.0 years ago by Istvan Albert 103k

score 0 · Answer 1 · 2017-08-07

You can get the public summary as an XML document, though for clinical data the phenotype information may be limited (not publicly available).

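 # This will take a while
 esearch -query PRJNA46337 -db sra | efetch -format summary > summary.xml
 # Extract information: -element prints every PRIMARY_ID and TITLE in a record,
 # so the columns picked by cut (8 and 10 here) may differ for other projects
 cat summary.xml | xtract -pattern "EXPERIMENT_PACKAGE" -element PRIMARY_ID, TITLE | cut -f 8,10 | tail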

will produce:

SRR1614838  Human metagenome DNA sample from stool of a female participant in the dbGaP study "The Neonatal Microbiome and NEC"
SRR1614836  Human metagenome DNA sample from stool of a male participant in the dbGaP study "The Neonatal Microbiome and NEC"
SRR1614835  Human metagenome DNA sample from stool of a female participant in the dbGaP study "The Neonatal Microbiome and NEC"
SRR1614833  Human metagenome DNA sample from stool of a male participant in the dbGaP study "The Neonatal Microbiome and NEC"
SRR1614832  Human metagenome DNA sample from stool of a female participant in the dbGaP study "The Neonatal Microbiome and NEC"
SRR1614830  Human metagenome DNA sample from stool of a female participant in the dbGaP study "The Neonatal Microbiome and NEC"
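
Titles alone rarely carry the phenotype. When submitters provide it, it usually sits in the per-sample `SAMPLE_ATTRIBUTE` tag/value pairs of the same XML. Below is a minimal Python sketch of pulling those pairs out per run. It assumes the file is a single well-formed XML document using the usual SRA element names (`EXPERIMENT_PACKAGE`, `RUN`, `SAMPLE_ATTRIBUTE`, `TAG`, `VALUE`). The function name, the inline record, its accessions, and its attribute values are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical, hand-written record that mimics the layout of SRA XML.
# The accessions and attribute values are invented, not real data.
SAMPLE_XML = """
<EXPERIMENT_PACKAGE_SET>
  <EXPERIMENT_PACKAGE>
    <SAMPLE accession="SRS0000001">
      <SAMPLE_ATTRIBUTES>
        <SAMPLE_ATTRIBUTE><TAG>host_disease</TAG><VALUE>tuberculosis</VALUE></SAMPLE_ATTRIBUTE>
        <SAMPLE_ATTRIBUTE><TAG>isolation_source</TAG><VALUE>sputum</VALUE></SAMPLE_ATTRIBUTE>
      </SAMPLE_ATTRIBUTES>
    </SAMPLE>
    <RUN_SET><RUN accession="SRR0000001"/></RUN_SET>
  </EXPERIMENT_PACKAGE>
  <EXPERIMENT_PACKAGE>
    <SAMPLE accession="SRS0000002"/>
    <RUN_SET><RUN accession="SRR0000002"/></RUN_SET>
  </EXPERIMENT_PACKAGE>
</EXPERIMENT_PACKAGE_SET>
"""


def sample_attributes(xml_text):
    """Map each run accession to a dict of its sample's TAG -> VALUE pairs."""
    root = ET.fromstring(xml_text)
    table = {}
    for package in root.iter("EXPERIMENT_PACKAGE"):
        # Collect every tag/value pair recorded for this package's sample.
        attrs = {}
        for attr in package.iter("SAMPLE_ATTRIBUTE"):
            tag = attr.findtext("TAG")
            if tag:
                attrs[tag] = attr.findtext("VALUE", default="")
        # Attach a copy of the attributes to every run in the package.
        for run in package.iter("RUN"):
            accession = run.get("accession")
            if accession:
                table[accession] = dict(attrs)
    return table


if __name__ == "__main__":
    # Print one tab-separated line per run/attribute pair.
    for accession, attrs in sample_attributes(SAMPLE_XML).items():
        for tag, value in attrs.items():
            print(accession, tag, value, sep="\t")
```

To try it on real data, read `summary.xml` into a string and pass it to `sample_attributes`. Runs whose samples carry no attributes come back with an empty dict, which is a quick way to see how much phenotype information a project actually exposes.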