5.5 years ago
joshua.theisen
I am trying to download the RunInfo Table for a number of entries in the BioProject or SRA databases. I have installed Entrez Direct and have esearch and efetch working, but they don't provide the same information as downloading the RunInfo Table manually.
When I run this code:
esearch -db sra -q 'PRJNA417256' | efetch -format runinfo > PRJNA417256.sra.csv
I get a .csv with the following fields:
- Run
- ReleaseDate
- LoadDate
- spots
- bases
- spots_with_mates
- avgLength
- size_MB
- AssemblyName
- download_path
- Experiment
- LibraryName
- LibraryStrategy
- LibrarySelection
- LibrarySource
- LibraryLayout
- InsertSize
- InsertDev
- Platform
- Model
- SRAStudy
- BioProject
- Study_Pubmed_id
- ProjectID
- Sample
- BioSample
- SampleType
- TaxID
- ScientificName
- SampleName
- g1k_pop_code
- source
- g1k_analysis_group
- Subject_ID
- Sex
- Disease
- Tumor
- Affection_Status
- Analyte_Type
- Histological_Type
- Body_Site
- CenterName
- Submission
- dbgap_study_accession
- Consent
- RunHash
- ReadHash
When I manually download the RunInfo Table I get a .csv with the following fields:
- Run
- Antibody
- Assay Type
- AvgSpotLen
- BioProject
- BioSample
- cell_line
- Center Name
- Consent
- DATASTORE filetype
- DATASTORE provider
- DATASTORE region
- Experiment
- GEO_Accession
- Instrument
- LibraryLayout
- LibrarySelection
- LibrarySource
- MBases
- MBytes
- Organism
- Platform
- ReleaseDate
- sample_acc
- Sample Name
- source_name
- SRA Study
- transfection
- treatment
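To make the difference between the two header rows easy to reproduce, here is a minimal sketch that lists the columns unique to each file. It assumes the manually downloaded table was saved as SraRunTable.txt (a hypothetical file name) and that neither header row contains quoted commas; compare_headers and header_cols are illustrative helper names, not part of Entrez Direct.

```shell
# Print the column names of a CSV header row, one per line, sorted.
# Assumes plain comma-separated names with no quoted commas.
header_cols() {
  head -n 1 "$1" | tr -d '\r' | tr ',' '\n' | LC_ALL=C sort
}

# Show columns found in only one of the two files.
# Unindented names are unique to the first file; tab-indented names
# are unique to the second. Shared columns are suppressed by comm -3.
compare_headers() {
  cmp_dir=$(mktemp -d) || return 1
  header_cols "$1" > "$cmp_dir/first"
  header_cols "$2" > "$cmp_dir/second"
  LC_ALL=C comm -3 "$cmp_dir/first" "$cmp_dir/second"
  rm -rf "$cmp_dir"
}

# Example call (file names as described above):
# compare_headers PRJNA417256.sra.csv SraRunTable.txt
```

In the output, columns such as Antibody or treatment would appear tab-indented, meaning they exist only in the manually downloaded table.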
Is there a way to get the contents of the RunInfo Table from the command line (particularly: Antibody, Assay Type, cell_line, Organism, source_name, transfection, treatment)?
Thanks.