Is there a way to get SRA info (paired library, sample ID, sequencing technology, etc.) from a list of SRA IDs?
6.0 years ago
S AR ▴ 80

I have a list of 1000 SRA IDs and I want to get the information for all of them in table format. I mean the information shown on the SRA run page, as in this example:

https://www.ncbi.nlm.nih.gov/sra/?term=ERR038738

I need:

Read Id: ERR038738
Library:
Name: 2496237
Instrument: Illumina HiSeq 2000
Strategy: WGS
Source: GENOMIC
Layout: PAIRED
Experiment ID: ERX015934
Sample accession: ERS023468
Study accession: ERP000520
Sample: 19744-sc-2011-02-15-1079093

Can anybody help?

sra awk • 3.7k views
6.0 years ago
Sej Modha 5.3k

You can download the SRA metadata in runinfo format, which gives comma-separated tabular output.

esearch -db sra -query ERR038738 | efetch -format runinfo

Parallelized:

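# Fetch the runinfo metadata for one accession and turn commas into tabs
function mymeta {
  esearch -db sra -query "$1" | efetch -format runinfo | tr ',' '\t'
}; export -f mymeta

# Run up to 4 lookups at a time, one accession per line of accessions.txt
cat accessions.txt | parallel -j 4 mymeta {}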

From there, you can awk around as you like. By the way @OP, I already gave you this solution in your previous question on how to bulk-download files; it was part of the command I suggested. Too bad you apparently did not invest the time to understand how that command worked, because you could have solved this question yourself.
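
As one illustration of the "awk around" step, here is a minimal sketch that picks columns by header name from tab-separated runinfo output such as mymeta produces. The function name pick_columns is hypothetical, and the column names in the usage comment (Run, LibraryLayout, Model) are what I recall from runinfo headers, so check them against the first line of your own output.

```bash
# Print selected columns of a tab-separated table, chosen by header name.
# Usage (assumed file name): pick_columns "Run,LibraryLayout,Model" < runinfo.tsv
pick_columns() {
  awk -F'\t' -v want="$1" '
    BEGIN { OFS = "\t"; n = split(want, names, ",") }
    NR == 1 {
      # Map each header name to its column position.
      for (i = 1; i <= NF; i++) col[$i] = i
      # Stop early if a requested column is not in the header.
      for (j = 1; j <= n; j++) {
        if (!(names[j] in col)) {
          print "unknown column: " names[j] > "/dev/stderr"
          exit 1
        }
      }
    }
    {
      # Build the output row from the requested columns, in the requested order.
      line = ""
      for (j = 1; j <= n; j++) {
        sep = (j > 1) ? OFS : ""
        line = line sep $(col[names[j]])
      }
      print line
    }'
}
```

The column map is built from the first line only, so any repeated header lines further down pass through as ordinary rows.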


@ATpoint I tried it, but it did not give me this information. Your command worked on a file I created in Linux, but when I gave it the full list of IDs it started throwing errors. After splitting the list into three parts I was able to download the data, but the output still does not contain the information I am asking for here.


I have now noticed that if I remove the column-cutting command, it gives me the info table. Thanks again, ATpoint, for the help. Can you please explain the command above? What is mymeta?


From reading it, I understand that you defined a function named mymeta. I guess it can be run directly in Linux. Should I turn it into a Python script? Does Python have a built-in esearch/efetch module?

When I tried to run it as is in a bash script, it said:

/bin/bash: mymeta: command not found
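
One likely cause of this error is that the child shell started for each job never saw the function, for example because the script did not define and export mymeta before calling parallel, or because the script was not run with bash. The sketch below shows the mechanism offline: the mymeta body is a stand-in rather than the real esearch/efetch pipeline, and xargs plays the role of GNU parallel as the program that starts fresh bash processes.

```bash
#!/usr/bin/env bash
# Stand-in for the real lookup so this sketch runs offline; in practice the body
# would be the esearch | efetch pipeline from the thread.
mymeta() {
  printf 'metadata for %s\n' "$1"
}

# Export the function so that child bash processes can see it.
export -f mymeta

# xargs starts a fresh bash for each accession, much as GNU parallel starts jobs.
printf '%s\n' ERR038738 | xargs -I{} bash -c 'mymeta "$1"' _ {}
```

Removing the export -f line should make the child bash report that mymeta is not found, which matches the message above. Note that export -f is a bash feature, so the script needs to be run with bash rather than plain sh.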


I tried the following, taken from your previous command:

cat ../MDR.txt | parallel -j 4 "esearch -db sra -query {} | efetch --format runinfo"

It gives me the results, but it repeats the header line for every entry. Is there a way to print the header only once at the start, followed by one row of values per SRA ID?
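
One way to keep a single header is to filter the combined output through awk. This is a minimal sketch, assuming every lookup returns an identical header line; the function name dedupe_header and the output file name in the usage comment are hypothetical.

```bash
# Keep the first non-empty line as the header, then drop blank lines and any
# later line that is identical to that header.
# Usage (assumed file names):
#   cat ../MDR.txt | parallel -j 4 "esearch -db sra -query {} | efetch -format runinfo" \
#     | dedupe_header > runinfo.csv
dedupe_header() {
  awk '
    $0 == ""     { next }                      # skip blank separator lines
    header == "" { header = $0; print; next }  # first real line is the header
    $0 != header { print }                     # data rows pass through unchanged
  '
}
```

Because the comparison is on the whole line, this works the same on comma-separated and tab-separated output. It would not catch a header whose column set differs between records.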
