Obtaining GSM numbers from SRA ids
6.2 years ago
Dirk ▴ 100

I have a large number of SRS accessions for relevant entries in the SRA, and I would like to find the GSM ids that correspond to them. I can look them up manually on the SRA website, but I would really like a way to do this with curl or Python (not R). Is that possible?

RNA-Seq sra • 2.6k views
6.2 years ago
GenoMax 147k

Can you give an example to test?

Take a look at this answer: A: Get a complete GSM-to-SRX/SRR table


Awesome, this looks like the best answer. It's a bit unfortunate that you have to load a ~4GB file, but it's still completely viable!
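If holding the whole table in memory is the concern, the file can be read one row at a time instead. Below is a minimal Python sketch of that idea, not a tested recipe. It assumes the table is tab-separated with a header row, that it has columns named `Accession` and `Alias`, and that GEO-submitted samples carry their GSM id in `Alias`. None of that is confirmed in this thread, so check the header of your copy first. The function name `srs_to_gsm` is made up for illustration.

```python
import csv


def srs_to_gsm(lines, wanted_srs, acc_col="Accession", alias_col="Alias"):
    """Map SRS accessions to GSM ids by streaming a tab-separated table.

    `lines` is any iterable of text lines (an open file handle works),
    so only one row is held in memory at a time.
    ASSUMPTION: the column names and the "GSM id lives in Alias" layout
    are guesses about the accession table, not facts from this thread.
    """
    wanted = set(wanted_srs)
    found = {}
    reader = csv.DictReader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        acc = row.get(acc_col) or ""
        alias = row.get(alias_col) or ""
        # Keep only rows for requested samples whose alias looks like a GSM id.
        if acc in wanted and alias.startswith("GSM"):
            found[acc] = alias
    return found
```

To use it, open the downloaded file with `open(path, newline="")` and pass the handle as `lines` together with your list of SRS ids. Any SRS id missing from the returned dict had no GSM-looking alias under these assumptions.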

6.0 years ago
vkkodali_ncbi ★ 3.8k

If you don't want to download the entire SRA_Accessions.tab file, you can use Entrez Direct for this:

esearch -db sra -q 'SRS213308' | elink -db sra -target gds -name sra_gds | esummary | xtract -pattern DocumentSummary -subset DocumentSummary -if Accession -starts-with 'GSM' -first Accession
GSE30198
GSM747492

Curious why this is picking a GSE accession when you are asking for GSM?


The SRS accession is linked to both the GSM sample accession and the GSE series accession in GEO Datasets. I could not come up with a good xtract command that filters out the GSE hit, but it can easily be dropped using Unix grep.
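Since the original question asked for Python, the same filtering step could also be done there rather than with grep. This is only a small sketch: `keep_gsm` is a hypothetical helper name, and the sample text is simply the two lines of output shown in the answer above.

```python
def keep_gsm(accessions):
    """Return only the accessions that start with 'GSM', in input order."""
    cleaned = (a.strip() for a in accessions)
    return [a for a in cleaned if a.startswith("GSM")]


# The two lines printed by the Entrez Direct command in the answer above.
edirect_output = "GSE30198\nGSM747492\n"
print(keep_gsm(edirect_output.splitlines()))  # prints ['GSM747492']
```

In practice the input would be the captured output of the Entrez Direct pipeline, split into lines, for each SRS accession you query.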
