Question

Downloading fasta sequences for PDB IDs in bulk

0

9.7 years ago

amruta.biotech • 0

Hello,

I have a list of 3000 PDB IDs, for which I need (1) FASTA sequences from PDB and (2) UniProt sequences. Is there a simpler way to download the sequences than doing it manually for each ID?

Thanks in advance!

sequence • 7.9k views

ADD COMMENT • link updated 2.5 years ago by Ram 44k • written 9.7 years ago by amruta.biotech • 0

Ram · Answer 1 · 2015-04-24

1

9.7 years ago

GouthamAtla 12k

Let's say you have a text file, pdb_list.txt, with one PDB ID per line:

2AID
4RLB

You can do something like:

parallel -a pdb_list.txt "curl -o {}.fasta 'http://www.rcsb.org/pdb/files/fasta.txt?structureIdList={}'"

If you do not have parallel:

while IFS= read -r line; do curl -o "${line}.fasta" "http://www.rcsb.org/pdb/files/fasta.txt?structureIdList=${line}"; done < pdb_list.txt
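For a list of a few thousand IDs, a slightly more defensive version of the loop may be worth having: it tolerates blank lines and Windows line endings, skips files that already exist, and records IDs that failed. This is only a sketch. The function names, the output directory, and `failed_ids.txt` are hypothetical, and the URL template is an assumption about where RCSB serves per-entry FASTA files; check it and override `FASTA_URL_TEMPLATE` if it does not match.

```shell
#!/usr/bin/env bash
# Sketch: bulk FASTA download for a list of PDB IDs (one ID per line).

# ASSUMPTION: per-entry FASTA URL pattern; override via the environment if needed.
FASTA_URL_TEMPLATE="${FASTA_URL_TEMPLATE:-https://www.rcsb.org/fasta/entry/%s}"

# Strip whitespace (including carriage returns) and upper-case one ID.
normalize_id() {
    printf '%s' "$1" | tr -d '[:space:]' | tr '[:lower:]' '[:upper:]'
}

# Build the download URL for one ID from the template.
fasta_url() {
    printf "$FASTA_URL_TEMPLATE" "$1"
}

# Download every ID in list_file into out_dir (default: ./fasta).
download_all() {
    list_file="$1"
    out_dir="${2:-fasta}"
    mkdir -p "$out_dir"
    while IFS= read -r line || [ -n "$line" ]; do
        id="$(normalize_id "$line")"
        [ -z "$id" ] && continue                  # skip blank lines
        [ -s "$out_dir/$id.fasta" ] && continue   # skip files already downloaded
        # -f: treat HTTP errors as failures, -sS: quiet but report errors, -L: follow redirects
        if ! curl -fsSL -o "$out_dir/$id.fasta" "$(fasta_url "$id")"; then
            echo "$id" >> "$out_dir/failed_ids.txt"   # remember IDs to retry
            rm -f "$out_dir/$id.fasta"
        fi
        sleep 0.2   # short pause between requests
    done < "$list_file"
}
```

After sourcing the file, `download_all pdb_list.txt fasta` would write one `.fasta` file per ID into `fasta/`. Rerunning the same command only fetches what is still missing. On Windows, a Bash environment such as WSL or Git Bash should be able to run it, provided `curl` is installed.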

Try a similar approach for UniProt.

ADD COMMENT • link 9.7 years ago by GouthamAtla 12k

0

Thanks for your reply,

Yes, I have a text file with 3000 PDB IDs, and I want the 3000 corresponding FASTA files downloaded. Do you mean I need to go to this website (which is not working), http://www.rcsb.org/pdb/files/fasta.txt?structureIdList={}, and enter parallel -a pdb_list.txt curl -o {}.fasta?

ADD REPLY • link updated 2.5 years ago by Ram 44k • written 9.7 years ago by amruta.biotech • 0

0

No. Please run the command from a terminal on Mac or Linux.

ADD REPLY • link 9.7 years ago by GouthamAtla 12k

0

How can I do this on Windows, please?

ADD REPLY • link updated 2.5 years ago by Ram 44k • written 9.7 years ago by amruta.biotech • 0

0

The UniProt facility that I link to below is web-based and therefore works on any platform. The limit on the number of IDs you can submit (in a file) is in the tens of thousands, so you shouldn't have a problem. If you do, contact UniProt help.

ADD REPLY • link updated 2.5 years ago by Ram 44k • written 9.7 years ago by sarahhunter ▴ 600

score 0 · Answer 2 · 2015-04-24

0

9.7 years ago

HG ★ 1.2k

Have a look at this thread:

http://seqanswers.com/forums/showthread.php?p=149802

ADD COMMENT • link 9.7 years ago by HG ★ 1.2k

Ram · Answer 3 · 2015-04-27

0

9.7 years ago

sarahhunter ▴ 600

UniProt provides a bulk download facility through their website:

http://www.uniprot.org/uploadlists/

Through this feature you can get any sequence contained within the UniParc archive, including PDB sequences.
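Before uploading a list, it can help to clean it so that duplicates, stray whitespace, or malformed entries do not end up in the submission. A minimal sketch, assuming one ID per line and the classic four-character PDB ID format (a digit followed by three alphanumeric characters). The function name and file names are hypothetical:

```shell
# Normalize a PDB ID list for upload: strip spaces, tabs and carriage returns,
# upper-case everything, keep only well-formed 4-character IDs, drop duplicates.
clean_pdb_ids() {
    tr -d ' \t\r' < "$1" \
        | tr '[:lower:]' '[:upper:]' \
        | grep -E '^[0-9][A-Z0-9]{3}$' \
        | sort -u
}
```

For example, `clean_pdb_ids pdb_list.txt > pdb_list_clean.txt` would produce a sorted, de-duplicated file to upload. Lines that do not look like PDB IDs are dropped silently, so it is worth comparing line counts before and after.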

ADD COMMENT • link updated 2.5 years ago by Ram 44k • written 9.7 years ago by sarahhunter ▴ 600

0

Thanks,

Can you help me download the PDB FASTA sequences when PDB IDs are submitted?

ADD REPLY • link 9.7 years ago by amruta.biotech • 0