I want to download 28,000 FASTA files from the PDB. I applied some filters on the resolution, length, etc. of the proteins. However, I cannot download them directly from the website, since it only allows batch downloads of 2,500 sequences at a time. How can I do this download programmatically? NOTE that I don't have the PDB IDs; I just queried the PDB using some filters.
Yes, I thought of doing this. But it doesn't even let me download that many IDs. Am I doing something wrong?
I don't know exactly what you are doing, but I just searched for "protein" and got more than 46K hits. When I asked for a display of IDs in tabular format, it said that at most 25K IDs can be downloaded at a time. It split the list in two, as you can see below. Still, I was able to get all of them as two files.
PS: You may need to right-click on the image and open it in a new tab in order to see things properly.
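Once you have the ID files, the rest can be scripted. Here is a minimal Python sketch, not a tested pipeline. It assumes that the ID files are comma- or whitespace-separated text, and that a per-entry FASTA file can be fetched from a URL of the form `https://www.rcsb.org/fasta/entry/<ID>`; please check that pattern against the current RCSB documentation before relying on it. The function names, the file names in the usage note, and the `fasta` output directory are made up for illustration.

```python
import time
import urllib.request
from pathlib import Path

# ASSUMPTION: per-entry FASTA URL pattern; verify against the current RCSB site.
FASTA_URL = "https://www.rcsb.org/fasta/entry/{pdb_id}"


def parse_ids(text):
    """Return unique, upper-cased IDs from comma- or whitespace-separated text,
    keeping the order of first appearance."""
    seen = {}
    for token in text.replace(",", " ").split():
        seen.setdefault(token.upper(), None)
    return list(seen)


def read_ids(paths):
    """Merge several downloaded ID files into one de-duplicated list."""
    return parse_ids(" ".join(Path(p).read_text() for p in paths))


def fasta_url(pdb_id):
    """Build the (assumed) download URL for one entry."""
    return FASTA_URL.format(pdb_id=pdb_id.upper())


def download_all(ids, out_dir="fasta", delay=0.2):
    """Fetch one FASTA file per ID, skipping files that already exist,
    so an interrupted run can simply be restarted."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for pdb_id in ids:
        target = out / f"{pdb_id}.fasta"
        if target.exists():
            continue
        with urllib.request.urlopen(fasta_url(pdb_id)) as response:
            target.write_bytes(response.read())
        time.sleep(delay)  # small pause between requests to be polite to the server
```

You would then call something like `download_all(read_ids(["ids_part1.txt", "ids_part2.txt"]))`, with the two file names replaced by whatever you saved the ID lists as. Fetching one entry per request is slow for 28,000 entries, but it is simple and it is easy to resume.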
Thanks! For some reason, if I ask for a custom report with just PDB IDs, it only lets me download 2,500, but doing it directly as you did works fine :)