Hi,
I have a list of several thousand proteins with their UniProt IDs. I'm looking for an efficient way to cross-reference this list against the PDB and get the subset of proteins that have a tertiary structure deposited there.
I tried running BLASTP with the list of UniProt IDs as the query against the PDB database on the NCBI BLAST portal, but I got too many errors of the form "Error: Failed to read the Blast query: Sequence ID not found", and filtering out the failures by hand is neither convenient nor efficient.
Any ideas?
Thank you!
Use the UniProt ID converter (the ID mapping tool) to map your UniProt IDs to PDB IDs. That tells you how many of your proteins have an entry in the PDB. From there you can start looking at the ones with a tertiary structure.
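If you would rather script this than paste thousands of IDs into the web form, here is a rough sketch in Python. It assumes the UniProt ID mapping REST service lives at https://rest.uniprot.org/idmapping with `run`, `status` and `stream` endpoints, that the database names are `UniProtKB_AC-ID` and `PDB`, and that the result rows carry `from` and `to` fields; please check all of these against the current UniProt documentation before relying on them. The function names (`fetch_pdb_pairs`, `group_pdb_hits`, `split_by_structure`) are made up for this example.

```python
import json
import time
import urllib.parse
import urllib.request
from collections import defaultdict

# Assumed base URL of the UniProt ID mapping REST service; verify in the docs.
API = "https://rest.uniprot.org/idmapping"


def fetch_pdb_pairs(uniprot_ids, poll_seconds=3):
    """Submit a UniProt -> PDB mapping job and return (uniprot_id, pdb_id) pairs.

    Needs network access. Endpoint paths, database names and JSON field
    names are assumptions about the service, not guaranteed.
    """
    form = urllib.parse.urlencode({
        "from": "UniProtKB_AC-ID",  # assumed name of the source database
        "to": "PDB",                # assumed name of the target database
        "ids": ",".join(uniprot_ids),
    }).encode()
    with urllib.request.urlopen(f"{API}/run", data=form) as resp:
        job_id = json.load(resp)["jobId"]

    # Poll until the job is no longer queued or running.
    while True:
        with urllib.request.urlopen(f"{API}/status/{job_id}") as resp:
            status = json.load(resp)
        state = status.get("jobStatus")
        if state in ("NEW", "RUNNING"):
            time.sleep(poll_seconds)
        elif state in (None, "FINISHED"):
            break
        else:
            raise RuntimeError(f"ID mapping job ended with status {state}")

    # Download all mapped pairs in one response.
    with urllib.request.urlopen(f"{API}/stream/{job_id}") as resp:
        payload = json.load(resp)
    return [(row["from"], row["to"]) for row in payload.get("results", [])]


def group_pdb_hits(pairs):
    """Collapse (uniprot_id, pdb_id) pairs into {uniprot_id: sorted PDB IDs}."""
    hits = defaultdict(set)
    for uniprot_id, pdb_id in pairs:
        hits[uniprot_id].add(pdb_id)
    return {acc: sorted(pdbs) for acc, pdbs in hits.items()}


def split_by_structure(uniprot_ids, hits):
    """Split the input list into IDs with and without a PDB entry, keeping order."""
    with_structure = [acc for acc in uniprot_ids if acc in hits]
    without_structure = [acc for acc in uniprot_ids if acc not in hits]
    return with_structure, without_structure
```

To use it, read your IDs into a list, call `fetch_pdb_pairs(ids)`, pass the result to `group_pdb_hits`, and then call `split_by_structure(ids, hits)` to get the proteins that have at least one PDB entry. If you have already downloaded the mapping table from the web tool, you can skip the network step and feed its two columns straight into `group_pdb_hits`.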
There are several similar posts on Biostars.
See the one below, and the related posts in the right panel:
Protein PDB ID