4.5 years ago
nvijay1991
Hi All,
When I download PDB files using Biopython, I end up with files named "pdb...ent.gz". As you all might know, ".gz" usually indicates a gzip-compressed file, but when the program tries to download the files I get the error below.
Code:
pdbl.download_pdb_files(pdb_ids_list, obsolete=False, pdir=None, file_format='pdb', overwrite=False)
Error:
/anno_prod/tools/external/biopython-1.76/Bio/__init__.py:128: BiopythonWarning: You may be importing Biopython from inside the source tree. This is bad practice and might lead to downstream issues. In particular, you might encounter ImportErrors due to missing compiled C extensions. We recommend that you try running your code from outside the source tree. If you are outside the source tree then you have a setup.py file in an unexpected directory: /anno_prod/tools/external/biopython-1.76.
  format(_parent_dir), BiopythonWarning)
Traceback (most recent call last):
  File "/anno_prod/data-db/public/PDB/pdb_update_17052020/pdb_retrievel.py", line 25, in <module>
    pdbl.download_pdb_files(pdb_ids_list, obsolete=False, pdir='/anno_prod/data-db/public/PDB/pdb_update_17052020/latest_pdb_18052020/', file_format='pdb', overwrite=False)
  File "/anno_prod/tools/external/biopython-1.76/Bio/PDB/PDBList.py", line 446, in download_pdb_files
    overwrite=overwrite,
  File "/anno_prod/tools/external/biopython-1.76/Bio/PDB/PDBList.py", line 347, in retrieve_pdb_file
    out.writelines(gz)
  File "/anno_prod/tools/external/anaconda/lib/python2.7/gzip.py", line 463, in readline
    c = self.read(readsize)
  File "/anno_prod/tools/external/anaconda/lib/python2.7/gzip.py", line 267, in read
    self._read(readsize)
  File "/anno_prod/tools/external/anaconda/lib/python2.7/gzip.py", line 302, in _read
    self._read_gzip_header()
  File "/anno_prod/tools/external/anaconda/lib/python2.7/gzip.py", line 196, in _read_gzip_header
    raise IOError, 'Not a gzipped file'
IOError: Not a gzipped file
Thanks,
Vijay N
1) Python 2.7 is deprecated. 2) Did you validate that the compressed PDB file exists?
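One way to check a downloaded file before trying to decompress it is to look at its first two bytes, since every gzip stream starts with the magic number 0x1f 0x8b. The following is only a minimal sketch: the helper name looks_gzipped and the throwaway file names are made up for illustration, and the dummy file's content is invented, not what the PDB server actually returns.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def looks_gzipped(path):
    """Return True if the file at `path` starts with the gzip magic number."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


# Demo on throwaway files: one real gzip archive and one plain-text dummy.
tmp_dir = tempfile.mkdtemp()
real_path = os.path.join(tmp_dir, "real.ent.gz")
fake_path = os.path.join(tmp_dir, "fake.ent.gz")

with gzip.open(real_path, "wb") as handle:
    handle.write(b"HEADER    EXAMPLE\n")
with open(fake_path, "wb") as handle:
    handle.write(b"this is not a gzip stream")  # invented placeholder content

print(looks_gzipped(real_path))  # True
print(looks_gzipped(fake_path))  # False
```

A file that fails this check can be reported or deleted instead of being passed on to the gzip module.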
Thanks, JC! I got the issue solved. It was caused by an extra character in one of my PDB IDs; after removing it, everything works fine. For example, 2ERF had become 2ERFg, and that trailing "g" made the download produce 2ERFg.ent.gz, which is actually a dummy file and triggers the "Not a gzipped file" error. I don't understand why Biopython doesn't perform this basic preliminary check before retrieving the PDB file, given that PDB IDs are restricted to 4 characters.
Thanks for your reply.
-Vijay N
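For anyone hitting the same error, the ID list can be screened before it is handed to download_pdb_files. This is a rough sketch, not part of Biopython: the helper split_pdb_ids and the pattern are hypothetical, and the pattern assumes the classic 4-character ID form (one digit followed by three alphanumeric characters).

```python
import re

# Assumption: classic PDB IDs are one digit followed by three alphanumerics.
PDB_ID_PATTERN = re.compile(r"^[0-9][A-Za-z0-9]{3}\Z")


def split_pdb_ids(candidates):
    """Split candidate IDs into (valid, invalid) lists after trimming whitespace."""
    valid, invalid = [], []
    for raw in candidates:
        pdb_id = raw.strip()
        if PDB_ID_PATTERN.match(pdb_id):
            valid.append(pdb_id)
        else:
            invalid.append(pdb_id)
    return valid, invalid


valid_ids, invalid_ids = split_pdb_ids(["2ERF", "2ERFg", " 1abc\n"])
print(valid_ids)    # ['2ERF', '1abc']
print(invalid_ids)  # ['2ERFg']
```

Only valid_ids would then be passed to pdbl.download_pdb_files(...) as in the original call, while invalid_ids can be logged for manual review.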