8.1 years ago
themantalope
Hi All,
I have a .dat file that follows the Swiss-Prot sequence file format, and I'm trying to read it using Biopython's SeqIO module. However, when I try to extract records from the file, I get the following error:
>>> reqs = list(SeqIO.parse("5UTRaspic.Hum.dat", "swiss"))
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/<usr>/anaconda/lib/python2.7/site-packages/Bio/SeqIO/__init__.py", line 600, in parse
for r in i:
File "/Users/<usr>/anaconda/lib/python2.7/site-packages/Bio/SeqIO/SwissIO.py", line 85, in SwissIterator
for swiss_record in swiss_records:
File "/Users/<usr>/anaconda/lib/python2.7/site-packages/Bio/SwissProt/__init__.py", line 121, in parse
record = _read(handle)
File "/Users/<usr>/anaconda/lib/python2.7/site-packages/Bio/SwissProt/__init__.py", line 165, in _read
_read_id(record, line)
File "/Users/<usr>/anaconda/lib/python2.7/site-packages/Bio/SwissProt/__init__.py", line 278, in _read_id
raise ValueError("ID line has unrecognised format:\n" + line)
ValueError: ID line has unrecognised format:
ID 5HSAA000001; SV 1; linear; mRNA; STD; HUM; 62 BP.
The .dat file I'm using can be found here (human 3'UTR database). From what I can tell, it is formatted properly. Is there any modification I can make to the file so that it adheres to the format Biopython expects?
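One thing worth checking before modifying the file: the ID line in the traceback has the semicolon-separated layout of an EMBL flat-file ID line (accession; SV version; topology; molecule type; data class; division; length in BP) rather than a Swiss-Prot one. The sketch below uses only the standard library to test whether an ID line has that shape. The function name `parse_embl_id_line` and the regular expression are illustrative assumptions, not part of Biopython.

```python
import re

# Hypothetical pattern for an EMBL-style ID line:
# ID <accession>; SV <version>; <topology>; <molecule>; <class>; <division>; <n> BP.
EMBL_ID = re.compile(
    r"^ID\s+(?P<accession>[^;\s]+);\s*SV\s+(?P<version>\d+);\s*"
    r"(?P<topology>[^;]+);\s*(?P<molecule>[^;]+);\s*"
    r"(?P<data_class>[^;]+);\s*(?P<division>[^;]+);\s*"
    r"(?P<length>\d+)\s+BP\.\s*$"
)


def parse_embl_id_line(line):
    """Return the ID line fields as a dict, or None if the line is not EMBL-style."""
    match = EMBL_ID.match(line)
    if match is None:
        return None
    fields = {key: value.strip() for key, value in match.groupdict().items()}
    fields["length"] = int(fields["length"])  # sequence length in base pairs
    return fields


# The ID line reported in the ValueError above.
id_line = "ID 5HSAA000001; SV 1; linear; mRNA; STD; HUM; 62 BP."
print(parse_embl_id_line(id_line))
```

If the line matches, the file may be readable without any changes by asking SeqIO for the EMBL format instead, e.g. `SeqIO.parse("5UTRaspic.Hum.dat", "embl")`. This has not been verified against this particular file, so other lines in the records could still need attention.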