Problem with Parse SeqIO SOLVED - link to cross-reference
7.0 years ago
felipelira3 ▴ 40

Has anybody had this problem before? Any suggestions about the cause?

The script creates the files containing the genome sequences, but the error below appears at the end of the process.

Line in my script:

File "/home/flira/scripts/list_ncbi_download_genome_vs_02.py", line 97, in <module>
    SeqIO.write(SeqIO.parse(genbank_file, "genbank"), genome_file, "fasta")

Traceback that appears:

  File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/__init__.py", line 481, in write
    count = writer_class(fp).write_file(sequences)
  File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/Interfaces.py", line 209, in write_file
    count = self.write_records(records)
  File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/Interfaces.py", line 193, in write_records
    for record in records:
  File "/usr/lib/python2.7/dist-packages/Bio/SeqIO/__init__.py", line 600, in parse
    for r in i:
  File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 478, in parse_records
    record = self.parse(handle, do_features)
  File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 462, in parse
    if self.feed(handle, consumer, do_features):
  File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 434, in feed
    self._feed_feature_table(consumer, self.parse_features(skip=False))
  File "/usr/lib/python2.7/dist-packages/Bio/GenBank/Scanner.py", line 159, in parse_features
    raise ValueError("Premature end of line during features table")

Link to the same issue on stackoverflow.com: https://stackoverflow.com/questions/47792217/seqio-parse-python-premature-end-of-line-during-features-table-solved-answer-i


Cross-posted on Stack Overflow.

We discourage simultaneously cross-posting identical questions on multiple sites.

This duplicates the effort of the answerers (they can't see that a question has already been answered elsewhere).

It also spreads out the answers, which makes it harder for other users to follow the thread.


Sorry about that, but responses here tend to arrive more slowly than on Stack Overflow, so I posted there too. I have added a link in each topic pointing to the other and edited the title to "solved" in both.

7.0 years ago

Philipp Bayer is right - remember to close all the files you open in the script.

This will do the trick:

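from Bio import SeqIO

# GenBank files to convert; each one gets a FASTA file written next to it.
genbank_files = ['GCF_000302915.1_Pav631_1.0_genomic.gbff']
for genbank_file in genbank_files:
    # The with block closes both handles, even if parsing raises an error.
    with open(genbank_file) as fh, open(genbank_file + '.fasta', 'w') as oh:
        for seq_record in SeqIO.parse(fh, 'genbank'):
            oh.write(seq_record.format('fasta'))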
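
If the error persists, a quick sanity check before parsing may help. The sketch below assumes the failure comes from an incomplete input file and relies only on the fact that every GenBank record ends with a `//` line. The function name `ends_with_record_terminator` and the two stand-in files are hypothetical and not part of the original script.

```python
import os
import tempfile


def ends_with_record_terminator(path):
    """Return True if the last non-blank line of the file is the '//' terminator."""
    last = ""
    with open(path) as handle:
        for line in handle:
            if line.strip():
                last = line.strip()
    return last == "//"


# Two stand-in files (hypothetical content, not real GenBank data).
tmp_dir = tempfile.mkdtemp()
complete = os.path.join(tmp_dir, "complete.gbff")
truncated = os.path.join(tmp_dir, "truncated.gbff")
with open(complete, "w") as handle:
    handle.write("LOCUS       demo\nFEATURES             Location/Qualifiers\nORIGIN\n//\n")
with open(truncated, "w") as handle:
    handle.write("LOCUS       demo\nFEATURES             Location/Qualifiers\n")

print(ends_with_record_terminator(complete))
print(ends_with_record_terminator(truncated))
```

This is only a rough check: a multi-record file cut off exactly at a record boundary would still pass, so a positive result does not prove the file is complete.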
7.0 years ago

Normally this should work (and it does on my system). Are you writing to genbank_file earlier in the script? Perhaps you haven't closed that file handle yet, so the data written to the file hasn't been flushed to disk by the time SeqIO.parse reads it?
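
The effect described above can be reproduced without Biopython. The sketch below is a minimal illustration, assuming CPython's default buffered file objects; the function name `chars_visible_to_reader` and the stand-in payload are hypothetical and not taken from the original script.

```python
import os
import tempfile


def chars_visible_to_reader(close_first):
    """Write a small payload, then return (characters a second handle sees, characters written)."""
    # Stand-in text only; this is not a real GenBank record.
    payload = "FEATURES             Location/Qualifiers\n" + "acgt" * 10 + "\n//\n"
    path = os.path.join(tempfile.mkdtemp(), "demo.gbff")

    writer = open(path, "w")
    writer.write(payload)
    if close_first:
        writer.close()  # closing flushes the buffered data to disk

    # A separate handle, like the one a parser would open on the same path.
    with open(path) as reader:
        seen = reader.read()

    if not close_first:
        writer.close()  # closed too late: the reader has already finished
    return len(seen), len(payload)


print(chars_visible_to_reader(close_first=True))
print(chars_visible_to_reader(close_first=False))
```

With the handle closed first, the reader sees the whole payload. With it left open, a small write like this typically still sits in the buffer, so the reader sees a shorter (often empty) file, which is one way a parser can run into a premature end of input.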
