Question

How To Differentiate Files With One Record From Files With Multiple Records?

0

11.3 years ago

Cako Fuentes • 0

I'm working with Biopython, Python, and GTK to create a program that loads files of bioinformatic interest.

These files have multiple sequences in them:

http://biopython.org/DIST/docs/tutorial/examples/ls_orchid.gbk

http://biopython.org/DIST/docs/tutorial/examples/ls_orchid.fasta

but these have only one (long) sequence:

http://biopython.org/SRC/biopython/Tests/GenBank/NC_005816.gb

http://biopython.org/SRC/biopython/Tests/GenBank/NC_005816.fna

Is there any way to know this before processing the file? How can I tell files with one sequence apart from files with multiple sequences? I want to know exactly when to use Bio.SeqIO.read() and when to use Bio.SeqIO.parse().

Thanks for your time. I searched for answers but didn't find anything similar to this.

fasta python genbank biopython • 3.0k views

ADD COMMENT • link updated 2.1 years ago by Ram 45k • written 11.3 years ago by Cako Fuentes • 0

2

Is there any way to know this before processing the file?

You'd have to process it somehow to determine whether the file contains one or multiple sequences. Given this, consider using Bio.SeqIO.parse(), since it handles both cases.

ADD REPLY • link 11.3 years ago by Kenosis ★ 1.3k

score 2 · Answer 1 · 2014-01-16

2

11.3 years ago

Peter 6.0k

If you don't know how many records there are, assume at least one, and use Bio.SeqIO.parse() with a for loop. If the file happens to have only one record, your code will just do the for loop once. Easy :)
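A minimal sketch of that approach, assuming Biopython is installed. The helper name `load_records` and the toy in-memory FASTA data are made up for illustration; in a real program you would pass a filename or open file handle instead.

```python
from io import StringIO

from Bio import SeqIO  # requires Biopython


def load_records(handle, file_format):
    """Return every record in the handle as a list, whether there is one or many.

    `load_records` is a hypothetical helper, not part of Biopython.
    """
    records = []
    for record in SeqIO.parse(handle, file_format):
        records.append(record)
    return records


# Toy in-memory FASTA data, invented for this example.
single = StringIO(">seq1\nACGT\n")
multiple = StringIO(">seq1\nACGT\n>seq2\nGGCC\n")

# The same loop handles both cases; it simply runs once for the first handle.
print(len(load_records(single, "fasta")))
print(len(load_records(multiple, "fasta")))
```

If the program needs to treat the single-record case specially, it can check the length of the returned list after loading.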

ADD COMMENT • link 11.3 years ago by Peter 6.0k

0

Thanks, I'm testing the loading times for different files; I just want to go with the most optimized code.

ADD REPLY • link 11.3 years ago by Cako Fuentes • 0

0

Well, internally Bio.SeqIO.read() calls Bio.SeqIO.parse() anyway, and checks that there was exactly one record.
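To illustrate the idea, here is a rough sketch of an "exactly one record" check layered on top of any record iterator. It is not Biopython's actual source; the name `read_single` and the error messages are hypothetical.

```python
def read_single(records):
    """Return the only item from an iterator, or raise ValueError otherwise.

    Hypothetical helper that mimics the idea of "parse, then check that
    there was exactly one record".
    """
    iterator = iter(records)
    try:
        first = next(iterator)
    except StopIteration:
        raise ValueError("No records found") from None
    try:
        next(iterator)  # try to pull a second record
    except StopIteration:
        return first  # nothing else there: exactly one record
    raise ValueError("More than one record found")
```

With Biopython, this could be called as `read_single(SeqIO.parse("NC_005816.gb", "genbank"))`. Since the parsing itself is done by the same iterator either way, there is little reason to expect read() to load a single-record file faster than a parse() loop.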

ADD REPLY • link 11.3 years ago by Peter 6.0k