I really appreciate the (post-Martel) Biopython parsing abilities of SeqIO.
Question: is the circular vs. linear info from the first line (LOCUS) tossed when parsing a GenBank genome (complete sequence) file?
e.g.
LOCUS NC_001337 11014 bp DNA circular BCT 18-JUL-2008
DEFINITION Methanothermobacter thermautotrophicus Z-245 plasmid pFZ1, complete sequence.
It seems strange, but I can't find it anywhere in the SeqRecord. Not a big deal of course, because I can parse it myself, but I'm curious why I can't find it. Maybe it's not always present?
OMG! I think you could be right! GenBank only stores genomic topology in the LOCUS line! I've just checked with a real sequence. No sign of the LOCUS topology anywhere. It seems that write_first_line uses record.name, too! So if you run your gbfile through SeqIO, you'll lose that information! WOW!
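
In the meantime, parsing it yourself could look something like the sketch below. It uses only the standard library and assumes the topology shows up as a whitespace-separated "circular" or "linear" token on the LOCUS line, as in the example above. The name locus_topology is a made-up helper for illustration, not anything in Biopython.

```python
import io


def locus_topology(handle):
    """Return 'circular', 'linear', or None based on the first LOCUS line.

    Hypothetical helper, not part of Biopython. Assumes the topology is a
    whitespace-separated token somewhere after the locus name.
    """
    for line in handle:
        if line.startswith("LOCUS"):
            # Skip the "LOCUS" keyword and the locus name itself, so a
            # record that happens to be named "linear" is not misread.
            tokens = line.split()[2:]
            for topology in ("circular", "linear"):
                if topology in tokens:
                    return topology
            return None  # LOCUS line found, but no topology token on it
    return None  # no LOCUS line at all


# Try it on the header lines quoted in the question.
example = io.StringIO(
    "LOCUS       NC_001337    11014 bp    DNA    circular BCT 18-JUL-2008\n"
    "DEFINITION  Methanothermobacter thermautotrophicus Z-245 plasmid pFZ1, "
    "complete sequence.\n"
)
print(locus_topology(example))
```

With a real file you would pass an open file handle instead of the StringIO, then keep the result alongside the SeqRecord you get from SeqIO.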