I'm using Biopython to parse a FASTQ file, and I found that the object returned by SeqIO.parse() seems to get cleared once I have accessed it.
from Bio import SeqIO
record_fastqIO = SeqIO.parse('SRR835775_1.first1000.fastq','fastq')
for record in record_fastqIO:
    print(record.id)
This script works perfectly. But if I add one line to the script:
from Bio import SeqIO
record_fastqIO = SeqIO.parse('SRR835775_1.first1000.fastq','fastq')
record_dict = SeqIO.to_dict(record_fastqIO) # this line
for record in record_fastqIO:
    print(record.id)
Nothing is printed, and there is no error. It seems like the object record_fastqIO gets cleared after it is passed to the SeqIO.to_dict() function.
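To narrow this down, here is a minimal sketch that shows the same symptom without Biopython. It assumes that SeqIO.parse() behaves like a one-pass Python iterator; fake_parse is a hypothetical stand-in I made up for illustration, not a Biopython function, and the dict comprehension only imitates what SeqIO.to_dict() would do.

```python
def fake_parse():
    # Hypothetical stand-in for SeqIO.parse(): yields record ids one at a time.
    for record_id in ["read1", "read2", "read3"]:
        yield record_id

records = fake_parse()

# Building a dict walks through every item, as SeqIO.to_dict() presumably does.
as_dict = {record_id: record_id for record_id in records}

# A second pass over the same generator finds nothing left and raises no error.
leftover = list(records)

print(len(as_dict))
print(leftover)
```

If this assumption is right, the object is not cleared as such; it has simply been read to the end.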
The same thing happens in this script:
from Bio import SeqIO
record_fastqIO = SeqIO.parse('SRR835775_1.first1000.fastq','fastq')
def get_phred_range(fastqIO): # to get the max and min quality
    qual_max = []
    qual_min = []
    for record in fastqIO:
        qual_max.append(max(record._per_letter_annotations['phred_quality']))
        qual_min.append(min(record._per_letter_annotations['phred_quality']))
    phred_max = max(qual_max)
    phred_min = min(qual_min)
    return phred_max,phred_min
x,y = get_phred_range(record_fastqIO)
print('x,y:%s,%s' % (x,y))
z,w = get_phred_range(record_fastqIO) # exactly the same as x,y
print('z,w:%s,%s' % (z,w))
This gives me:
x,y:41,2
Traceback (most recent call last):
  File "c:\Users\zincj\Desktop\Untitled-1.py", line 36, in <module>
    z,w = get_phred_range(record_fastqIO)
  File "c:\Users\zincj\Desktop\Untitled-1.py", line 12, in get_phred_range
    phred_max = max(qual_max)
ValueError: max() arg is an empty sequence
So I'm just doing the same thing twice, and the first call goes smoothly, which suggests there is nothing wrong with the function itself. But the second call raises an error because qual_max is empty.
Again, it seems like the SeqIO object record_fastqIO got cleared after I passed it to the function the first time.
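One thing that might work, sketched below under the same assumption that the parser is a one-pass iterator, is to copy the records into a list before using them more than once. FakeRecord and fake_parse are hypothetical stand-ins so the snippet runs without my FASTQ file; the quality values are made up.

```python
from collections import namedtuple

# Hypothetical stand-in for a SeqRecord, carrying only an id and quality scores.
FakeRecord = namedtuple("FakeRecord", ["id", "phred_quality"])

def fake_parse():
    # Hypothetical stand-in for SeqIO.parse(): a one-pass generator.
    yield FakeRecord("read1", [41, 30, 2])
    yield FakeRecord("read2", [38, 25, 10])

def get_phred_range(records):
    # Same logic as my function above, applied to the stand-in records.
    qual_max = []
    qual_min = []
    for record in records:
        qual_max.append(max(record.phred_quality))
        qual_min.append(min(record.phred_quality))
    return max(qual_max), min(qual_min)

# A list can be iterated any number of times, unlike the generator itself.
records = list(fake_parse())

first = get_phred_range(records)
second = get_phred_range(records)
print(first, second)
```

With the real file this would presumably correspond to wrapping the SeqIO.parse() call in list(), at the cost of holding every record in memory, or to calling SeqIO.parse() again before each pass.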
Has anyone run into this before? Or is there something wrong with my script?