Hello, I am reading and writing some FASTQ files, and am getting an unexpected error. The steps my code takes are:
1. Read 3 FASTQ files, end-trim for quality, and write to another 3 FASTQ files. Uses FastqGeneralIterator.
2. Read from the resulting 3 FASTQ files, throw some records away based on some condition, and write to another file. Uses FastqGeneralIterator.
The corresponding code is:
for title, rseq, quals in FastqGeneralIterator(read_handle):
    ...doSomething
At step 2 I get this error:
Traceback (most recent call last):
  File ".../N...py", line 120, in do_filtering
    for title,rseq,quals in FastqGeneralIterator(read_handle) :
  File "/usr/lib/python2.7/site-packages/Bio/SeqIO/QualityIO.py", line 891, in FastqGeneralIterator
    raise ValueError("End of file without quality information.")
ValueError: End of file without quality information.
This error is thrown after the whole file has been parsed, at the last record. That area of the FASTQ file looks like this (the formatting is not coming out properly below, but I have checked it, and it is definitely all right):
@RBXOT:5:11
AGGTATACATGGGTTCT...
+
=?7CC<CCCCC...
@RBXOT:5:12
AGGTATACATGGGTT..
+
>@8CC?CABACE9C<B...
I have checked these records. The sequence string and quality string are the same length. Also, I explicitly write a "\n" at the end of the file when the original file is created. Further, there are 3 files, and the code may fail on a different file each time, but it definitely does fail.
Any idea what could be the matter? Thanks very much.
Since the formatting of the example is messed up, how about posting the FASTQ snippet using gist or something? https://gist.github.com/
Also check the exact contents of the end of the file with hexdump at the command line, and copy and paste that output. That may show up issues with different newlines (Unix vs Windows).
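
If hexdump is not to hand, a rough equivalent can be done from Python itself. The following is only a sketch: the helper names `tail_bytes` and `describe_line_endings` are made up for illustration and are not part of Biopython, and the filename is whatever you pass on the command line. It reads the last few hundred bytes of the file in binary mode, so nothing is translated, and counts the different newline styles.

```python
import sys


def tail_bytes(path, n=200):
    """Return the last n bytes of a file, read in binary mode (hypothetical helper)."""
    with open(path, "rb") as handle:
        handle.seek(0, 2)               # move to the end of the file
        size = handle.tell()
        handle.seek(max(0, size - n))   # step back at most n bytes
        return handle.read()


def describe_line_endings(data):
    """Count Windows (CRLF), Unix (LF only) and bare CR line endings in a byte string."""
    crlf = data.count(b"\r\n")
    lf = data.count(b"\n") - crlf       # LF not preceded by CR
    cr = data.count(b"\r") - crlf       # CR not followed by LF
    return {
        "crlf": crlf,
        "lf": lf,
        "cr": cr,
        "ends_with_newline": data.endswith((b"\n", b"\r")),
    }


if __name__ == "__main__" and len(sys.argv) > 1:
    data = tail_bytes(sys.argv[1])
    print(repr(data))                   # shows \r and \n explicitly
    print(describe_line_endings(data))
```

The `repr` output makes any stray `\r` characters visible, and a final record that stops partway through (for example, no quality line after the `+`) would also show up here.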