10.7 years ago
Clare
I used the SRA Toolkit (sratoolkit.2.3.4-2-mac64) to convert some paired-end read data from .sra to .fastq using: fastq-dump -I --split-files SRR1062090.sra
The default output is supposed to be Q33 (Phred+33) quality scores, but checking the output file I see the quality scores are clearly Illumina 1.5 (Phred+64):
@SRR1062090.1.1 BRISCOE:7:1:1234:2056 length=101
ATTTCTCTCTCTTCAGAAACTTTAAAGGTTTCTGTTGAGGCGAGTGCTTGGATTCTCATANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCAAGGAGTG
+SRR1062090.1.1 BRISCOE:7:1:1234:2056 length=101
cffdfffeffffeffc`edeeefffe`e_c`cccacc`acY`T`YbT`Y^]\]`]]^BBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBBB
I tried adding the option -Q 33, but there was no change. I then tried -Q64, and the quality characters then ran all the way up to ASCII 126 (~):
@SRR1062090.1.1 BRISCOE:7:1:1234:2056 length=101
ATTTCTCTCTCTTCAGAAACTTTAAAGGTTTCTGTTGAGGCGAGTGCTTGGATTCTCATANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNCAAGGAGTG
+SRR1062090.1.1 BRISCOE:7:1:1234:2056 length=101
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~x~s~x~s~x}|{|~||}aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
Other files from the same project produce the same problem.
Does anyone know the cause of this and how I might fix it?
Also, the same command works on data from a different project, i.e. it gives me Q33 there. So it must be an issue with the data from this specific project.
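Comparing the two dumps above hints at what is going on: the -Q64 output looks like the default output with every quality character moved up by 31 (64 - 33) and clipped at ASCII 126. The sketch below checks that on an excerpt of the two quality strings (bases 41 to 58 of read SRR1062090.1.1). The clipping at 126 is only inferred from the output shown here, not taken from the fastq-dump documentation, and the helper name shift_and_cap is made up for illustration.

```python
# Excerpt (bases 41-58) of the two quality strings shown above for
# read SRR1062090.1.1. Only a slice is used to keep the example short.
default_excerpt = "Y`T`YbT`Y^]\\]`]]^B"   # from: fastq-dump -I --split-files
q64_excerpt = "x~s~x~s~x}|{|~||}a"        # from: the same command with -Q64

SHIFT = 64 - 33    # difference between the two ASCII offsets (31)
ASCII_CAP = 126    # '~'; assumed cap, inferred from the -Q64 output above


def shift_and_cap(qual, shift=SHIFT, cap=ASCII_CAP):
    """Move every quality character up by `shift`, clipping at `cap`."""
    return "".join(chr(min(ord(c) + shift, cap)) for c in qual)


predicted = shift_and_cap(default_excerpt)
print("default output shifted by 31:", predicted)
print("matches the -Q64 output:     ", predicted == q64_excerpt)

# Read as Phred+64, the default output gives plausible Illumina 1.5 scores
# ('B' decodes to 2, the lowest value that pipeline emits).
scores = [ord(c) - 64 for c in default_excerpt]
print("Phred+64 score range:", min(scores), "to", max(scores))
```

If the prediction matches, the default dump is most likely the usable one, just Phred+64 encoded, and -Q64 only pushes it further out of range.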
I suspect that the group just screwed up when they made the SRA files. If that's the case, then just use the fastq files from the default (Q33) dump and tell your aligner that they're phred+64, which is what they appear to be. If you really need them converted, then check out this thread: Convert Illumina reads to Sanger score.
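If you do want to convert them yourself, here is a minimal sketch, assuming the files really are phred+64 and each record is exactly four lines (no wrapped sequence or quality lines). The function names are made up for this example and it is not what the linked thread uses; it just subtracts 31 from every quality character.

```python
import io

OFFSET_SHIFT = 64 - 33  # 31: distance between the phred+64 and phred+33 offsets


def phred64_to_phred33(qual):
    """Re-encode one quality string from phred+64 to phred+33."""
    converted = []
    for c in qual:
        new_code = ord(c) - OFFSET_SHIFT
        if new_code < 33:
            # A character this low cannot be valid phred+64 data.
            raise ValueError(f"{c!r} is below the phred+64 range")
        converted.append(chr(new_code))
    return "".join(converted)


def convert_fastq(src, dst):
    """Copy 4-line FASTQ records from src to dst, re-encoding line 4 of each."""
    for i, line in enumerate(src):
        if i % 4 == 3:  # quality line of the record
            line = phred64_to_phred33(line.rstrip("\n")) + "\n"
        dst.write(line)


# Small in-memory demonstration with a shortened, made-up record.
demo_in = io.StringIO("@read1\nATTTCN\n+\ncffd`B\n")
demo_out = io.StringIO()
convert_fastq(demo_in, demo_out)
print(demo_out.getvalue(), end="")
```

For real files, open the input and output with open() and pass the file objects to convert_fastq. Note that the Illumina 1.5 'B' (Q2) becomes '#' after conversion.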