8.3 years ago
roma
I have a set of reads which do not have great Phred scores.
To see how possible base call errors would affect the downstream analysis, I'm thinking about generating a bunch of modified read sets where bases are possibly altered with the probability specified by the Phred score. Does a tool like this already exist? If not, do you think it'd be a good idea to write one?
Most read generators can do what you want. What sort of experiment is this though?
It's a 16S rRNA sequencing experiment performed on Ion Torrent. Could you be a bit more specific about read generators? For instance, I looked at ART's readme but couldn't find this feature.
I'm not familiar with anything specific to Ion Torrent. I presume there's stuff out there, though.
To be clear, I don't expect the tool to be specific to Ion Torrent. All it has to do is to consume a fastq file and mutate it based on the Phred score probabilities.
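For illustration, here is a minimal sketch of what such a tool might do. It is not any existing tool's implementation. It assumes Phred+33 encoded qualities, four-line FASTQ records, and a substitution-only error model in which an erroneous base is replaced by one of the other three bases with equal probability. That last point is a simplification: it does not model insertions or deletions, which matter for Ion Torrent data. The function names are hypothetical.

```python
import random

def phred_to_error_prob(q):
    # Phred definition: Q = -10 * log10(P), so P = 10^(-Q/10)
    return 10 ** (-q / 10)

def mutate_read(seq, qual, rng, offset=33):
    # Assumption: qualities are Phred+33 encoded; only A/C/G/T are mutated,
    # and only by substitution (no indels).
    out = []
    for base, qchar in zip(seq, qual):
        p = phred_to_error_prob(ord(qchar) - offset)
        if base in "ACGT" and rng.random() < p:
            # Replace with one of the three other bases, chosen uniformly
            base = rng.choice([b for b in "ACGT" if b != base])
        out.append(base)
    return "".join(out)

def mutate_fastq(lines, rng):
    # Assumption: strict four-line FASTQ records (no wrapped sequences).
    # Quality strings are passed through unchanged.
    lines = [line.rstrip("\n") for line in lines]
    for i in range(0, len(lines) - 3, 4):
        header, seq, plus, qual = lines[i:i + 4]
        yield from (header, mutate_read(seq, qual, rng), plus, qual)
```

Calling `mutate_fastq` repeatedly with differently seeded `random.Random` instances would give a set of perturbed read sets to push through the downstream analysis.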
Just run FastQC on the fastq file and input the average score profile into Sherman or a similar read simulator then.
I've been considering writing something that does this for a while (actually, modifying RandomReads so that it can accept and use q-scores from real reads instead of generating its own), but I haven't yet. It's important to note, though, that Illumina's quality scores are not very accurate; if you want to see the effect of errors by applying them at the actual Illumina error rate, I recommend recalibrating the quality scores first.
Interesting, thanks. Do you know anything about Ion Torrent quality scores?
Sorry, no. The same method will work for Ion Torrent reads, but I don't have any experience with them...
Hi roma!
Did you find a tool that has this feature, or did you write your own? (I'm looking to do the exact same thing.) :-)
Hi Joseph!
Yes, I wrote my own. You are welcome to use it: https://github.com/feuerbach/reifa