So I have my FASTQ ran through Tophat and produced unmapped and mapped bam files, I would like to increase the quality score for all mapped sequences, and then use TopHat to remap them, What is the easiest and best way of doing it?
So I have my FASTQ ran through Tophat and produced unmapped and mapped bam files, I would like to increase the quality score for all mapped sequences, and then use TopHat to remap them, What is the easiest and best way of doing it?
If you are able to use Julia (http://julialang.org/), you can use OpenGene (https://github.com/OpenGene/OpenGene.jl) to do this easily
Following code is tested:
using OpenGene # no quality can be higher than K const QUAL_LIMIT = 'K' # how much you want to increase it const QUAL_INC = 10 # R1.fq is the file name of your fastq istream = fastq_open("R1.fq") ostream = fastq_open("R1.out.fq","w") # fastq_read will return an object FastqRead {name, sequence, strand, quality} # fastq_write can write a FastqRead into a ouput stream while (fq = fastq_read(istream))!=false qual_arr = [q for q in fq.quality.qual] for i in 1:length(qual_arr) qual_arr[i] = min(qual_arr[i]+QUAL_INC, QUAL_LIMIT) end fq.quality.qual = ASCIIString(qual_arr) fastq_write(ostream, fq) end close(ostream)
You can install Julia by sudo apt-get install julia
if you are running Ubuntu, after you install it, just change the file names in the code, launch julia, paste the code and press ENTER.
Thanks, so your suggesting to first convert it to a "FASTQ" file (I am assuming tools like bedtools) and then manipulate it from there? I dont have Julia but I can use Matlab to do the same thing you are suggesting. My last question would be: you are converting ASCII to integer and add 10 to it ? can I ask why 10 ?
I see. You could use fastq command from samtools to get the fastq reads. You may need to How To Filter Mapped Reads With Samtools before doing the conversion.
I installed julia on my Fedora machine and just paste the code which was mentioned above. I get this error " ERROR: UndefVarError: fastq_open not defined ". So kindly help.
Use of this site constitutes acceptance of our User Agreement and Privacy Policy.
Why do you want to do this? One should never change primary data (or at least not without a good reason).
It is only an experiment, to see how it is gonna affect the result ? can you kind of predict what would be the result ?
If a read has already aligned that alignment would not change by changing the Q-score.
Exactly, only if TopHat is working as expected !