9.2 years ago
colonppg
Folks:
I ran fastq-dump --split-3 and got FASTQ files.
When I run tophat:
tophat-2.0.9.Linux_x86_64/tophat \
-r 150 \
-p 3 \
--solexa1.3-quals \
--fusion-search \
-o ./tmp \
--GTF ./genomes/human/annotation/Homo_sapiens.GRCh37.68.fix.gtf \
./genomes/human/hg19 \
./SRR1265510_1.fastq ./SRR1265510_2.fastq &
I got this error message:
Error running 'prep_reads'
terminate called after throwing an instance of 'int'
I was surprised to find that the .fastq output I got from --split-3 does not have qc format information...
head SRR1265510_1.fastq
@SRR1265510.1 1 length=101
AGGGCATCTCTGGGAAAGGACCTGGGGCTGGTGAGGGGCCCGGAGGAGCCTTTGCCCGCGTGTCAGACTCCATCCCTCCTCTGCCGCCACCGCAGCAGCCC
+SRR1265510.1 1 length=101
@CCFFD?DHHHHGIGIIIIFHI@HICGGGHIDGGDHAEF6@FG@BE1??B;@CCAAC>>B8?&844::(4@ACAABCCBCC:4:@>99<525&&0&2?BB#
@SRR1265510.2 2 length=101
CGCAAGGGCATCTCTGGGAAAGGACCTGGGGCTGGTGACGGGCCCGGAGGAGCCTTTGCCCGCGTGTCAGACTCCATCCCTCCTCTGCCGCCACCGCAGCA
+SRR1265510.2 2 length=101
CCCFFFFFHHHHHJJJJJJJJJJIJJIJJJJJJJJAFHIJJIIJIIHHFDDDDDDDDDDDDDDD>B>BCDDDDDDCDCDDDDBACCCCCBDDDDDDDDDDB
What could be the cause of this? Shall I run fastq-dump again using --split-files?
What do you mean by "does not have qc format information"? The fourth line of each record below (the line after the "+" line) is the quality score line.
@SRR1265510.1 1 length=101
AGGGCATCTCTGGGAAAGGACCTGGGGCTGGTGAGGGGCCCGGAGGAGCCTTTGCCCGCGTGTCAGACTCCATCCCTCCTCTGCCGCCACCGCAGCAGCCC
+SRR1265510.1 1 length=101
@CCFFD?DHHHHGIGIIIIFHI@HICGGGHIDGGDHAEF6@FG@BE1??B;@CCAAC>>B8?&844::(4@ACAABCCBCC:4:@>99<525&&0&2?BB#
@SRR1265510.2 2 length=101
CGCAAGGGCATCTCTGGGAAAGGACCTGGGGCTGGTGACGGGCCCGGAGGAGCCTTTGCCCGCGTGTCAGACTCCATCCCTCCTCTGCCGCCACCGCAGCA
+SRR1265510.2 2 length=101
CCCFFFFFHHHHHJJJJJJJJJJIJJIJJJJJJJJAFHIJJIIJIIHHFDDDDDDDDDDDDDDD>B>BCDDDDDDCDCDDDDBACCCCCBDDDDDDDDDDB
Thanks for your reply. What I mean is that it lacks QC encoding info, as the ################ section usually contains that info... I think this is what causes tophat to stop working.
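For what it's worth, a FASTQ file has no header that declares its quality encoding; the offset can only be guessed from the range of characters in the quality lines. Below is a minimal sketch of that guess in Python. The function names (`quality_lines`, `guess_offset`) and the cutoffs (59 and 74) are my own assumptions for illustration, not part of tophat or the SRA toolkit, and the result is a heuristic rather than a definitive answer.

```python
from itertools import islice

# Quality strings copied from the SRR1265510_1.fastq excerpt in this thread.
QUALS = [
    "@CCFFD?DHHHHGIGIIIIFHI@HICGGGHIDGGDHAEF6@FG@BE1??B;@CCAAC>>B8?&844::(4@ACAABCCBCC:4:@>99<525&&0&2?BB#",
    "CCCFFFFFHHHHHJJJJJJJJJJIJJIJJJJJJJJAFHIJJIIJIIHHFDDDDDDDDDDDDDDD>B>BCDDDDDDCDCDDDDBACCCCCBDDDDDDDDDDB",
]


def quality_lines(lines, max_records=10000):
    """Yield the 4th line of each FASTQ record.

    Assumes plain 4-line records (sequence and quality not wrapped).
    """
    for i, line in enumerate(islice(lines, 4 * max_records)):
        if i % 4 == 3:
            yield line.rstrip("\n")


def guess_offset(quals):
    """Heuristically guess the quality offset from ASCII codes.

    Assumed cutoffs: anything below ';' (59) cannot be a +64 encoding;
    anything above 'J' (74) is unusual for typical Phred+33 Illumina data.
    """
    codes = [ord(ch) for q in quals for ch in q]
    if not codes:
        raise ValueError("no quality characters found")
    if min(codes) < 59:
        return "phred33"
    if max(codes) > 74:
        return "phred64"
    return "ambiguous"


if __name__ == "__main__":
    print(guess_offset(QUALS))
```

To check a whole file, something like `with open("SRR1265510_1.fastq") as fh: print(guess_offset(quality_lines(fh)))` should work. The excerpt above contains characters such as `#`, `&` and `(`, which sit below the +64 range, so the reads look Phred+33 encoded. If that holds for the full file, it may be worth rerunning tophat without `--solexa1.3-quals` (which tells it to expect Phred+64), though nothing in this thread confirms that as the cause of the `prep_reads` error.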