I have about 100 million reads from a SOLiD run. I am trying to align them with bwa, but I get 0 alignments. What am I doing wrong? Here are the commands I am using:
~/software/bwa-0.5.9/bwa aln -n 6 -t 6 -o 2 -c ~/genomes/hsap/hg19.fa sampleTF5.fastq.gz
~/software/bwa-0.5.9/bwa samse ~/genomes/hsap/hg19.fa sampleTF5.sai sampleTF5.fastq.gz |samtools view -bS -|samtools sort - sampleTF5
About 40% of the reads align using Bioscope so I know that at least some reads should align. The index was created using -c so it is a colorspace index.
ETA: A couple of reads from the fastq file:
@853_2_23
T10201001101112312122022330313023.22201032232203002
+
.06%8+23,-/,740&+2,&(*+&26%&%'';!%'(&)':2((,,-'%(.
@853_2_76
T00221112202322220011002232000222000212301132232001
+
&<*(%'?'&'&5)*'%%%&('-'(()-')&)&%)*'/%%&%'%(%&&'&%
What do your reads look like? Did you use solid2fastq.pl?
There are a couple of different scripts called solid2fastq.pl floating around; see http://kevin-gattaca.blogspot.com/2010/05/plethora-of-solid2fastq-or-csfasta.html. The bwa one double-encodes the reads and the BFAST one doesn't, or at least that was the case a while ago.
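To make "double-encodes" concrete, here is a minimal Python sketch of the conversion as I understand it. It assumes the usual mapping of colors 0/1/2/3 to A/C/G/T and '.' to N, and that the primer base and first color are dropped along with the first quality value, which is what I recall the bwa script doing. The function name double_encode is mine, not part of bwa or BFAST, so treat this as an illustration rather than a replacement for the script.

```python
# Assumed double-encoding table: colors 0,1,2,3 -> A,C,G,T and '.' -> N.
DOUBLE_ENCODE = str.maketrans("0123.", "ACGTN")


def double_encode(seq, qual):
    """Convert one raw colorspace read to a double-encoded read.

    seq  : primer base followed by colors, e.g. "T0123."
    qual : one quality character per color (so one shorter than seq)

    Assumption: the primer base and the first color are dropped, and the
    first quality value is dropped too so that both strings stay the
    same length.
    """
    if len(qual) != len(seq) - 1:
        raise ValueError("expected one quality value per color")
    return seq[2:].translate(DOUBLE_ENCODE), qual[1:]


if __name__ == "__main__":
    # Second read posted in this thread.
    seq = "T00221112202322220011002232000222000212301132232001"
    qual = "&<*(%'?'&'&5)*'%%%&('-'(()-')&)&%)*'/%%&%'%(%&&'&%"
    new_seq, new_qual = double_encode(seq, qual)
    print(new_seq)
    print(new_qual)
```

A 50-color read comes out as 49 letters under these assumptions, so the converted file will look like ordinary nucleotide FASTQ even though the letters still stand for colors.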
Yes, I used solid2fastq.pl. The reads are 50 bp long colorspace reads. The quality statistics looked okay with FASTQC.
@853_2_23
T10201001101112312122022330313023.22201032232203002
+
.06%8+23,-/,740&+2,&(*+&26%&%'';!%'(&)':2((,,-'%(.
@853_2_76
T00221112202322220011002232000222000212301132232001
+
&<*(%'?'&'&5)*'%%%&('-'(()-')&)&%)*'/%%&%'%(%&&'&%
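A quick way to tell which flavor of solid2fastq.pl produced a file is to look at the sequence lines. The sketch below is only a heuristic under my own assumptions: a primer base followed by digits and dots is treated as raw colorspace, and a line made only of A/C/G/T/N is treated as letters only (which could be double-encoded colors or plain nucleotides). The function name classify_read and the labels it returns are made up for this example.

```python
def classify_read(seq):
    """Guess the encoding of one FASTQ sequence line.

    Returns "raw colorspace" for a primer base followed by 0-3 or '.',
    "letters only" for A/C/G/T/N (double-encoded or plain nucleotide),
    and "unknown" for anything else. This is a heuristic, not a parser.
    """
    s = seq.strip().upper()
    if len(s) > 1 and s[0] in "ACGT" and set(s[1:]) <= set("0123."):
        return "raw colorspace"
    if s and set(s) <= set("ACGTN"):
        return "letters only"
    return "unknown"


if __name__ == "__main__":
    # First read posted in this thread.
    print(classify_read("T10201001101112312122022330313023.22201032232203002"))
```

Reads like the two posted here, with a leading T followed by digits, fall in the first group, which is the form the BFAST-style script was described as producing.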