Hey everyone,
I'm having an issue aligning FASTQ reads to a reference genome using bowtie. I'm using a small set of good-quality FASTQ reads which definitely align to this genome. I've been trying variations of the following command, which works for a colleague using an older version of bowtie (they're using v1.0.0 IIRC; I'm using v1.2.3, installed via conda):
bowtie -p 8 -n 3 -m 10 --sam bowtie_genome_index/genome_index $filename.fastq $filename.sam
I've run this command with the --verbose flag to see if anything is amiss. The input ebwt file is recognised, the FASTQ query input is recognised, there are no "Quality inputs", and the output file is $filename.sam. All 80 reads in my test FASTQ file are recognised, but none of them are aligning to the genome:
# reads processed: 80
# reads with at least one reported alignment: 0 (0.00%)
# reads that failed to align: 80 (100.00%)
No alignments
I've read over the GitHub documentation for bowtie, and I'm not sure why this isn't working.
Any help would be much appreciated, thank you in advance.
UPDATE: Turns out the issue was with fastq-dump, not bowtie. I had downloaded the FASTQ reads from SRA using the default fastq-dump parameters, which meant that the forward and reverse reads weren't separated. Using --split-files or --split-3 with fastq-dump fixed this issue.
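For anyone hitting the same problem, here is a minimal sketch of the re-download step described in the update. The accession is a placeholder and fetch_split_reads is a hypothetical helper name, not part of sra-tools; only the --split-files flag comes from the post above.

```shell
# Placeholder accession (an assumption); substitute the real SRA run ID.
accession="SRRXXXXXXX"

# Hypothetical wrapper around fastq-dump that writes each mate to its own file.
fetch_split_reads() {
    # --split-files writes mate 1 to <accession>_1.fastq and mate 2 to <accession>_2.fastq
    # instead of joining both mates into one record per spot.
    fastq-dump --split-files "$1"
}

# Example call (needs sra-tools on PATH and network access):
# fetch_split_reads "$accession"
```

With the default parameters each record holds both mates joined end to end, which is why none of the reads aligned.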
What is the length of these reads? bowtie v1.x does non-gapped alignments, so if your reads are very long it is possible they are not being aligned for that reason.
Hey, I just ran fastqc on my test FASTQ file: the reads are 24 to 101bp long. The 101bp read is an outlier; most of the reads are 34 to 47bp in length.
Have you trimmed these (likely, since they are not all the same length) to remove adapters etc.? That may be another thing to check. You could also take a few and BLAST them at NCBI to make sure they are what you think they are (unless this is a test subset that you already know should work).
This test subset should work, the reads are trimmed and the adapters have been removed, which was confirmed using fastqc. I've also tried another copy of this test subset with the largest read (101bp) removed, and still, none of the reads align. After a few quick BLAST searches, the FASTQ reads and reference genome do look correct. I think I'll try my luck with bwa.
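On the read-length question raised above: a quick way to tally lengths without fastqc is a short awk one-liner. This is only a sketch, assuming standard 4-line FASTQ records with unwrapped sequences; read_length_tally is a hypothetical helper name.

```shell
# Hypothetical helper: print "length<TAB>count" for every read length in a FASTQ file.
# Assumes 4 lines per record, so the sequence is always line 2 of each group of 4.
read_length_tally() {
    awk 'NR % 4 == 2 { counts[length($0)]++ }
         END { for (len in counts) print len "\t" counts[len] }' "$1" | sort -n
}

# Example call:
# read_length_tally "$filename.fastq"
```

Reads that come out roughly twice as long as expected can be a hint that the mates were not split when the file was dumped from SRA.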