bowtie issues: unable to map significant amount of pairs
2
0
Entering edit mode
9.6 years ago

I am trying to map Illumina paired-end reads of the Bacillus subtilus natto genome to the reference genome. Everything is running smoothly, but I am only able to get 50 mapped reads at best. Here are my commands:

./fastq-dump --split-files ~/DRR000001.sra #Gives Forward and Reverse files
./fastq_quality_filter -Q33 -q 30 -p 70 -i ~/DRR000001_1.fastq -o ~/natto1_trim.fastq
./fastq_quality_filter -Q33 -q 30 -p 70 -i ~/DRR000001_2.fastq -o ~/natto2_trim.fastq
bowtie-build -f bsubtilisnatto.fasta natto
bowtie natto -1 natto1_trim.fastq -2 natto2_trim.fastq pair.sam --al pair.align --un pair.unmapped -l 20 -X 300 --fr -q -t --sam -n 3 --solexa-quals -p 8

The filter leaves about 700,000 and 400,000 reads. The reference fasta file is about 4 Mbp. This may be low, but surely it's not 50 mapped reads low. I've tinkered with the bowtie command, using --rf, --ff, lowering thresholds, omitting --solexa-quals, all to no avail.

Your help would be much appreciated!

Illumina Assembly Bowtie • 2.2k views
ADD COMMENT
0
Entering edit mode

The filter leaves about 700,000 and 400,000 reads

Does it mean that the reads in the two fastq file is not the same after filtering, i.e. the two files are no longer synchronized? If so, this is definitely wrong. If you remove a read from one fastq file you should remove that read also in the other file. Use an adapter and/or quality trimmer that can handle paired files, like cutadapt. Also, I wouldn't use bowtie, if anything bowtie2 or bwa mem (if read longer than 70bp).

ADD REPLY
0
Entering edit mode
9.6 years ago

You may need to reorder the reads in both the fastq files after quality filter.

ADD COMMENT
0
Entering edit mode

Thank you -- any tips on how I do this?

ADD REPLY
0
Entering edit mode

Though you can reorder reads with repair.sh, assuming they have standard Illumina names, I'd suggest just starting over with BBDuk instead for quality-trimming:

bbduk.sh -Xmx1g in=read1.fq in=read2.fq out1=clean1.fq out2=clean2.fq qtrim=rl trimq=10
ADD REPLY
0
Entering edit mode
9.6 years ago

Bowtie cannot handle indels at all, and only up to 3 mismatches. If you are mapping against a different strain, for example, perhaps almost nothing would map. Also, if the insert sizes were short, such that reads have adapter sequence, nothing would map (kind of unlikely with 50bp reads, though). So typically, I'd advise you to try a more sensitive aligner such as BBMap... but it seems likely that in this case, in light of the fact that only 50 reads mapped, you are mapping against the wrong genome. Take a dozen reads or so and BLAST them against nt or something to ensure that they are what you expect, and do the same with the reference, because it's likely that they don't go together.

Oh - and as Geek_y mentioned, the quality filtering you did will probably have destroyed the pairing. I suggest you use pair-aware quality-filtering programs such as BBDuk and process both files at once, to keep pairs together. That said, Q30 is very high (too high, IMO) for filtering. I don't think that reordering will solve the issue, though, as presumably bowtie would have still mapped many reads single-ended.

ADD COMMENT
0
Entering edit mode

Thanks. I tried the process with a different set of paired-end Illumina reads and its respective genome which I got the exact strain of. Still the same issue. and with bwa I get even less mapping. I'm beginning to think it is my trimmer.

ADD REPLY

Login before adding your answer.

Traffic: 1469 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6