Question

Bowtie2 end-to-end alignment does not seem to work for FASTA files?

1

7.9 years ago

m.koohi.m ▴ 120

Hi, I use bowtie2-2.2.9 to align FASTA reads against some genes. I am not sure whether I understand end-to-end alignment in Bowtie2 correctly. As I understand it, if we have a read like the one below:

>Read1
TGCGGAATTTGATACACGTACATAAGTACGTGTTGGCTTATGCTTGCGTACGCTGAAACATGCTGACCTTTTTTTAAAACGCCCTTGTC

and we use end-to-end mode (which seems to be the default), then the alignment should involve all the characters in the read. But in my results I see what look like local alignments, such as the one below, which uses just 8 characters of the read:

Read1   16  Gene.1  19  1   8M  *   0   0   TAAAAAAA    IIIIIIII    AS:i:0  X

I ran Bowtie2 with these options:

bowtie2 -f -x RefGene -U merged.fasta -S output.txt -p 6 --no-hd --no-sq --no-unal

I am also sure that Read1 is longer than 8 characters; its length is 88. Do I need to add an option when running bowtie2 to force it to align end-to-end?
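One way to double-check read lengths in the input is to list them per record. The following is a minimal sketch, not part of the original post; the function name `fasta_lengths` and the two-record example are hypothetical, and the parser assumes a plain, uncompressed FASTA file.

```python
from io import StringIO


def fasta_lengths(handle):
    """Return {record_name: sequence_length} for a FASTA stream."""
    lengths = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Header line: use the first whitespace-delimited token as the name.
            parts = line[1:].split()
            name = parts[0] if parts else ""
            lengths[name] = 0
        elif name is not None:
            # Sequence line: may be wrapped over several lines, so accumulate.
            lengths[name] += len(line)
    return lengths


# Hypothetical two-record example, not the poster's data.
example = StringIO(">readA\nACGT\nAC\n>readB some description\nTTTTTTTA\n")
print(fasta_lengths(example))  # {'readA': 6, 'readB': 8}
```

To run it on a real file, pass an open handle instead, for example `with open("merged.fasta") as fh: fasta_lengths(fh)`. If the same read name occurs more than once, only the last record's length is kept, which is itself worth checking in a merged file.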

Bowtie2 software error alignment • 3.3k views

ADD COMMENT • link 7.9 years ago by m.koohi.m ▴ 120

2

I added markup to your post for increased readability. You can do this by selecting the text and clicking the 101010 button, which is in your toolbar when you compose or edit a post.

ADD REPLY • link 7.9 years ago by WouterDeCoster 48k

1

Any specific reason to use bowtie2 with FASTA reads, instead of something like BLAT? You should also include the exact command line you used for the bowtie2 alignment to provide full context.

ADD REPLY • link 7.9 years ago by GenoMax 151k

0

I always thought that bowtie2 aligned end to end by default and that you would need to pass extra parameters to make it work differently.

In your example note how you don't have clipping on the CIGAR string. This implies that your original sequence was just 8bp long. But then I don't think bowtie2 actually works for sequences that short. In addition the reported sequence cannot be found in the read that you wrote above. So ... lots of inconsistencies there...

Show both the command that you are running and the actual line that gets reported.
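The CIGAR point above can be checked mechanically. This is a minimal sketch, assuming a standard SAM CIGAR string; the helper name `cigar_summary` is hypothetical. In SAM, the operations M, I, S, = and X consume bases of the stored SEQ, while S and H mark clipped bases.

```python
import re

# One CIGAR element: a count followed by a SAM operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def cigar_summary(cigar):
    """Return (query_length, clipped_bases) for a SAM CIGAR string."""
    query_len = 0
    clipped = 0
    for count, op in CIGAR_OP.findall(cigar):
        n = int(count)
        if op in "MIS=X":
            query_len += n  # these operations consume bases stored in SEQ
        if op in "SH":
            clipped += n    # soft- or hard-clipped bases
    return query_len, clipped


print(cigar_summary("8M"))        # (8, 0)
print(cigar_summary("40S8M40S"))  # (88, 80)
```

A CIGAR of `8M` therefore describes an 8-base sequence with nothing clipped, whereas a clipped alignment of a longer read would carry S or H operations, as in the second (made-up) example.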

ADD REPLY • link 7.9 years ago by Istvan Albert 102k

0

@WouterDeCoster Thanks for the suggestion and tutorial.

@genomax I am processing metagenomics files and found Bowtie2 very fast. I haven't tried BLAT. Do you think it is as fast as Bowtie2?

@Istvan Albert Thanks for your comment. Actually I am sure that the length of the read is not 8. It is 88. I updated my question.

ADD REPLY • link 7.9 years ago by m.koohi.m ▴ 120

0

Your alignment shows a sequence TAAAAAAA that is not present in the read that you show.

In addition when an alignment takes place aligners will indicate how much of the read is clipped with the S or H letters. It is strange that your SAM does not do that. In addition the alignment line that you report is incomplete, note how it ends with X and does not show an MD tag.

You should show the complete SAM record and show the complete input sequence. Right now it still looks like some sort of inconsistency regarding either the data or the alignment. Hence we cannot troubleshoot it.

ADD REPLY • link 7.9 years ago by Istvan Albert 102k

0

The flag 16 in the second field of the SAM record means the read aligned to the reverse strand. The reverse complement of "TAAAAAAA" is "TTTTTTTA", which is present in the read.
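The strand check described here can be sketched as follows. This is an illustration only; the helper names and the short `fragment` string are hypothetical. It relies on the SAM convention that bit 0x10 (decimal 16) of the FLAG field marks a read whose SEQ is stored reverse-complemented.

```python
# Translation table for DNA complement (N stays N; lowercase is preserved).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Complement each base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def original_read_sequence(flag, seq):
    """Recover the sequence as it appeared in the input from SAM FLAG and SEQ."""
    if flag & 0x10:  # 0x10 == 16: SEQ is stored reverse-complemented
        return reverse_complement(seq)
    return seq


recovered = original_read_sequence(16, "TAAAAAAA")
print(recovered)  # TTTTTTTA

# Hypothetical fragment used only to illustrate the substring check.
fragment = "GACCTTTTTTTAAAACG"
print(recovered in fragment)  # True
```

With flag 0 the SEQ field would be returned unchanged, so the same helper works for forward-strand records.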

ADD REPLY • link 7.9 years ago by m.koohi.m ▴ 120

0

Ah indeed, good point, the sequences are always reported on the forward strand. I missed that.

ADD REPLY • link 7.9 years ago by Istvan Albert 102k