BWA-mem - Different alignment numbers to same reference sequence
2.4 years ago
Steven ▴ 20

This is my first post on here.

I am using BWA-mem in a rather unusual situation: I am aligning short reads of around 150bp to small reference sequences of variable length. When I supply a reference file containing multiple similar reference sequences, my overall alignment rate increases considerably.

ref1.fasta:

>seq1
CAGGCTCTGCTCTTCATAATCATACCTTTGTGACTCAGGATGCTGT

>seq2
CAGGCTCTGCTCTTAATATCTGGCCGTCGTATTCCACCTCTGCGACTCATGATGCTGT (100,000 aligned)

>seq3
CAGGCTCTGCTCTTCATAATTTCTATCTTGCCCACCCTACTCGACACAGAGCAAAAATCCAACACTCCCAATATTGCCGTGGCTTCGACCTCTTGCTCAGATTTTCTTGTTACCTTTGTGACTCAGGATGCTGT

>seq4
CAGGCTCTGCTCTTCATAACCCTCCCTGCGAGTCCTTAAGTCTGACTCGGATCCTTAAACAACCTTTTCTTACCTTTGTGACTCAGGATGCTGT

ref2.fasta:

>seq2
CAGGCTCTGCTCTTAATATCTGGCCGTCGTATTCCACCTCTGCGACTCATGATGCTGT (25,000 aligned)

More reads from my fastq files align to ref1.fasta than to ref2.fasta, but the alignments to ref1.fasta contain far more deletions and mismatches.

I realize this is not what BWA-mem was really designed to do, but I would be really grateful if you could help explain this behavior. Could it be something to do with the initial seeding of the alignment?

Many thanks, Steve W.
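
The comparison described in the question (how many reads land on each reference, and how many edits those alignments carry) can be tallied directly from the SAM output. The following is only a minimal sketch: the function name `summarize_sam` is hypothetical, it assumes uncompressed SAM text, and it assumes each alignment carries the `NM:i:` edit-distance tag. It counts primary mapped alignments only, skipping unmapped, secondary, and supplementary records.

```python
from collections import defaultdict


def summarize_sam(lines):
    """Count primary mapped reads per reference and their mean NM (edit distance).

    `lines` is any iterable of SAM text lines (hypothetical helper, not part of BWA).
    """
    stats = defaultdict(lambda: {"reads": 0, "nm_total": 0})
    for line in lines:
        if line.startswith("@"):          # header line
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # not a complete alignment record
            continue
        flag = int(fields[1])
        # 0x4 = unmapped, 0x100 = secondary, 0x800 = supplementary
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        entry = stats[fields[2]]          # RNAME column
        entry["reads"] += 1
        for tag in fields[11:]:           # optional tags
            if tag.startswith("NM:i:"):
                entry["nm_total"] += int(tag[5:])
                break
    return {
        ref: {"reads": s["reads"], "mean_nm": s["nm_total"] / s["reads"]}
        for ref, s in stats.items()
    }
```

Running this once on the alignment against ref1.fasta and once on the alignment against ref2.fasta (for example with `summarize_sam(open("ref1.sam"))`, where the file name is a placeholder) would show whether the extra seq2 alignments in the ref1 run are the ones with the higher edit distance.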


Are those 75k reads longer than seq2?


Likely, since the OP says "small sequences of around 150bp".


Yes, my reads are around double the length of seq2, with a maximum of 150bp.
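
Since the reads are longer than seq2, any alignment to it has to leave part of the read unaligned, which normally shows up as soft or hard clipping in the CIGAR string. As a rough way to check how much of each read is being clipped, here is a small sketch. The helper name `clipped_fraction` is made up for illustration, and it assumes standard SAM CIGAR strings.

```python
import re

# One CIGAR operation: a length followed by an operation code
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clipped_fraction(cigar):
    """Return the fraction of the original read that is soft- or hard-clipped."""
    ops = CIGAR_OP.findall(cigar)
    # S and H are the clipping operations
    clipped = sum(int(n) for n, op in ops if op in "SH")
    # M, I, S, =, X consume read bases; H counts bases removed from the record
    read_len = sum(int(n) for n, op in ops if op in "MIS=XH")
    return clipped / read_len if read_len else 0.0
```

A read that is clipped on both sides of a short match would score high here, while an end-to-end alignment would score 0. Comparing the distribution of this value between the ref1.fasta and ref2.fasta runs might help show which reads only align when the other references are present.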


Thank you. I think I came across this one when I was searching for an answer. I'm fairly sure the seeding has something to do with this strange behavior, but I have yet to pinpoint the exact cause.
