Question

BWA-mem - Different alignment numbers to same reference sequence

2.4 years ago

Steven ▴ 20

This is my first post here.

I am using BWA-mem in quite an unusual situation: I am aligning short reads of around 150bp to short reference sequences of variable length. When I supply a reference file containing multiple similar reference sequences, my overall alignment rate increases considerably.

ref1.fasta:

>seq1
CAGGCTCTGCTCTTCATAATCATACCTTTGTGACTCAGGATGCTGT

>seq2
CAGGCTCTGCTCTTAATATCTGGCCGTCGTATTCCACCTCTGCGACTCATGATGCTGT (100,000 aligned)

>seq3
CAGGCTCTGCTCTTCATAATTTCTATCTTGCCCACCCTACTCGACACAGAGCAAAAATCCAACACTCCCAATATTGCCGTGGCTTCGACCTCTTGCTCAGATTTTCTTGTTACCTTTGTGACTCAGGATGCTGT

>seq4
CAGGCTCTGCTCTTCATAACCCTCCCTGCGAGTCCTTAAGTCTGACTCGGATCCTTAAACAACCTTTTCTTACCTTTGTGACTCAGGATGCTGT

ref2.fasta:

>seq2
CAGGCTCTGCTCTTAATATCTGGCCGTCGTATTCCACCTCTGCGACTCATGATGCTGT (25,000 aligned)

More reads from my fastq files align with ref1.fasta than with ref2.fasta, but the alignments against ref1.fasta contain far more deletions and mismatches.

I realize this is not what BWA-mem was really designed to do, but I would be very grateful if you could help explain this behavior. Could it have something to do with the initial seeding of the alignment?

Many thanks, Steve W.
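
One way to put numbers on the difference described above is to tally, per reference sequence, how many reads aligned and how many deleted bases and edits those alignments carry. The following is a minimal sketch that reads plain-text SAM lines using only the standard library. The function name `summarize_sam` is hypothetical, and it assumes the aligner wrote the optional `NM` (edit distance) tag; it counts primary alignments only.

```python
import re
from collections import defaultdict

# One CIGAR operation: a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

def summarize_sam(lines):
    """Tally mapped primary alignments per reference (hypothetical helper).

    Returns {reference_name: {"reads", "deleted_bases", "edit_distance"}}.
    """
    stats = defaultdict(lambda: {"reads": 0, "deleted_bases": 0, "edit_distance": 0})
    for line in lines:
        # Skip blank lines and header lines (which start with "@").
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])
        # 0x4 = unmapped, 0x100 = secondary, 0x800 = supplementary.
        if flag & 0x4 or flag & 0x100 or flag & 0x800:
            continue
        ref, cigar = fields[2], fields[5]
        entry = stats[ref]
        entry["reads"] += 1
        # Sum the lengths of all deletion ("D") operations in the CIGAR.
        entry["deleted_bases"] += sum(
            int(n) for n, op in CIGAR_RE.findall(cigar) if op == "D"
        )
        # Optional tags start at column 12; NM:i:<n> is the edit distance.
        for tag in fields[11:]:
            if tag.startswith("NM:i:"):
                entry["edit_distance"] += int(tag[5:])
    return dict(stats)
```

Passing an open file handle for each SAM output (one per reference file) and comparing the `seq2` entries would show whether the extra alignments against ref1.fasta are also the ones carrying the extra deletions and mismatches.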

BWA-mem • 1.1k views


Are those 75k reads longer than seq2?

ADD REPLY • link 2.4 years ago by swbarnes2 14k

Likely, since OP says "small sequences of around 150bp".

ADD REPLY • link 2.4 years ago by GenoMax 147k

Yes, my reads are around double the length of seq2, with a maximum of 150bp.
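
To make the length comparison concrete, here is a small sketch that measures each reference in a FASTA file and reports how many bases of a read cannot be placed on a shorter reference. The helper names `fasta_lengths` and `uncovered_bases` are hypothetical, and the calculation is plain arithmetic rather than a description of what BWA-mem does with those bases. It assumes the sequence lines contain only bases, so notes such as "(100,000 aligned)" would need to be removed first.

```python
def fasta_lengths(text):
    """Return {sequence_name: length} for FASTA-formatted text (hypothetical helper)."""
    lengths = {}
    name = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            lengths[name] = 0
        elif name is not None:
            # Sequence may be wrapped over several lines; add them up.
            lengths[name] += len(line)
    return lengths

def uncovered_bases(read_length, ref_length):
    """Bases of a read that cannot lie on a reference shorter than the read."""
    return max(0, read_length - ref_length)

ref2 = """>seq2
CAGGCTCTGCTCTTAATATCTGGCCGTCGTATTCCACCTCTGCGACTCATGATGCTGT
"""
for ref_name, ref_len in fasta_lengths(ref2).items():
    print(ref_name, ref_len, uncovered_bases(150, ref_len))
```

For a read longer than the reference, the uncovered part has to be clipped or otherwise left out of the alignment, which is worth keeping in mind when comparing alignment counts between the two reference files.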

ADD REPLY • link 2.4 years ago by Steven ▴ 20

May be of interest: https://jeremy9959.net/Blog/TheMEMinBWAMem-fixed/
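
As a rough way to explore the seeding idea, the sketch below finds the longest exact substring shared by a read and each reference and checks it against a minimum seed length. The value 19 is the default of the `-k` option of `bwa mem`; everything else is an assumption for illustration. The function names are hypothetical, the brute-force search is not how BWA locates seeds (it uses an FM-index), and the reverse strand is ignored.

```python
def longest_shared_substring(a, b):
    """Length of the longest exact substring common to a and b (dynamic programming)."""
    best = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the exact match that ended at the previous positions.
                cur[j] = prev[j - 1] + 1
                if cur[j] > best:
                    best = cur[j]
        prev = cur
    return best

def seedable_references(read, references, min_seed_len=19):
    """Names of references sharing an exact match of at least min_seed_len with the read."""
    return [
        name
        for name, seq in references.items()
        if longest_shared_substring(read, seq) >= min_seed_len
    ]
```

Running `seedable_references` for a handful of reads against the sequences in ref1.fasta and ref2.fasta would show whether reads that only align with ref1.fasta share long exact matches with seq1, seq3 or seq4 rather than with seq2.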

ADD REPLY • link 2.4 years ago by GenoMax 147k

Thank you. I think I came across this one when I was searching for an answer. I'm fairly sure the seeding has something to do with this strange behavior, but I have yet to pinpoint the exact cause.

ADD REPLY • link 2.4 years ago by Steven ▴ 20