Question

Mapping vs Alignment

17 months ago

ExtentHonest56 ▴ 20

I've been reading up on reference mapping vs reference alignment and some are saying that mapping is where you find the general region of a sequence while with alignment all bases need to match...but then I am also seeing some papers using these terms interchangeably, even directly stating mapping is also referred to as alignment.

Is there actually a difference? Or are these two techniques the same? Each time I feel I am getting closer to understanding the difference, I read something else and they seem the same again.

I also saw somewhere that mapping is a part of alignment. If someone could kindly clarify or point me towards a reputable paper from a recent year noting the difference (if any) this would be much appreciated.

sequencing • 2.7k views

score 6 · Accepted Answer · 2023-07-05

Unfortunately, the terminology has become muddled and is used imprecisely in the literature. Many people use the terms mapping and alignment interchangeably, so neither term by itself conveys a fine-grained meaning to the reader. When you use these terms in your own documentation or writing, just be precise about what exactly was carried out or what your algorithm does.

That being said, these have conventionally been distinct terms representing distinct ideas. One of the first clear definitions I found comes from a 2011 presentation by Heng Li (image not reproduced here).

That is, mapping essentially means localizing a read (finding out where in the reference it arises from), while aligning means establishing a detailed correspondence between each nucleotide of the read and a nucleotide of the reference (or assigning it to an insertion or deletion). When people use the terms loosely, they often (though not always) do so in an asymmetric way that is compatible with this distinction: it is common to say "mapping" when alignment is meant, but much less common to say "alignment" when only mapping is meant. This also follows how the steps are carried out in many algorithms, where a read is first approximately localized (mapped), and a detailed alignment is then computed at that locus with a more computationally intensive dynamic programming approach.
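To make the two phases concrete, here is a minimal toy sketch in Python. It is not how any particular tool works: the function names (`build_index`, `map_read`, `align_read`, `map_then_align`), the k-mer voting scheme, the scoring values, and the `pad` parameter are all assumptions chosen for illustration. The mapping step only returns an approximate reference offset; the alignment step runs a small dynamic program over a window around that offset and returns a start position plus a CIGAR-like string (`M` = match or mismatch, `I` = insertion, `D` = deletion).

```python
from collections import Counter, defaultdict
from itertools import groupby


def build_index(ref, k):
    """Record every start position of every k-mer in the reference."""
    index = defaultdict(list)
    for i in range(len(ref) - k + 1):
        index[ref[i:i + k]].append(i)
    return index


def map_read(read, index, k):
    """Mapping: each shared k-mer votes for the reference offset it implies."""
    votes = Counter()
    for j in range(len(read) - k + 1):
        for i in index.get(read[j:j + k], ()):
            votes[i - j] += 1
    if not votes:
        return None  # no shared k-mer, read is left unmapped
    return votes.most_common(1)[0][0]


def align_read(read, window, match=1, mismatch=-1, gap=-1):
    """Alignment: fit the whole read into the window by dynamic programming.

    Leading and trailing window bases are free; every read base is scored.
    Returns (start of the alignment within the window, CIGAR-like string).
    """
    n, m = len(read), len(window)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if read[i - 1] == window[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Best end point anywhere in the last row, then trace back to row 0.
    i, j = n, max(range(m + 1), key=lambda c: score[n][c])
    ops = []
    while i > 0:
        sub = match if j > 0 and read[i - 1] == window[j - 1] else mismatch
        if j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            ops.append("M"); i -= 1; j -= 1
        elif score[i][j] == score[i - 1][j] + gap:
            ops.append("I"); i -= 1
        else:
            ops.append("D"); j -= 1
    ops.reverse()
    cigar = "".join(f"{len(list(g))}{op}" for op, g in groupby(ops))
    return j, cigar


def map_then_align(read, ref, index, k, pad=3):
    """Localize the read first, then align only within a padded window."""
    offset = map_read(read, index, k)
    if offset is None:
        return None
    start = max(0, offset - pad)
    window = ref[start:offset + len(read) + pad]
    local_start, cigar = align_read(read, window)
    return start + local_start, cigar


ref = "GATTACAGGCTTAACCGTAGCTAGGATCCAT"
index = build_index(ref, k=4)
read = "GCTTAACCGTAG"                           # exact copy of ref[8:20]
print(map_read(read, index, k=4))               # mapping only: 8
print(map_then_align(read, ref, index, k=4))    # position + CIGAR: (8, '12M')
```

The point of the sketch is the difference in output: `map_read` answers "where does this read come from?", while `align_read` answers "which read base corresponds to which reference base?". The expensive dynamic program only ever sees a short window, which is why localizing first pays off.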

This separation of phases lets new and better approaches to each problem be developed independently, in modular ways that can then be combined. For example, the recent mapquik paper describes improved algorithms for seeding and chaining to efficiently localize long reads, while papers like the recent BiWFA paper describe advances in algorithms for computing nucleotide-to-nucleotide alignments.

Nonetheless, while these distinctions have been drawn somewhat clearly in the past, and these terms were intended to have distinct meanings, contemporary usage is not always consistent and these terms are often used interchangeably. So, when you discuss specific algorithms, tools, or workflows, just be sure to be explicit about what is being computed and what the output is, rather than trying to rely on the terms mapping or alignment to do that work for you.