2.9 years ago
GenesisBio
Dear All,
I need to align two DNA sequences, and I need to know the mapping between the query and target positions. For example, the 100th nucleotide in the query might align with the 150th nucleotide of the target. I want to know this mapping for all nucleotides. Is there any tool for this?
You can use one of the EMBOSS pairwise alignment tools. They all have web interfaces here. You probably want the local alignment tools if your sequences differ in length or are not very similar.
Actually, I need to align many gene pairs, so I need a programmatic interface. Besides, I need a program that reports the mapped positions, so that for a specific position in the query I can easily get the aligned position in the hit.
How about using minimap2 and writing the output in PAF format with CIGAR tags (the -c option)? For example, using the SARS-CoV-2 genome with some random nucleotides I pulled out...
Output:
The PAF format is described here: https://github.com/lh3/miniasm/blob/master/PAF.md
You'd need to programmatically combine the start and end positions with the CIGAR string to work out which target position the nth query nucleotide aligns to.
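A minimal sketch of that walk in Python, in case it helps. It assumes a PAF line that carries a cg:Z: tag, 0-based coordinates with exclusive ends as in the PAF spec, and that for '-' strand hits the CIGAR runs along the target while the query is traversed from its end backwards. The function names (parse_paf_line, query_to_target) and the toy PAF line are made up for illustration and are not real minimap2 output.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def parse_paf_line(line):
    """Pull the coordinates, strand and cg:Z: CIGAR out of one PAF line."""
    f = line.rstrip("\n").split("\t")
    cigar = None
    for tag in f[12:]:  # optional tags start at column 13
        if tag.startswith("cg:Z:"):
            cigar = tag[5:]
    if cigar is None:
        raise ValueError("no cg:Z: tag found (was minimap2 run with -c?)")
    return {
        "qstart": int(f[2]), "qend": int(f[3]), "strand": f[4],
        "tstart": int(f[7]), "tend": int(f[8]), "cigar": cigar,
    }


def query_to_target(rec):
    """Map each aligned 0-based query position to its 0-based target position.

    Query positions inside insertions have no target partner and are
    left out of the returned dict.
    """
    forward = rec["strand"] == "+"
    step = 1 if forward else -1
    q = rec["qstart"] if forward else rec["qend"] - 1
    t = rec["tstart"]
    mapping = {}
    for length, op in CIGAR_RE.findall(rec["cigar"]):
        n = int(length)
        if op in "M=X":          # aligned columns: both sequences advance
            for i in range(n):
                mapping[q + step * i] = t + i
            q += step * n
            t += n
        elif op == "I":          # insertion: only the query advances
            q += step * n
        elif op in "DN":         # deletion or skip: only the target advances
            t += n
    return mapping


# Toy PAF line (illustrative only): query 0-10 against target 5-15.
toy = "q1\t10\t0\t10\t+\tt1\t20\t5\t15\t9\t11\t60\tcg:Z:4M1D3M1I2M"
mapping = query_to_target(parse_paf_line(toy))
print(mapping[4])      # 10
print(7 in mapping)    # False (query position 7 is an insertion)
```

Positions here are 0-based, so the "100th nucleotide" of the query is key 99, and you would add 1 to the returned value to get a 1-based target position.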
Thanks, very good suggestion.
You left out that vital piece of information in the original post. In any case, EMBOSS tools can also be run from the command line.