I know there are several methods to compare two closely related genomes; genotype. But almost all of them compare the stretch of matching sequences. For example blastn table output looks like:
1. qseqid query or source (e.g., gene) sequence id
2. sseqid subject or target (e.g., reference genome) sequence id
3. pident percentage of identical matches
4. length alignment length (sequence overlap)
5. mismatch number of mismatches
6. gapopen number of gap openings
7. qstart start of alignment in query
8. qend end of alignment in query
9. sstart start of alignment in subject
10. send end of alignment in subject
11. evalue expect value
12. bitscore bit score
Although, blast provide one is-to many hits, there are many programs which provide one is-to one best hits. Like MashMap which provides a table of matching hits, with first half column has matching stretch from query (first genotype) and last half has information of corresponding hits from second genotype.
q_chr. len. start. end. match. ori. s_chr. len. start. end. match. %_match
Chr1 9589196 0 4999 4999 + Chr2 2416959 832373 837372 4999 94.2732
Chr1 9589196 10000 49999 39999 - Chr3 6191988 4926167 4964747 38580 94.7695
Chr1 9589196 30000 224999 194999 - Chr4 892816 441386 645465 204079 96.8471
The main demerit of this output/or blast output is the mismatch/indel location information is not reported.
I am looking for a program which compare two genotype (in one is-to one hit) and provide the information of each base in tabular form. The output may look like:
Chr1 1 C Chr1 1 C
Chr1 2 G Chr1 2 A
Chr1 3 G Chr1 3 G
Chr1 4 T insert _
insert _ Chr1 5 C
Chr1 5 G Chr1 6 C
Chr1 6 A Chr1 7 A