3.5 years ago
Gon
Hi everyone,
I need to calculate pairwise SNP distances for the sequences in a multi-FASTA alignment. So far I'm using snp-dists (https://github.com/tseemann/snp-dists), which is great but excludes ambiguous nucleotides (so two sequences differing only by one ambiguous nucleotide would have a distance of zero). There is an option to include everything, but that counts not only ambiguous nucleotides but also Ns and dashes.
Do you know of any method (ideally Unix- or Python-based) that disregards Ns and dashes but treats ambiguous nucleotides as distinct nucleotides?
Thanks a lot,
Gonzalo
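
To make the requested behaviour concrete, here is a minimal pure-Python sketch, not a tested replacement for snp-dists. It assumes the alignment fits in memory, that all sequences have the same length, and that "N", "-" and "?" are the only characters to skip; the function names (read_fasta, snp_distance, pairwise_distances) are hypothetical. Any other mismatch counts as one difference, so an ambiguity code such as R is treated as different from G even though the two are compatible.

```python
from itertools import combinations

# Assumption: these characters mark missing data and are never counted.
IGNORED = set("N-?")


def read_fasta(path):
    """Return {name: sequence} from an aligned multi-FASTA file."""
    chunks, name = {}, None
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                # Use the first word of the header as the sequence name.
                fields = line[1:].split()
                name = fields[0] if fields else ""
                chunks[name] = []
            elif name is not None:
                chunks[name].append(line)
    return {n: "".join(parts) for n, parts in chunks.items()}


def snp_distance(a, b):
    """Count differing columns, skipping any column with N, - or ?."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    a, b = a.upper(), b.upper()
    return sum(
        1
        for x, y in zip(a, b)
        if x != y and x not in IGNORED and y not in IGNORED
    )


def pairwise_distances(seqs):
    """Return {(name1, name2): distance} for every unordered pair."""
    return {
        (n1, n2): snp_distance(seqs[n1], seqs[n2])
        for n1, n2 in combinations(seqs, 2)
    }
```

For example, pairwise_distances(read_fasta("alignment.fasta")) would give one entry per pair of sequences, where "alignment.fasta" is a placeholder file name. This compares every pair column by column in plain Python, so it is likely to be slow on large alignments.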