dist.alignment vs dist.dna for DNA distance calculation
8.7 years ago

I am trying to get the proportion of sites that differ between each pair of sequences in my alignment. One approach is dist.dna from the R package ape with model set to "raw". Another is dist.alignment from seqinr, which returns the square root of the pairwise distances. I expected the squares of the values from dist.alignment to match the values from dist.dna, but they do not.

Does anyone know the reason for this difference, and which value should I trust?
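For reference, the quantity being asked about can be computed by hand in base R, which gives a baseline to compare either package against. This is only a minimal sketch: the three sequences and the helper `p_dist` are made up for illustration, and the alignment is assumed to be gap-free.

```r
# Hypothetical toy sequences of equal length (already aligned, no gaps).
seqs <- c(s1 = "acgtacgtac",
          s2 = "acgtacgtaa",
          s3 = "acgaacgtca")

# Split each sequence into single characters: one row per sequence.
chars <- do.call(rbind, strsplit(seqs, ""))

# p_dist(i, j): proportion of aligned sites at which rows i and j differ.
p_dist <- function(i, j) mean(chars[i, ] != chars[j, ])

n <- nrow(chars)
p <- outer(seq_len(n), seq_len(n), Vectorize(p_dist))
dimnames(p) <- list(names(seqs), names(seqs))

p        # raw proportion of differing sites for each pair
sqrt(p)  # the same distances on a square-root scale
```

If both packages measure the same thing on a gap-free alignment, dist.dna with model "raw" should be comparable to `p` and dist.alignment to `sqrt(p)`.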

r seqinr ape distance dna • 7.3k views

Do you have gaps in your alignment? If so, that can affect the distance calculation. It looks like the two functions you mentioned might have different ways to calculate distances for alignments with gaps.
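One way to check whether gaps matter is to run dist.dna on the same alignment with and without its `pairwise.deletion` argument. The sketch below uses a made-up three-sequence alignment (the names `aln`, `d_global` and `d_pair` are just for illustration). As I understand the ape documentation, the default drops every column that has missing data in any sequence, while `pairwise.deletion = TRUE` drops columns only for the pair being compared.

```r
library(ape)

# Hypothetical toy alignment; "-" marks a gap.
aln <- rbind(s1 = c("a", "c", "g", "t", "a", "c"),
             s2 = c("a", "c", "-", "t", "a", "a"),
             s3 = c("a", "-", "g", "a", "a", "c"))
x <- as.DNAbin(aln)

# Default: columns with missing data in any sequence are removed for all pairs.
d_global <- dist.dna(x, model = "raw", pairwise.deletion = FALSE)

# Pairwise: columns are removed only for the pair in which data are missing.
d_pair <- dist.dna(x, model = "raw", pairwise.deletion = TRUE)

d_global
d_pair
```

If the two results differ on your data, gap handling is a likely source of the mismatch with dist.alignment.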

7.0 years ago
sagitaninta ▴ 20

seqinr::dist.alignment is the square-root version of ape::dist.gene, not of ape::dist.dna. ape::dist.dna claims to calculate the number of sites that differ between each pair of sequences, whereas ape::dist.gene calculates the distance between each pair of sequences from the number of different sites. The two look similar at a glance in the documentation, but they are not the same. I tried to check this with a simple data set. Imagine two sequences:

Sequence 1: CCTGCA

Sequence 2: TTCXXG

The total number of differences is 6 (dist.gene), but the number of types of differing sites is 4 (dist.dna). That is why the values from dist.dna and dist.alignment differ: they are calculating different things.

EDIT: This happens when you do not pass an object of class DNAbin to dist.dna; the function then calculates something different and gives inconsistent results when you repeat the call. If you run dist.dna on your DNA sequences as an object of class DNAbin, the result is the non-square-rooted version of the dist.alignment result. Sorry for not providing a minimal reproducible example, but you can trust me; I had the same problem and checked all of this with my database.
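A small check along these lines might look like the sketch below. It is not the original data: the sequences are a made-up gap-free alignment, the object names are arbitrary, and it assumes dist.alignment is called with `matrix = "identity"` and that ape's `as.DNAbin` accepts a seqinr "alignment" object.

```r
# Hypothetical gap-free toy alignment as a seqinr "alignment" object.
aln <- seqinr::as.alignment(
  nb  = 3,
  nam = c("s1", "s2", "s3"),
  seq = c("acgtacgtac", "acgtacgtaa", "acgaacgtca"),
  com = NA
)

# seqinr: documented to return the square root of the pairwise distance.
d_seqinr <- seqinr::dist.alignment(aln, matrix = "identity")

# ape: convert to DNAbin first, then take the raw proportion of differing sites.
x <- ape::as.DNAbin(aln)
d_ape <- ape::dist.dna(x, model = "raw")

# Compare the squared seqinr distances with the ape distances.
all.equal(as.vector(d_seqinr)^2, as.vector(d_ape))
```

If the comparison fails on real data but passes on a gap-free toy set like this one, gaps or ambiguous bases are the next thing to look at.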
