8.4 years ago
Mahan
I have genotype values like AA, GG, CC, etc. How can I convert these values into 0, 1, 2? I'm using SAS to process my data.
Just replace A with 0, G with 1, and C with 2. The symbols A, G, and C denote three possible observable states at a given position of a genomic sequence. Note, however, that in genetics the term genotype is not a synonym for sequence.
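To make the substitution concrete, here is a minimal sketch in Python rather than SAS (the thread contains no SAS code, so this only illustrates the idea). The table `BASE_CODES` and the function `encode_genotype` are hypothetical names, and rejecting any base other than A, G, or C is an assumption, since the answer only lists those three.

```python
# Hypothetical lookup table following the answer above: A -> 0, G -> 1, C -> 2.
BASE_CODES = {"A": 0, "G": 1, "C": 2}


def encode_genotype(genotype):
    """Return one integer per base of a genotype string such as "AG"."""
    try:
        return [BASE_CODES[base] for base in genotype.strip().upper()]
    except KeyError as err:
        # Assumption: bases outside the table are treated as errors.
        raise ValueError(f"unmapped base in genotype {genotype!r}") from err


for g in ["AA", "GG", "CC", "AG"]:
    print(g, encode_genotype(g))
```

The same per-character lookup could presumably be written in SAS with its own string functions; the point here is only that the mapping is a fixed relabeling of symbols.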
I don't think this is correct. If you're trying to get the data into VCF, 0 denotes the reference base, 1 the first alternative allele, and 2 the second alternative allele. You will need some way to detect what the reference is, then assign values to each genotype based on which alleles it contains. The numbers are not usually assigned arbitrarily, as is suggested here.
Note that I don't know how to do what you're asking, but I am fairly sure that this answer is incorrect.
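For comparison, the reference-based numbering described in this comment could be sketched as follows, again in Python. This assumes the reference and alternative alleles for the site are already known (detecting them is not shown), and `vcf_style_genotype` is a hypothetical helper, not part of any VCF library.

```python
def vcf_style_genotype(genotype, ref, alts):
    """Encode each allele by its position in [ref, alt1, alt2, ...]."""
    # Index 0 is the reference allele; 1, 2, ... are the alternatives in order.
    alleles = [ref] + list(alts)
    index = {allele: i for i, allele in enumerate(alleles)}
    # "/" is the separator VCF uses for unphased genotypes.
    return "/".join(str(index[base]) for base in genotype.upper())


# Assumed example site: reference A, alternatives G then C.
for g in ["AA", "AG", "CC"]:
    print(g, vcf_style_genotype(g, ref="A", alts=["G", "C"]))
```

With this scheme the same genotype string gets different numbers at different sites, because the numbering depends on which allele is the reference there.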
As far as I understand, the OP wants to use the general-purpose statistics software SAS. This software knows nothing about DNA sequences, so he wants to transform characters into integers. The choice of integer labels makes no difference for the calculation of uncorrected edit distances. I would recommend using specialized phylogenetic software, but since there are so many packages in all different flavours, it is hard to choose a particular one.
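A small Python check of that last claim, under the assumption that "uncorrected edit distance" here means a plain count of mismatching positions between two equal-length sequences. The names `mismatches` and `codes`, and the two example sequences, are made up for illustration.

```python
def mismatches(a, b):
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Arbitrary relabeling of the symbols, as in the answer above.
codes = {"A": 0, "G": 1, "C": 2}

seq1, seq2 = "AGCA", "AGGA"
as_letters = mismatches(seq1, seq2)
as_integers = mismatches([codes[c] for c in seq1], [codes[c] for c in seq2])

# Any one-to-one relabeling preserves equality, so both counts agree.
print(as_letters, as_integers)
```

This only holds as long as the integers are treated as labels. Any analysis that does arithmetic on them (for example treating 2 as "twice" 1) would depend on the assignment.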