I've seen numerical vectors used for nucleotides when, for example, developing SVM models for nucleotide sequence features. An example would be this study: "Prediction of polyadenylation signals in human DNA sequences using nucleotide frequencies". It's probably easiest to quote from the methods section:
Binary pattern: In the case of binary pattern each nucleotide was represented by a vector of four dimensions such as A by [1,0,0,0], C by [0,1,0,0], G by [0,0,1,0] and T by [0,0,0,1]. Thus a sequence of 200 nucleotides was represented by a vector of 800 (4 × 200) dimensions, which means UP100 and DW100 were both represented by vectors of 400 (4 × 100) dimensions.
Simple nucleotide frequency: In this case we calculated nucleotide frequencies of 100 upstream (UP100) and 100 downstream (DW100) positions, relative to poly(A) signals, separately and further added them to one another so that the total dimension is double. For instance, the sequence of 100 upstream was represented by a vector of four dimensions using mononucleotide frequency (frequency of A, T, G and C). In the case of dinucleotide frequency (AA, AC, AG, CG, AT ..), the sequence was represented by a 16-dimensional vector. Similarly, the sequence was represented by a vector of 64 dimensions in case of trinucleotides and by a vector of 256 dimensions in the case of tetranucleotides.
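To make the two encodings quoted above concrete, here is a minimal Python sketch. The function names `one_hot` and `kmer_frequencies` are my own, not from the paper. I'm also assuming overlapping k-mer windows and normalisation by the number of windows, since the quoted methods text doesn't spell out either detail.

```python
from itertools import product

NUCLEOTIDES = "ACGT"  # fixed order: A=[1,0,0,0], C=[0,1,0,0], G=[0,0,1,0], T=[0,0,0,1]


def one_hot(seq):
    """Binary pattern: flatten a DNA sequence into a vector of length 4 * len(seq)."""
    codes = {
        base: [int(i == j) for j in range(4)]
        for i, base in enumerate(NUCLEOTIDES)
    }
    vector = []
    for base in seq.upper():
        # Raises KeyError on N or other ambiguity codes; handling those is left out.
        vector.extend(codes[base])
    return vector


def kmer_frequencies(seq, k):
    """Relative frequencies of all 4**k k-mers, in lexicographic ACGT order.

    Assumption: overlapping windows, normalised by the number of windows.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product(NUCLEOTIDES, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    n_windows = len(seq) - k + 1
    if n_windows <= 0:
        return [0.0] * len(kmers)
    for i in range(n_windows):
        counts[seq[i:i + k]] += 1
    return [counts[kmer] / n_windows for kmer in kmers]


# Toy 100-nt upstream and downstream regions (made-up sequences).
upstream = "ACGT" * 25
downstream = "TTGCA" * 20

binary = one_hot(upstream + downstream)
dinuc = kmer_frequencies(upstream, 2) + kmer_frequencies(downstream, 2)

print(len(binary))  # 800 = 4 x 200
print(len(dinuc))   # 32 = 16 upstream + 16 downstream
```

Either list can be passed as one row of a feature matrix to whatever SVM implementation you are using.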
I think it's an approach used more commonly for protein sequences, because amino acids have more physicochemical properties that can be described using vectors.
Can you elaborate on what you are trying to use this data for? Also, I don't really see why the paper is trying to predict protein families with autocorrelation when more promising methods (based on structure and sequence homology) are available.
Descriptors = features derived from sequence or structure data?
From sequence data.