Alignment stats and SNP finding
8.1 years ago
Macherki M E ▴ 120

I am trying to write a function that extracts SNPs from a FASTA file containing aligned sequences. So far I have written a function that computes per-position base statistics for a list of N aligned sequences. Using the Rcpp API, the code is:

#include <Rcpp.h>
using namespace Rcpp;

// [[Rcpp::export]]
 NumericMatrix alin_stat(CharacterVector &alin,int &N_sequence)
   {
   int sequence_size=alin[0].size();
   NumericMatrix stat_m (5,sequence_size);   // set at 0 by default

   for(int i=0;i<N_sequence;i++){

     for(int j=0;j<sequence_size;j++){
       switch (alin[i][j]){

       case 'A' : case 'a':    // first row stores the frequency of A
         stat_m(0,j)++;
         break;
       case 'C' : case 'c':    // second row stores the frequency of C
         stat_m(1,j)++;
         break;
       case 'G' : case 'g':    // third row stores the frequency of G
         stat_m(2,j)++;
         break;
       case 'T' : case 't':    // fourth row stores the frequency of T
         stat_m(3,j)++;
         break;
       default:                // fifth row: unidentified symbols ('-', 'N', ...)
         stat_m(4,j)++;
         break;
       }

     }
   }
  return stat_m/N_sequence; 
 }

Note that:

alin contains the aligned sequences

N_sequence is the number of sequences
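
The function above reads `alin[i][j]` for every `j` up to the length of the first sequence, so it implicitly assumes that all sequences have the same length and that `N_sequence` does not exceed the number of entries in `alin`. A minimal sketch of a check for those assumptions follows. It is written in plain standard C++ rather than Rcpp so that it compiles on its own, and `is_valid_alignment` is a hypothetical helper name, not part of the original code.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not from the original post): returns true when the
// input looks like a usable alignment, i.e. n_sequence is positive, does not
// exceed the number of rows, and the first n_sequence rows share one length.
bool is_valid_alignment(const std::vector<std::string>& alin, int n_sequence) {
    if (n_sequence <= 0 || static_cast<std::size_t>(n_sequence) > alin.size()) {
        return false;
    }
    const std::size_t width = alin[0].size();
    for (int i = 1; i < n_sequence; ++i) {
        if (alin[i].size() != width) {
            return false;  // ragged row: alin[i][j] could be read out of range
        }
    }
    return true;
}
```

Calling such a check before the counting loop would turn a silent out-of-range read into an explicit error.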

Questions:

  • Which condition does a position have to meet to be selected as an SNP?
  • Does the alignment procedure (its parameters) affect the result?
SNP R alignment • 1.4k views
8.1 years ago
ddiez ★ 2.0k

I am not sure whether this question is related to R (as stated in the tags); the only apparent relation is that your statistics function is coded using Rcpp. For your first question, I am assuming you are asking about methods for SNP calling. If that is correct, this older question and related answers on this site might be relevant. For your second question the answer is yes: if you change the alignment parameters you may end up with a different alignment, and a different alignment may lead to different genomic positions being called as SNPs. Disclaimer: I have no direct experience working with SNP data.
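
To make both points concrete, here is a minimal sketch of one naive criterion. It is an assumption for illustration only, not an established SNP-calling method and not something proposed in this thread: a column is flagged as a candidate when at least two different bases each reach a chosen minimum frequency. The names `base_index`, `candidate_snp_columns` and `min_freq` are hypothetical, and the code is plain standard C++ rather than Rcpp so that it compiles on its own.

```cpp
#include <array>
#include <cstddef>
#include <string>
#include <vector>

// Map a base to an index 0..3 (A, C, G, T). Any other symbol
// ('-', 'N', ...) returns -1 and is not counted as an allele.
int base_index(char b) {
    switch (b) {
        case 'A': case 'a': return 0;
        case 'C': case 'c': return 1;
        case 'G': case 'g': return 2;
        case 'T': case 't': return 3;
        default: return -1;
    }
}

// Hypothetical naive criterion (an assumption, not a validated caller):
// column j is a candidate SNP when at least two different bases each have
// a frequency >= min_freq among all sequences.
// Assumes every sequence in alin has the same length.
std::vector<std::size_t> candidate_snp_columns(
        const std::vector<std::string>& alin, double min_freq) {
    std::vector<std::size_t> hits;
    if (alin.empty()) return hits;
    const std::size_t width = alin[0].size();
    const double n = static_cast<double>(alin.size());
    for (std::size_t j = 0; j < width; ++j) {
        std::array<int, 4> counts = {0, 0, 0, 0};
        for (const std::string& seq : alin) {
            const int k = base_index(seq[j]);
            if (k >= 0) ++counts[k];
        }
        int alleles = 0;  // number of bases that pass the frequency threshold
        for (int c : counts) {
            if (c > 0 && c / n >= min_freq) ++alleles;
        }
        if (alleles >= 2) hits.push_back(j);
    }
    return hits;
}
```

With `min_freq = 0.5`, the two sequences aligned as `ACGT` / `A-GT` give no candidate column, while the same sequences aligned as `ACGT` / `AG-T` flag column index 1 (C versus G). Only the gap placement differs, which is the kind of change that different alignment parameters can produce.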

EDIT

In addition to my original response, and if you are working in R, I would take a look at the snpStats package.
