consensus sequence from MSA using seqAn
2
1
Entering edit mode
10.2 years ago

I'm trying to create a consensus sequencing using seqAn. I've followed the nice guide to create a multiple sequence alignment:

http://seqan.readthedocs.org/en/latest/Tutorial/MultipleSequenceAlignment.html

I am looking for an explicit example of how turn the alignment into a consensus. I am also having trouble using the "alignmentEvaluation()" function. I made the assumptions that gapCount and other parameters are ints, but compilation doesn't agree.

Here is my working code so far:

  {
    using namespace seqan;

    typedef String< Dna > TSequence;
    StringSet<TSequence> seq;

    for(vector<string>::iterator seqs = s.begin();
        seqs != s.end(); seqs++
        ){
      appendValue(seq, *seqs);
    }
    Graph<Alignment<StringSet<TSequence, Dependent<> > > > aliG(seq);
    globalMsaAlignment(aliG, Blosum62(-1, -11));
    cerr << aliG << endl;

  }

dave.weese, any pointers or code snippets would be appreciated.

cpp seqAn • 3.3k views
ADD COMMENT
3
Entering edit mode
10.2 years ago
Manuel ▴ 410

Hi Zev, sorry for taking so long to answer.

Today, I had the time to flesh out the MSA tutorial and API documentation of the globalMsaAlignment() function more. Note that the links below go to the documentation of the develop branch but the functionality for the MSA is the same in the master branch and the 1.4.2 release:

I updated the tutorial to perform a MSA directly into an Align object and added a second section that shows how to use ProfileChar for first a profile and then a consensus computation. Please indicate any unclear points so we can further improve the documentation.

Please do not hesitate to report more problems with the documentation, no matter whether it's bugs or unclear information. You will get the most responsive answers on our mailing list.

HTH,
Manuel

ADD COMMENT
1
Entering edit mode

This is ideal software support. Thank you kindly. SeqAn is awesome!

ADD REPLY
1
Entering edit mode
10.2 years ago

This is an ugly solution, but it worked:

Creating the alignment:

   typedef String< Dna > TSequence;
    typedef Graph<Alignment<StringSet<TSequence, Dependent<> > > > TGraph;

    StringSet<TSequence> seq;

    for(vector<string>::iterator seqs = s.begin();
    seqs != s.end(); seqs++
        ){
      appendValue(seq, *seqs);
    }
    TGraph aliG(seq);
    globalMsaAlignment(aliG, Blosum62(-1, -11));

Looping over the alignment to create the consensus:

  String<char> align;
    convertAlignment(aliG, align);

    unsigned int nseq   = length(seq);
    unsigned int colLen = length(align) / nseq;

    stringstream con;

    //    cerr << nseq << "\t" << colLen << endl;

    for(unsigned int z = 0 ; z < colLen; z++){
      map<char, int> columnBases;
      for(unsigned int s = 0; s < nseq; s++){
        //      cerr << align[z + (s*colLen)] ;
        if(align[z + (s*colLen)] != gapValue<char>()){
          columnBases[align[z + (s*colLen)]]++;
        }
      }
      if(columnBases.size() == 1){

   con << columnBases.begin()->first;

      }
      else{
        con << "N";
      }
      // cerr << endl;
    }
ADD COMMENT

Login before adding your answer.

Traffic: 1865 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6