Question

Biojava HMM error


9.3 years ago

Jamand ▴ 110

Hello,

BioJava appears to be buggy even in its simplest functions, which makes it very hard to consider using it for complex problems.

I recently tried to use it to build an alignment with a profile HMM by running the example from the BioJava cookbook.

Here is the example:

    ProfileHMM hmm = new ProfileHMM(DNATools.getDNA(),
                                    12,
                                    DistributionFactory.DEFAULT,
                                    DistributionFactory.DEFAULT,
                                    "my profilehmm");

    //create the Dynamic Programming matrix for the model.
    DP dp = DPFactory.DEFAULT.createDP(hmm);

    //Database to hold the training set
    SequenceDB db = new HashSequenceDB();

    //code here to load the training set

Now initialize all of the model parameters to a uniform value. Alternatively parameters could be set randomly or set to represent a guess at what the best model might be. Then use the Baum-Welch Algorithm to optimise the parameters.

    //train the model to have uniform parameters
    ModelTrainer mt = new SimpleModelTrainer();
    //register the model to train
    mt.registerModel(hmm);
    //as no other counts are being used the null weight will cause everything to be uniform
    mt.setNullModelWeight(1.0);
    mt.train();

    //create a BW trainer for the dp matrix generated from the HMM
    BaumWelchTrainer bwt = new BaumWelchTrainer(dp);

    //anonymous implementation of the stopping criteria interface to stop after 20 iterations
    StoppingCriteria stopper = new StoppingCriteria(){
      public boolean isTrainingComplete(TrainingAlgorithm ta){
        return (ta.getCycle() > 20);
      }
    };

    /*
     * optimize the dp matrix to reflect the training set in db using a null model
     * weight of 1.0 and the Stopping criteria defined above.
     */
    bwt.train(db,1.0,stopper);

Below is an example of scoring a sequence and outputting the state path.

    SymbolList test = null;
    //code here to initialize the test sequence

    /*
     * put the test sequence in an array, an array is used because for pairwise
     * alignments using an HMM there would need to be two SymbolLists in the 
     * array
     */

    SymbolList[] sla = {test};

    //decode the most likely state path and produce an 'odds' score
    StatePath path = dp.viterbi(sla, ScoreType.ODDS);
    System.out.println("Log Odds = "+path.getScore());

    //print state path
    for(int i = 1; i <= path.length(); i++){
      System.out.println(path.symbolAt(StatePath.STATES, i).getName());
    }
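For readers unfamiliar with what the Viterbi call above computes, here is a minimal plain-Java sketch of Viterbi decoding on a toy two-state HMM. It is not BioJava's implementation and uses no BioJava classes; the class name `ToyViterbi`, the symbol encoding (A=0, C=1, G=2, T=3), and all probabilities are assumptions chosen purely for illustration.

```java
public class ToyViterbi {

    // Toy parameters (assumed for this sketch): state 0 prefers A/T, state 1 prefers C/G.
    static final double[] START = {0.5, 0.5};
    static final double[][] TRANS = {{0.9, 0.1}, {0.1, 0.9}};
    static final double[][] EMIT = {
        {0.4, 0.1, 0.1, 0.4},   // state 0: A, C, G, T
        {0.1, 0.4, 0.4, 0.1}    // state 1: A, C, G, T
    };

    // Returns the most likely state index for each observed symbol (log-space Viterbi).
    static int[] viterbi(int[] obs, double[] start, double[][] trans, double[][] emit) {
        int n = obs.length;
        int k = start.length;
        if (n == 0) {
            return new int[0];
        }
        double[][] score = new double[n][k]; // best log-probability ending in state s at position t
        int[][] back = new int[n][k];        // predecessor state that achieved that score

        for (int s = 0; s < k; s++) {
            score[0][s] = Math.log(start[s]) + Math.log(emit[s][obs[0]]);
        }
        for (int t = 1; t < n; t++) {
            for (int s = 0; s < k; s++) {
                int best = 0;
                double bestScore = Double.NEGATIVE_INFINITY;
                for (int p = 0; p < k; p++) {
                    double candidate = score[t - 1][p] + Math.log(trans[p][s]);
                    if (candidate > bestScore) {
                        bestScore = candidate;
                        best = p;
                    }
                }
                score[t][s] = bestScore + Math.log(emit[s][obs[t]]);
                back[t][s] = best;
            }
        }

        // Pick the best final state, then follow the back pointers to recover the path.
        int last = 0;
        for (int s = 1; s < k; s++) {
            if (score[n - 1][s] > score[n - 1][last]) {
                last = s;
            }
        }
        int[] path = new int[n];
        path[n - 1] = last;
        for (int t = n - 1; t > 0; t--) {
            path[t - 1] = back[t][path[t]];
        }
        return path;
    }

    public static void main(String[] args) {
        int[] obs = {0, 0, 0, 2, 2, 2, 2}; // AAAGGGG
        StringBuilder sb = new StringBuilder();
        for (int s : viterbi(obs, START, TRANS, EMIT)) {
            sb.append(s);
        }
        System.out.println(sb); // prints 0001111
    }
}
```

With these toy parameters the decoded path switches from state 0 to state 1 at the first G, which is the kind of per-position state labelling that the state path loop above prints.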

Everything seemed to work until the following line, where I got an exception (and not for the first time since I started using BioJava):

    StatePath path = dp.viterbi(sla, ScoreType.ODDS);

java.lang.ClassCastException: org.biojava.bio.seq.impl.SimpleSequence cannot be cast to java.lang.String
    at org.biojava.bio.alignment.SimpleAlignment.<init>(SimpleAlignment.java:214)
    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:671)
    at org.biojava.bio.dp.onehead.SingleDP.viterbi(SingleDP.java:512)
    at it.multimedia.hmm.TestHMM.main(TestHMM.java:149)
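
The message in this trace says that a `SimpleSequence` object was read back at a point where a `String` was expected. Whether that is exactly what happens inside `SimpleAlignment` is an assumption based only on the trace, not on the BioJava source. The plain-Java sketch below uses no BioJava classes; `RawMapCastDemo`, `FakeSequence`, and the method names are hypothetical. It shows how this kind of `ClassCastException` can arise when a raw map is keyed by an object and its keys are later read as strings.

```java
import java.util.HashMap;
import java.util.Map;

public class RawMapCastDemo {

    // Hypothetical stand-in for a sequence object; not a BioJava class.
    static final class FakeSequence {
        final String name;

        FakeSequence(String name) {
            this.name = name;
        }
    }

    // Keys a raw map by the object itself, then reads the key back as a String.
    @SuppressWarnings({"rawtypes", "unchecked"})
    static String firstLabel() {
        Map raw = new HashMap();
        FakeSequence seq = new FakeSequence("seq1");
        raw.put(seq, seq);                 // the key is an object, not a String label
        Map<String, Object> typed = raw;   // unchecked conversion: compiles with a warning
        return typed.keySet().iterator().next(); // the compiler-inserted cast to String fails here
    }

    // Safer variant: key the map by the sequence's name.
    static String firstLabelFixed() {
        Map<String, Object> typed = new HashMap<>();
        FakeSequence seq = new FakeSequence("seq1");
        typed.put(seq.name, seq);
        return typed.keySet().iterator().next();
    }

    public static void main(String[] args) {
        try {
            firstLabel();
        } catch (ClassCastException e) {
            System.out.println("Caught: " + e.getClass().getSimpleName());
        }
        System.out.println(firstLabelFixed());
    }
}
```

If the real cause is similar, the problem would lie in how the library labels the sequence internally rather than in the FASTA input, but that remains a guess until checked against the BioJava version in use.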

I pass a file of FASTA sequences as the input parameter.

Could you help me solve this problem?

If any of the BioJava authors read this post, I would like to ask how a framework this buggy could be published when even the examples in the official documentation cannot be run correctly.

Thank you very much

software-error biojava • 1.9k views

updated 2.1 years ago by Ram 44k • written 9.3 years ago by Jamand ▴ 110


You could consider opening this as an issue on their GitHub. Sorry I can't address the error itself.

updated 2.1 years ago by Ram 44k • written 9.3 years ago by eric.kern13 ▴ 240