Pairwise sequence alignment with Biojava?
10.1 years ago
Bioaln ▴ 360

Hello. I've been trying desperately, and in vain, to get this basic example from the BioJava wiki cookbook to work. I am creating a Java app, so I downloaded all the necessary jars and added them to the build path. Whenever I try to run the code, the error message below appears. I am a bit lost here.

package testextjars;

import org.biojava3.alignment.Alignments;
import org.biojava3.alignment.Alignments.PairwiseSequenceAlignerType;
import org.biojava3.alignment.SimpleGapPenalty;
import org.biojava3.alignment.SubstitutionMatrixHelper;
import org.biojava3.alignment.template.SequencePair;
import org.biojava3.alignment.template.SubstitutionMatrix;
import org.biojava3.core.sequence.DNASequence;
import org.biojava3.core.sequence.compound.AmbiguityDNACompoundSet;
import org.biojava3.core.sequence.compound.NucleotideCompound;

class seqaln {
    public static void main(String[] args) {
        String targetSeq = "CACGTTTCTTGTGGCAGCTTAAGTTTGAATGTCATTTCTTCAATGGGACGGA"+
                  "GCGGGTGCGGTTGCTGGAAAGATGCATCTATAACCAAGAGGAGTCCGTGCGCTTCGACAGC"+
              "GACGTGGGGGAGTACCGGGCGGTGACGGAGCTGGGGCGGCCTGATGCCGAGTACTGGAACA"+
              "GCCAGAAGGACCTCCTGGAGCAGAGGCGGGCCGCGGTGGACACCTACTGCAGACACAACTA"+ 
              "CGGGGTTGGTGAGAGCTTCACAGTGCAGCGGCGAG";
        DNASequence target = new DNASequence(targetSeq,
                AmbiguityDNACompoundSet.getDNACompoundSet());

        String querySeq = "ACGAGTGCGTGTTTTCCCGCCTGGTCCCCAGGCCCCCTTTCCGTCCTCAGGAA"+
              "GACAGAGGAGGAGCCCCTCGGGCTGCAGGTGGTGGGCGTTGCGGCGGCGGCCGGTTAAGGT"+
              "TCCCAGTGCCCGCACCCGGCCCACGGGAGCCCCGGACTGGCGGCGTCACTGTCAGTGTCTT"+
              "CTCAGGAGGCCGCCTGTGTGACTGGATCGTTCGTGTCCCCACAGCACGTTTCTTGGAGTAC"+
              "TCTACGTCTGAGTGTCATTTCTTCAATGGGACGGAGCGGGTGCGGTTCCTGGACAGATACT"+
              "TCCATAACCAGGAGGAGAACGTGCGCTTCGACAGCGACGTGGGGGAGTTCCGGGCGGTGAC"+
              "GGAGCTGGGGCGGCCTGATGCCGAGTACTGGAACAGCCAGAAGGACATCCTGGAAGACGAG"+
              "CGGGCCGCGGTGGACACCTACTGCAGACACAACTACGGGGTTGTGAGAGCTTCACCGTGCA"+ 
              "GCGGCGAGACGCACTCGT";
        DNASequence query = new DNASequence(querySeq,
                AmbiguityDNACompoundSet.getDNACompoundSet());

        SubstitutionMatrix<NucleotideCompound> matrix = SubstitutionMatrixHelper.getNuc4_4();

        SimpleGapPenalty gapP = new SimpleGapPenalty();
        gapP.setOpenPenalty((short)5);
        gapP.setExtensionPenalty((short)2);

        SequencePair<DNASequence, NucleotideCompound> psa =
                Alignments.getPairwiseAlignment(query, target,
                        PairwiseSequenceAlignerType.LOCAL, gapP, matrix);

        System.out.println(psa);
    }
}

Exception in thread "main" java.lang.ExceptionInInitializerError # -> WHAT IS THAT??
    at org.biojava3.alignment.SimpleAlignedSequence.setLocation(SimpleAlignedSequence.java:351)
    at org.biojava3.alignment.SimpleAlignedSequence.<init>(SimpleAlignedSequence.java:88)
    at org.biojava3.alignment.SimpleProfile.<init>(SimpleProfile.java:119)
    at org.biojava3.alignment.SimpleSequencePair.<init>(SimpleSequencePair.java:86)
    at org.biojava3.alignment.SmithWaterman.setProfile(SmithWaterman.java:71)
    at org.biojava3.alignment.template.AbstractMatrixAligner.align(AbstractMatrixAligner.java:344)
    at org.biojava3.alignment.template.AbstractPairwiseSequenceAligner.getPair(AbstractPairwiseSequenceAligner.java:112)
    at org.biojava3.alignment.Alignments.getPairwiseAlignment(Alignments.java:208)
    at testextjars.seqaln.main(seqaln.java:43)
Caused by: java.lang.NullPointerException
    at java.util.Collections$UnmodifiableCollection.<init>(Collections.java:1026)
    at java.util.Collections$UnmodifiableList.<init>(Collections.java:1302)
    at java.util.Collections.unmodifiableList(Collections.java:1287)
    at org.biojava3.core.sequence.location.template.AbstractLocation.<init>(AbstractLocation.java:111)
    at org.biojava3.core.sequence.location.template.AbstractLocation.<init>(AbstractLocation.java:85)
    at org.biojava3.core.sequence.location.SimpleLocation.<init>(SimpleLocation.java:57)
    at org.biojava3.core.sequence.location.SimpleLocation.<init>(SimpleLocation.java:53)
    at org.biojava3.core.sequence.location.template.Location.<clinit>(Location.java:48)
    ... 9 more

I've just tried your code with the latest BioJava snapshot from GitHub and it worked for me. Which version of BioJava are you using? How did you get the jar files: with Maven or manually?

Note that the current release is 3.1; those are the jars you get with Maven or manually from http://biojava.org/wiki/BioJava:Download. At the moment we are working on the next release, 4.0, which has a lot of improvements (I tried the code above with the current development snapshot of 4.0). Hopefully it will be released by the end of the year.
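
To answer the version questions above quickly, a small diagnostic can help. This is only a sketch: the class name ClasspathCheck and the method locate are made up for illustration, and it assumes nothing about BioJava beyond the two class names already imported in the question. It prints the running Java version and the jar each class would be loaded from, without initializing the classes.

```java
import java.security.CodeSource;

public class ClasspathCheck {

    /** Reports where the named class would be loaded from, or that it is missing. */
    static String locate(String className) {
        try {
            // initialize=false: only locate the class, do not run its static initializers
            Class<?> c = Class.forName(className, false, ClasspathCheck.class.getClassLoader());
            CodeSource src = c.getProtectionDomain().getCodeSource();
            // JDK classes loaded by the bootstrap loader have no code source
            return className + " <- " + (src == null ? "JDK runtime (no code source)" : src.getLocation());
        } catch (ClassNotFoundException e) {
            return className + " <- not on classpath";
        }
    }

    public static void main(String[] args) {
        // The JVM actually running the program, which may differ from the compiler target
        System.out.println("java.version = " + System.getProperty("java.version"));
        // Class names taken from the imports in the question
        System.out.println(locate("org.biojava3.alignment.Alignments"));
        System.out.println(locate("org.biojava3.core.sequence.DNASequence"));
    }
}
```

The jar file names in the output usually include the version number, so this shows both which BioJava release is on the build path and which Java runtime is in use.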


Thanks for this thorough reply. As I didn't quite understand how to import things with Maven (the project is Maven-based), I decided to manually add the jars needed for this app to run (the ones listed on the BioJava site). If I understood you correctly, one doesn't need any external libs, and adding the repository code to the pom.xml file is enough? Would you be so kind as to tell me where I can find a tutorial or something similar to set this up? This is my first time using Maven and I haven't quite got the hang of it yet. Which dependencies did you add? (I thought forester.jar wasn't one of them.)

Thanks for any reply.


For Maven, you just need to follow the instructions in the README (https://github.com/biojava/biojava), i.e. add this code to your pom.xml file:

    <repositories>
      <repository>
        <id>biojava-maven-repo</id>
        <name>BioJava repository</name>
        <url>http://www.biojava.org/download/maven/</url>           
      </repository>
    </repositories>

    <dependencies>
      <dependency>
        <groupId>org.biojava</groupId>
        <artifactId>biojava3-core</artifactId>
        <version>3.1.0</version>
      </dependency>
    </dependencies>

Besides core, you will also need the biojava3-alignment artifact (i.e. jar), declared inside the same <dependencies> element like this:

      <dependency>
        <groupId>org.biojava</groupId>
        <artifactId>biojava3-alignment</artifactId>
        <version>3.1.0</version>
      </dependency>

Maven will then take care of other dependencies of biojava automatically.


I did as you said. It's frustrating, because the core library works perfectly (importing UniProt entries, for example), but this snippet still doesn't work. It still gives me the NullPointerException. Are you positive that biojava3-core and biojava3-alignment are the only things one must import for this to work? How do you have your Maven pom properties and build configured? (Sorry for this question, but it seems like I've tried everything :()


Can you post your pom.xml file?


I uploaded it one answer below (as the second answer)


Hello Jose,

Just wondering, are you going to add biojava to the Maven Central Repository?


Yes, the plan is that release 4.0 will be in maven central. See https://github.com/biojava/biojava/issues/130

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>testproject</groupId>
  <artifactId>testproject</artifactId>
  <version>0.0.1-SNAPSHOT</version>
  <build>
    <sourceDirectory>src</sourceDirectory>
    <plugins>
      <plugin>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>3.1</version>
        <configuration>
          <source>1.8</source>
          <target>1.8</target>
        </configuration>
      </plugin>
    </plugins>
  </build>

  <repositories>
    <repository>
      <id>biojava-maven-repo</id>
      <name>BioJava repository</name>
      <url>http://www.biojava.org/download/maven/</url>
    </repository>
  </repositories>

  <dependencies>
    <dependency>
      <groupId>org.biojava</groupId>
      <artifactId>biojava3-core</artifactId>
      <version>3.1.0</version>
    </dependency>

    <dependency>
      <groupId>org.biojava</groupId>
      <artifactId>biojava3-alignment</artifactId>
      <version>3.1.0</version>
    </dependency>
  </dependencies>

</project>

I've just copied your pom into a new Maven project in Eclipse, pasted in your code from above, and it all worked perfectly well. Which version of Eclipse are you using? Do you have m2e installed?


Actually, I've just seen this on the mailing list: http://mailman.open-bio.org/pipermail/biojava-l/2014-November/011339.html. It looks very much like your error and explains everything: the problem is with Java 8 (I had tried with Java 7).
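
As for the "WHAT IS THAT??" in the stack trace: an ExceptionInInitializerError means that an exception was thrown while a class's static initializer was running (the `<clinit>` frame in the trace), and the "Caused by" part shows the original exception. The following is a generic, self-contained illustration of that mechanism. It is not BioJava's code; the names StaticInitDemo, Broken and lookup are hypothetical.

```java
public class StaticInitDemo {

    // Hypothetical stand-in for a class whose static initialization fails.
    static class Broken {
        // Simulates a dependency that is not (yet) available during initialization.
        private static String lookup() {
            return null;
        }

        // Calling length() on null throws NullPointerException inside <clinit>.
        static final int LENGTH = lookup().length();
    }

    /** Touches Broken and describes what the JVM throws. */
    static String describeFailure() {
        try {
            return "length=" + Broken.LENGTH;
        } catch (ExceptionInInitializerError e) {
            // First use of the class: the JVM wraps the original exception.
            return "ExceptionInInitializerError caused by "
                    + e.getCause().getClass().getSimpleName();
        } catch (NoClassDefFoundError e) {
            // Any later use: the class is permanently marked as failed.
            return "NoClassDefFoundError (class initialization already failed)";
        }
    }

    public static void main(String[] args) {
        System.out.println(describeFailure());
    }
}
```

The takeaway for reading such traces is to look at the "Caused by" section and the `<clinit>` frame: they point at the class whose static setup failed, which in the trace above is org.biojava3.core.sequence.location.template.Location.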


Darn. Thanks for the effort, will try jdk 7.


I've now tried with Java 8 and I can reproduce the problem when using BioJava 3.1. However, using the current BioJava 4.0.0-SNAPSHOT seems to work fine.


Alright, one last thing, correct me if I am wrong. To use the BioJava 4.0.0, all I have to do is change the .pom file in maven?


Nope, 4.0 is not released yet. So you either wait until the release or get the sources from GitHub and build 4.0.0-SNAPSHOT locally. Then you can use 4.0.0-SNAPSHOT as the version in your pom.
