Question

Thesis Help: DNA Sequence using BLAST

9.2 years ago

ismailkiron5 • 0

Hello,

I am doing my undergraduate research on an Android application that fetches a scientific name and its sequence, with a score, using BLAST. We have already implemented BLAST (local alignment) and Needleman-Wunsch (global alignment). But we need to add an improved module on top of the existing ones. Can anyone suggest anything?

sequence blast • 2.8k views

updated 2.2 years ago by Ram 44k • written 9.2 years ago by ismailkiron5 • 0

Not sure if I understand you correctly. Do you need a module that can run remote BLAST? Which programming language are you using?

9.2 years ago by samuelmiver ▴ 440

I guess if you're using Android you could just use BioJava.

9.2 years ago by pld 5.1k

Here is the implementation:

http://biojava.org/wiki/BioJava:CookBook3:NCBIQBlastService

9.2 years ago by samuelmiver ▴ 440

I actually based the implementation on this:

https://code.google.com/p/meta-proteome-analyzer/source/browse/branches/development/src/de/mpa/client/blast/RunMultiBlast.java?spec=svn769&r=769

updated 2.2 years ago by Ram 44k • written 9.2 years ago by ismailkiron5 • 0

I already did the implementation. The problem is that I need to show an application that improves on the existing one. That could be by adding a new feature or a new method.

updated 2.2 years ago by Ram 44k • written 9.2 years ago by ismailkiron5 • 0

One of the critical things that can add many minutes to a BLAST run is the word size. To find matches quickly, BLAST first looks for perfect matches between subsequences ("words") of this length. Therefore, if the word size is very small, the search takes longer. If the word size becomes large, imperfect matches (e.g. with a single nucleotide insertion) may no longer be found.

Depending on whether you need performance or quality, you can tune the word size to find a balance between the two.
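
To make the trade-off concrete, here is a minimal Java sketch of the seeding idea only. It is not BLAST's actual implementation; the class `WordSeedDemo`, its methods, and the two toy sequences are made up for illustration. It counts how many query words have an exact match in the subject for several word sizes, where the query differs from the subject by one inserted nucleotide.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class: illustrates exact-word seeding, not the real BLAST code.
class WordSeedDemo {

    // Collect every subsequence ("word") of length wordSize found in seq.
    static Set<String> words(String seq, int wordSize) {
        Set<String> out = new HashSet<>();
        for (int i = 0; i + wordSize <= seq.length(); i++) {
            out.add(seq.substring(i, i + wordSize));
        }
        return out;
    }

    // Count the query positions whose word occurs exactly somewhere in the subject.
    static int countSeeds(String query, String subject, int wordSize) {
        Set<String> index = words(subject, wordSize);
        int seeds = 0;
        for (int i = 0; i + wordSize <= query.length(); i++) {
            if (index.contains(query.substring(i, i + wordSize))) {
                seeds++;
            }
        }
        return seeds;
    }

    public static void main(String[] args) {
        // Toy sequences (assumed data): the query is the subject plus one inserted T.
        String subject = "ACGTTGCAAGCTTAGGCTAACGTAGC";
        String query   = "ACGTTGCAAGCTTTAGGCTAACGTAGC";
        for (int w : new int[] {4, 8, 11, 16}) {
            System.out.println("word size " + w + ": "
                    + countSeeds(query, subject, w) + " seed(s)");
        }
    }
}
```

With these toy sequences, every word of length 16 in the query spans the insertion, so no seed is left at that size, while shorter words still find exact matches on either side of it. Smaller words also mean more seeds to extend, which is where the extra run time comes from.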

9.2 years ago by samuelmiver ▴ 440

Thank you so much for the reply. What would I need to change to save those minutes?

updated 2.2 years ago by Ram 44k • written 9.2 years ago by ismailkiron5 • 0

I haven't worked with BLAST in Java, but using -W to set the word size is near-universal (that is the legacy blastall flag; the newer BLAST+ programs call it -word_size). The default is 11 for blastn, 28 for megablast and 3 for all others. If you want to improve the run time, I would try increasing those values a little.

This will reduce the run time but could change the results, since real matches can be missed, so I recommend taking some sequences for which you already know the correct result and testing how far you can increase the word size without introducing errors.
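
If the search runs through a locally installed command-line BLAST rather than a remote service, the word size is just one more argument. The sketch below only builds the argument list and runs nothing. It assumes the BLAST+ `blastn` binary (which takes `-word_size`, where legacy blastall takes `-W`) on a machine that has it installed, for example a server behind the app. The class `BlastCommand`, the file name `query.fasta` and the database name `mydb` are placeholders, not part of any library.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: assembles a BLAST+ blastn command line, does not execute it.
class BlastCommand {

    // Build the arguments for a blastn search with an explicit word size.
    static List<String> blastnCommand(String queryFasta, String database, int wordSize) {
        if (wordSize < 4) {
            // blastn rejects word sizes below 4
            throw new IllegalArgumentException("word size must be >= 4 for blastn");
        }
        List<String> cmd = new ArrayList<>();
        cmd.add("blastn");
        cmd.add("-task");
        cmd.add("blastn");      // plain blastn task; the default task is megablast
        cmd.add("-query");
        cmd.add(queryFasta);
        cmd.add("-db");
        cmd.add(database);
        cmd.add("-word_size");
        cmd.add(String.valueOf(wordSize));
        cmd.add("-outfmt");
        cmd.add("6");           // tabular output, easy to parse
        return cmd;
    }

    public static void main(String[] args) {
        // Placeholder file and database names; print the command instead of running it.
        List<String> cmd = blastnCommand("query.fasta", "mydb", 16);
        System.out.println(String.join(" ", cmd));
        // To actually run it where BLAST+ is installed: new ProcessBuilder(cmd).start()
    }
}
```

Keeping the word size as a parameter makes it easy to time the same query at a few different values and compare the hits.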

9.2 years ago by samuelmiver ▴ 440

Is there already a sequence alignment tool for Android?

9.2 years ago by pld 5.1k

No, we are using the Java version on Android as a tool:

https://code.google.com/p/meta-proteome-analyzer/source/browse/branches/development/src/de/mpa/client/blast/RunMultiBlast.java?spec=svn769&r=769

updated 5.0 years ago by Ram 44k • written 9.2 years ago by ismailkiron5 • 0

score 3 · Accepted Answer · 2015-09-15

Just my previous comments put together in an answer:

One of the critical things that can add many minutes to a BLAST run is the word size. To find matches quickly, BLAST first looks for perfect matches between subsequences ("words") of this length. Therefore, if the word size is very small, the search takes longer. If the word size becomes large, imperfect matches (e.g. with a single nucleotide insertion) may no longer be found.

Depending on whether you need performance or quality, you can tune the word size to find a balance between the two.

I haven't worked with BLAST in Java, but using -W to set the word size is near-universal (that is the legacy blastall flag; the newer BLAST+ programs call it -word_size). The default is 11 for blastn, 28 for megablast and 3 for all others. If you want to improve the run time, I would try increasing those values a little.

This will reduce the run time but could change the results, since real matches can be missed, so I recommend taking some sequences for which you already know the correct result and testing how far you can increase the word size without introducing errors.
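
The calibration step in the last paragraph could be sketched roughly as follows. This is a toy stand-in, not a real benchmark: it assumes a hit counts as "found" when the query and its known target share at least one exact word, whereas a real test would run BLAST itself and compare the reported hits. The class `WordSizeCalibration`, its methods, and the sequences are invented for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical calibration helper based on exact-word seeding only.
class WordSizeCalibration {

    // True if query and subject share at least one exact word of the given length.
    static boolean hasSeed(String query, String subject, int wordSize) {
        Set<String> index = new HashSet<>();
        for (int i = 0; i + wordSize <= subject.length(); i++) {
            index.add(subject.substring(i, i + wordSize));
        }
        for (int i = 0; i + wordSize <= query.length(); i++) {
            if (index.contains(query.substring(i, i + wordSize))) {
                return true;
            }
        }
        return false;
    }

    // Largest word size in [minW, maxW] at which every known {query, target}
    // pair still gets a seed; returns -1 if even minW loses a pair.
    static int largestSafeWordSize(List<String[]> knownPairs, int minW, int maxW) {
        int best = -1;
        for (int w = minW; w <= maxW; w++) {
            for (String[] pair : knownPairs) {
                if (!hasSeed(pair[0], pair[1], w)) {
                    // A longer word can never match where a shorter one failed.
                    return best;
                }
            }
            best = w;
        }
        return best;
    }

    public static void main(String[] args) {
        // Assumed test data: pairs whose correct match is already known.
        List<String[]> known = Arrays.asList(
                new String[] {"ACGTTGCAAGCTTTAGGCTAACGTAGC", "ACGTTGCAAGCTTAGGCTAACGTAGC"},
                new String[] {"ACGTACGTAC", "ACGTACGTAC"});
        System.out.println("largest safe word size: "
                + largestSafeWordSize(known, 4, 28));
    }
}
```

The loop stops at the first failure because any shared word of length w+1 contains a shared word of length w, so larger values cannot recover a pair that was already lost.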