Dear all,
I'm going to do a homology search using short sequences (35 bp) as queries against my own database, which consists of full-length sequences of my genes of interest. As far as I know, standard BLAST is not well suited to this case because of its heuristic nature and the small word size I would have to choose. Could someone please suggest an appropriate tool for this? Thanks a lot for sharing any similar experience.
Thanks for your reply. It sounds great for this case, but unfortunately it is only for protein similarity searches. I need something like this tool for nucleotide similarity searches.
I haven't used it myself, so I can't be sure, but I think it can handle nucleotides as well; see here. I would check out the source code too.
Hi, FASTA provides a heuristic search with a nucleotide query, perhaps similar to BLAST, though I haven't worked with it myself.
Quoting: Optimal searches are available with SSEARCH (local), GGSEARCH (global) and GLSEARCH (global query, local database).
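To make the "global query, local database" idea concrete, here is a minimal sketch of that kind of alignment as plain dynamic programming. It is only an illustration: the function name `glocal_score` and the scoring values (match +1, mismatch -1, linear gap -2) are my own assumptions, not what GLSEARCH actually uses, and a real tool would use proper scoring matrices and affine gap penalties.

```python
def glocal_score(query, subject, match=1, mismatch=-1, gap=-2):
    """Best score for aligning ALL of `query` to ANY substring of `subject`.

    Toy scoring scheme (assumed for illustration, not the FASTA suite's).
    """
    m = len(subject)
    # Row 0: an empty query may start anywhere in the subject at no cost,
    # which is what makes the subject side "local".
    prev = [0] * (m + 1)
    for i in range(1, len(query) + 1):
        # Column 0: query[:i] against an empty subject costs i gaps,
        # which is what makes the query side "global".
        curr = [prev[0] + gap] + [0] * m
        for j in range(1, m + 1):
            same = query[i - 1] == subject[j - 1]
            diag = prev[j - 1] + (match if same else mismatch)
            up = prev[j] + gap        # query base opposite a gap
            left = curr[j - 1] + gap  # subject base opposite a gap
            curr[j] = max(diag, up, left)
        prev = curr
    # The query is fully consumed; the alignment may end anywhere in the subject.
    return max(prev)


print(glocal_score("ACGT", "TTACGTTT"))
print(glocal_score("ACGT", "TTACTTT"))
```

Because every cell is filled in, the result is optimal under the chosen scores, unlike a word-seeded heuristic that can miss a hit when a short query has no exact seed match.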
Thanks a lot. I tried it, but it only allows one sequence to be submitted per job. That is impractical for many tasks; each of my files contains 100-200 short sequences. Does anyone have experience running a batch file through GLSEARCH? As dariober mentioned, aligning a global query against a local database is very nice.
Have a look at the standalone version packaged here together with a bunch of other programs.
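With a local install, one way to handle a file of 100-200 short queries is to loop over its records yourself. The sketch below is untested against a real installation and rests on a few assumptions: the executable is called `glsearch36` and is on your PATH, it accepts `program query_file library_file` as arguments, and the helper names `read_fasta` and `run_batch` are made up for this example. Check the documentation shipped with your version for the options you need.

```python
import subprocess
import tempfile
from pathlib import Path


def read_fasta(text):
    """Parse FASTA-formatted text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def run_batch(fasta_path, database, program="glsearch36"):
    """Run `program query database` once per record; return (header, stdout) pairs.

    The program name and argument order are assumptions; adjust for your install.
    """
    results = []
    records = read_fasta(Path(fasta_path).read_text())
    with tempfile.TemporaryDirectory() as tmp:
        for index, (header, sequence) in enumerate(records, start=1):
            # Write each short query to its own temporary single-record file.
            query = Path(tmp) / f"query_{index}.fa"
            query.write_text(f">{header}\n{sequence}\n")
            done = subprocess.run(
                [program, str(query), database],
                capture_output=True, text=True, check=True,
            )
            results.append((header, done.stdout))
    return results
```

Calling something like `run_batch("reads.fa", "my_genes.fa")` (both file names hypothetical) would then return one block of output per query, which you can write out or parse as needed.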
The FASTA suite also includes a set of sequence fragment search methods, which are optimised to handle the issues with short vs. long sequences. For the nucleotide case you will want to look at FASTM. For more info see PubMed:12096132, the FASTM service at EMBL-EBI, the FASTA guide and the documentation included in the FASTA suite distribution.