NCBI Legacy BLAST Usage With tblastn/PSSM
14.3 years ago

I'm trying to get a web service for protein discovery running. I would like to perform a tblastn search with a PSSM from NCBI's archive (.smp file). This works fine with NCBI BLAST+, but unfortunately the framework I have to run it from only supports the old NCBI BLAST (2.2.21).

So I'm searching for an equivalent command to

tblastn -in_pssm matrix.smp -db database -evalue 1e-10 -out outfile -outfmt 6

and what I came up with was

blastall -p psitblastn -d database -R matrix.smp -o outfile -e 1e-10 -m 8

This command, however, has been running for hours without producing any output or error message, and without consuming any CPU time (ps -A | grep blastall yields 0:00:00).

What am I doing wrong?

blast ncbi pssm • 5.2k views
14.3 years ago
Science_Robot ★ 1.1k

The query is specified with -i <queryfile>. The program is hanging idle because no query file was given, so it is waiting for input on STDIN.
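As a minimal sketch of this fix, the command from the question with an explicit query could be assembled as below. The file name query.fasta is a hypothetical placeholder; the other names are taken from the question. Whether -R accepts an .smp file directly is not confirmed in this thread, so treat that part as an assumption.

```shell
#!/bin/sh
# Placeholder names: query.fasta is hypothetical; the rest come from the question.
QUERY=query.fasta
DB=database
PSSM=matrix.smp
OUT=outfile

# Build the legacy command with an explicit query (-i), so blastall
# does not fall back to reading the query from STDIN.
build_cmd() {
    printf 'blastall -p psitblastn -i %s -d %s -R %s -o %s -e 1e-10 -m 8\n' \
        "$QUERY" "$DB" "$PSSM" "$OUT"
}

# Print the command for inspection before running it.
build_cmd
```

When launching the printed command from a framework, redirecting its standard input from /dev/null (for example `sh -c "$(build_cmd)" < /dev/null`) should make a missing query show up as an immediate end-of-input rather than a silent hang.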

EDIT: I'm not sure this is the answer, since I do not know what your web service requires. Are you providing a query through the web service?

EDIT EDIT: Definition of PSSM from NCBI

A PSSM, or Position-Specific Scoring Matrix, is a type of scoring matrix used in protein BLAST searches in which amino acid substitution scores are given separately for each position in a protein multiple sequence alignment. Thus, a Tyr-Trp substitution at position A of an alignment may receive a very different score than the same substitution at position B. This is in contrast to position-independent matrices such as the PAM and BLOSUM matrices, in which the Tyr-Trp substitution receives the same score no matter at what position it occurs.

The PSSM is just a scoring matrix to be used in conjunction with a query.
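The contrast in the NCBI definition can be written out for the simplest case, an ungapped alignment of length L. The symbols here are my own notation, not NCBI's: a is the aligned subject sequence, q the query, M the PSSM with one row per alignment position, and B a position-independent matrix such as BLOSUM62.

```latex
% PSSM: the score for residue a_i depends on the position i in the profile
S_{\mathrm{PSSM}}(a) = \sum_{i=1}^{L} M_{i,\,a_i}

% Position-independent matrix: the score depends only on the residue pair
S_{\mathrm{matrix}}(q, a) = \sum_{i=1}^{L} B(q_i, a_i)
```

In this simplified form the first sum is indexed by position and subject residue only, while the second needs the query residue at every position. Gap penalties are left out of the sketch.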


Isn't the input sequence somewhat irrelevant when I already have a PSSM to search with? However, I'll try supplying the sequence as well and see if it works. Thanks!




To your second edit: as far as I understand, the values in a PSSM at each position are enough to define substitutions. If, e.g., a highly conserved position X holds a Trp, the matrix values will assign a high score to Trp and low scores to all other residues (without needing to know that there was indeed a Trp in a large subset of sequences). Also, the concept of a single input sequence for a profile generated from multiple homologues seems a bit shaky. Then again, I might be wrong ;)


Sorry for the late accept: the program really waits for stdin, but I still think that in theory it should not be necessary.

14.3 years ago

As I recall, the old NCBI blastall binary did not support searching with a pssm. To do that, I believe you need to use the separate blastpgp binary that should also be part of the distribution.


According to the PSI-TBLASTN documentation at http://www.csc.fi/english/research/sciences/bioscience/programs/blast/blastall, it should work with blastall.


Ok, in that case I cannot help.
