Help With Rps-Blast With New Blast
13.5 years ago
Ale ▴ 10

Hi

I want to run an RPS-BLAST search (rpstblastn) with the new BLAST+ 2.2.25, but first I need to format the database. I think with the older BLAST version you use formatrpsdb, which creates additional files including .aux, but I don't know what to use with the new version.

I tried this

makeblastdb -in Pfam-A.fasta -title Pfam -logfile pfam

But it only creates the .psq, .pin and .phr files, and when I run rpstblastn it says that it cannot find the .aux file.

I hope someone can help me!!! Thanks!


I've found that the rpsblast binaries that come with the BLAST+ package core dump when I use -cpu > 1.


"Help with..." is not a question. It would make it easier for others to find your question and the answers if it is phrased as a question.


Why do you prefer it to HMMER? (It's a genuine question.)

12.8 years ago
Hamish ★ 3.3k

As Michael mentions, the NCBI BLAST+ distribution does not currently (as of NCBI BLAST 2.2.25) contain a tool for creating RPS-BLAST databases. According to the part of the NCBI BLAST documentation that covers "Legacy" BLAST usage, 'formatrpsdb' is one of the programs that fall under: "Those programs have no blast+ counterpart at this time." Given that the "Legacy" NCBI BLAST has been deprecated in favour of NCBI BLAST+, my guess is that a replacement tool will appear in the next couple of releases.

In the meantime it looks like the only options for building databases for use with 'rpsblast' are:

  1. Use 'formatrpsdb' from the "Legacy" NCBI BLAST distribution.

  2. Use the old 'makemat'/'copymat'/'formatdb' method, which was replaced by 'formatrpsdb', but is described in the "Legacy" NCBI BLAST 'rpsblast' documentation.

If you are only interested in the databases which NCBI provide in their CD-search service, then there are a couple of additional options:

  1. Download the pre-formatted databases from the NCBI FTP site; see the README file for details of the available databases and the appropriate versions for your environment. As mentioned by Niek, these files are also available from third parties.

  2. Use the CD-search web service to access the NCBI CD-search service remotely. This has the advantage of NCBI doing all the database and software maintenance.
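For the pre-formatted route, a rough sketch of unpacking an archive and checking it before searching might look like the following. It assumes the archive is the 'Pfam_LE.tar.gz' mentioned elsewhere in this thread and has already been downloaded, that it unpacks to files with the prefix 'Pfam', and that 'query.fna', 'cdd_db' and 'hits.txt' are placeholder names. The helper checks for the RPS-specific files (.aux, .loo, .rps) in addition to the .phr/.pin/.psq files that 'makeblastdb' alone produces.

```shell
# Returns 0 if database prefix $1 has the RPS-specific files
# (.aux, .loo, .rps) as well as the usual protein database files.
rps_db_complete() {
    for ext in aux loo rps phr pin psq; do
        [ -f "$1.$ext" ] || { echo "missing: $1.$ext" >&2; return 1; }
    done
}

ARCHIVE="Pfam_LE.tar.gz"   # assumed to be downloaded already
DB_DIR="cdd_db"            # hypothetical target directory
QUERY="query.fna"          # hypothetical nucleotide query file

if [ -f "$ARCHIVE" ] && [ -f "$QUERY" ]; then
    mkdir -p "$DB_DIR"
    tar -xzf "$ARCHIVE" -C "$DB_DIR"
    # Assumption: the archive unpacks to files named Pfam.aux, Pfam.rps, ...
    if rps_db_complete "$DB_DIR/Pfam" && command -v rpstblastn >/dev/null 2>&1; then
        rpstblastn -query "$QUERY" -db "$DB_DIR/Pfam" -evalue 0.01 -out hits.txt
    fi
fi
```

If the check reports a missing .aux file, the database was probably built as a plain protein database rather than an RPS-BLAST one, which matches the error described in the question.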

RPS-BLAST is a method for searching a database of protein signatures (PSI-BLAST derived PSSM profiles in this case) with a sequence, so it may be worth looking at alternative tools that provide sequence searches against protein signature databases. Given that your example used Pfam, a couple of Hidden Markov Model (HMM) based methods come to mind:

  • HMMER, which is used both during generation of Pfam and as its search tool.
  • HH-suite, which features the HHblits search tool for performing sequence searches against an HMM database derived from alignments.
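As an illustration of the HMMER route, here is a minimal sketch of searching protein sequences against Pfam with 'hmmpress' and 'hmmscan'. It assumes HMMER 3 is installed, that 'Pfam-A.hmm' has been downloaded from Pfam, and that 'proteins.faa' is a placeholder protein FASTA file. Unlike rpstblastn, hmmscan does not translate nucleotide queries, so nucleotide sequences would need to be translated first.

```shell
# Returns 0 if the four binary index files written by hmmpress
# already exist next to the HMM file $1.
is_pressed() {
    for ext in h3f h3i h3m h3p; do
        [ -f "$1.$ext" ] || return 1
    done
}

HMM_DB="Pfam-A.hmm"     # assumed to be downloaded from Pfam
QUERY="proteins.faa"    # hypothetical protein query file

if command -v hmmscan >/dev/null 2>&1 && [ -f "$HMM_DB" ] && [ -f "$QUERY" ]; then
    # hmmscan needs the pressed (indexed) form of the HMM database.
    is_pressed "$HMM_DB" || hmmpress "$HMM_DB"
    # --cut_ga applies Pfam's curated gathering thresholds;
    # --domtblout writes a per-domain table that is easy to parse.
    hmmscan --cut_ga --domtblout hits.domtbl "$HMM_DB" "$QUERY" > hmmscan.out
fi
```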

There are many protein signature methods and associated databases of protein signatures. If you want to look into alternative methods and databases, a good place to start is InterPro, which integrates information from a number of protein signature databases. To search InterPro with a sequence, use InterProScan, which is available on the web, via web services (both SOAP and REST) and as a software download.

Edit: 5-MAR-2012

With the release of NCBI BLAST+ 2.2.26 'makeprofiledb' has been introduced to replace 'formatrpsdb' from "Legacy" NCBI BLAST:

$ ./makeprofiledb -help
USAGE
  makeprofiledb [-h] [-help] -in in_pssm_list [-binary]
    [-title database_title] [-threshold word_score_threshold]
    [-out database_name] [-max_file_sz max_file_size_in_bytes]
    [-dbtype output_db_type] [-index create_index_files]
    [-gapopen gap_open_penalty] [-gapextend gap_extend_penalty]
    [-scale pssm_scale_factor] [-matrix matrix_name]
    [-obsr_threshold observations_threshold]
    [-exclude_invalid exclude_invalid] [-logfile File_Name] [-version]

DESCRIPTION
   Application to create databases for rpsblast, cobalt and deltablast,
   version 2.2.26+

REQUIRED ARGUMENTS
 -in <File_In>
   Input file that contains a list of smp files (delimited by space, tab or
   newline)

...

I suspect the NCBI BLAST News and the NCBI BLAST documentation will be updated shortly to reflect the new release.
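Going by the help text above, 'makeprofiledb' takes a file listing PSSM (.smp) files rather than a FASTA file, so the 'Pfam-A.fasta' from the question could not be passed to it directly. A minimal sketch of the workflow, assuming a directory 'smp' that holds one .smp file per profile and using the placeholder names 'Pfam.pn', 'Pfam_rps' and 'query.fna', might be:

```shell
# Write a sorted list of the .smp files found in directory $1
# to the list file $2 (one path per line).
make_pssm_list() {
    find "$1" -name '*.smp' | sort > "$2"
}

SMP_DIR="smp"        # hypothetical directory of PSSM (.smp) files
LIST="Pfam.pn"       # hypothetical name for the list file
DB="Pfam_rps"        # hypothetical output database name

if [ -d "$SMP_DIR" ] && command -v makeprofiledb >/dev/null 2>&1; then
    make_pssm_list "$SMP_DIR" "$LIST"
    # Only flags shown in the usage text above are used here.
    makeprofiledb -in "$LIST" -out "$DB" -title "Pfam"
    # Search the new database with a nucleotide query, if one is present.
    if [ -f query.fna ]; then
        rpstblastn -query query.fna -db "$DB" -out hits.txt
    fi
fi
```

The remaining options in the usage text (for example -threshold and -scale) are left at their defaults in this sketch.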

13.5 years ago

For now (at least this was the case with 2.2.24), there is no C++ version of formatrpsdb available. You need to use the one from the C branch.

13.0 years ago
Niek De Klein ★ 2.6k

You could use Pfam_LE.tar.gz from http://www.biowebdb.org/pub/cdd/little_endian/, although it was made before the last update of Pfam so it might be out of date.
