I tried running BLAST from the command line. The query file is a multi-FASTA file containing 2600 sequences. I ran BLASTX against a protein sequence database (ProDom) of about 2 GB (prodom.phr: 1.00 GB, prodom.pin: 46.6 MB, prodom.psq: 2.01 GB). Results are produced for the first 142 sequences; after that I get an error. The command I ran and the error I get are given below:
F:\blast\bin>blastx -db db/prodom -query in/seqs.txt -out out/seqs-prodom-blastx-20130811-e-1e-003.txt -evalue 1e-003
BLAST Database error: CSeqDBAtlas::MapMmap: While mapping file [F:\blast\bin\db\prodom.psq] with 602599160 bytes allocated, caught exception:
NCBI C++ Exception:
"..\..\..\..\..\src\corelib\ncbifile.cpp", line 4572: Error: ncbi::CMemoryFileSegment::CMemoryFileSegment() - File offset may not be negative
I tried repeating the analysis with those 142 sequences removed, but it still throws the same error. Please let me know where I am going wrong, or whether I should change any settings.
Could you paste sequences 140-145? Have you tried changing the order of sequences?
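In case it helps with pulling out those records, here is a minimal Python sketch that extracts a numbered range of sequences from a multi-FASTA file. The function names (read_fasta, slice_records) are made up for this illustration, and the sketch assumes a plain FASTA layout in which every record starts with a ">" header line; it is not part of BLAST+.

```python
import io
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from the lines of a multi-FASTA file."""
    header = None
    chunks: List[str] = []
    for raw in lines:
        line = raw.strip()  # also drops Windows-style \r\n line endings
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def slice_records(lines: Iterable[str], first: int, last: int) -> List[Tuple[str, str]]:
    """Return the records numbered first..last (1-based, inclusive)."""
    return [
        record
        for number, record in enumerate(read_fasta(lines), start=1)
        if first <= number <= last
    ]


# Small in-memory demo with made-up records (hypothetical data).
demo = io.StringIO(">seq1\nACGT\nAC\n>seq2\nGG\n>seq3\nTT\n")
for header, sequence in slice_records(demo, 2, 3):
    print(f">{header}\n{sequence}")
```

For the real query file, the same call would be made on an open file handle, for example `slice_records(open("in/seqs.txt"), 140, 145)`, and the printed records could be pasted here or saved as a small test query.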
Have you tried re-building the ProDom db?
I have tried reordering the sequences, removing the first 143 sequences (since the error appears after sequence 142), and rebuilding the database, but I still get the same error. I have corresponded with NCBI about this; they asked me for the input file and the sequences from which the database was created. I sent them three days ago and am waiting for a response. Here is the link to all the sequences that I used for the analysis: https://dl.dropboxusercontent.com/u/27959322/BI/seqs.txt
The best thing is to be patient. Three days ago was Friday, with a weekend in between, so that is no time at all. I guess even NCBI has weekends.
Which version of BLAST+ are you using? Please run blastdbcheck on the database.