makeblastdb.exe giving me an unclear error message when creating a database
7.2 years ago
DNAngel ▴ 250

I have a large sequence file that I want to convert into a database so that I can blast other sequences against it. I've done this many times before with smaller files; however, this one is giving me an unclear error message:

Building a new DB, current time: 10/23/2017 14:17:46
New DB name:   ~\blast\db\mydatabase
New DB title:  ~\blast\myseqs.fa
Sequence type: Nucleotide
Keep Linkouts: T
Keep MBits: T
Maximum file size: 1000000000B

volume: ~\blast\db\mydatabase

file: ~\blast\db\mydatabase.nin
file: ~\blast\db\mydatabase.nhr
file: ~\blast\db\mydatabase.nsq

BLAST Database creation error: Need to write conversion for data type [0].

Note: I do not have missing residues (no empty lines), but my sequences do contain gaps, with "-" representing each gap. I thought that might be the problem, but when I take, say, the first 10 sequences (keeping the gaps) from the same file, they convert into a database without trouble. So I thought it might be the size of the file (about 47,400 KB) and split it into 3 smaller files. Only the second of the three converted successfully into a nucleotide database; the other two did not (note: all three were the same size and nothing was different about the sequences).

Here is the very simple command I used and have always used before with no issue:

makeblastdb.exe -in myseqs.fa -dbtype nucl -out mydatabase

I've contacted the NCBI support group for standalone BLAST, but they have not responded, and I could not find any other instances of that error message on Google. I'm stumped.

blast • 2.5k views

You are using a single - to represent gaps of any length, correct?

Each '-' represents 1 gap in the sequence, so one hyphen = one base.


I tried your suggestion for checking for weird characters, but I keep getting another error. Perhaps this is where the issue is? I don't understand the error, though (I am not great with grep/Linux commands).

It says:

Input record exceeds maximum length. Specify larger maximum.

grep: write error: Illegal seek
grep: write error: Invalid or incomplete multibyte or wide character
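
For anyone who hits the same grep/sort errors on Windows, the character check can also be done without either tool. The following is only a minimal Python sketch, not something from the thread: the function name `scan_fasta` and the allowed alphabet (IUPAC nucleotide codes plus "-" for gaps) are assumptions, so adjust them to your data. It counts unexpected characters in sequence lines and reports the longest line length.

```python
from collections import Counter

# Assumed alphabet: IUPAC nucleotide codes plus '-' for gaps.
ALLOWED = set("ACGTURYSWKMBDHVN-")


def scan_fasta(lines):
    """Return (Counter of unexpected characters, length of the longest line)."""
    unexpected = Counter()
    longest = 0
    for line in lines:
        line = line.rstrip("\r\n")
        longest = max(longest, len(line))
        if line.startswith(">"):
            continue  # header lines may contain any text, so skip them
        for ch in line:
            if ch.upper() not in ALLOWED:
                unexpected[ch] += 1
    return unexpected, longest


# Usage on a file (the file name is only an example):
#   with open("myseqs.fa", encoding="ascii", errors="replace") as handle:
#       bad, longest = scan_fasta(handle)
#   print("longest line:", longest)
#   print(bad.most_common())
```

An empty counter would mean every sequence character is in the assumed alphabet. It would not by itself explain the makeblastdb error.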

What is the output of

file input.fa

It should be something like 'ASCII text'.

It says: input.fa: ASCII text, with very long lines

This sounds like an issue related to sort on Windows. Do you have access to a Unix machine? Otherwise, you could try wrapping the long FASTA lines.
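
Wrapping the long lines could look roughly like the Python sketch below. This is an illustration under assumptions: the function name `wrap_fasta`, the width of 60 and the file names in the usage comment are all hypothetical. Each sequence line is re-wrapped on its own and blank lines are dropped, so the residues and gaps themselves are unchanged.

```python
def wrap_fasta(lines, width=60):
    """Yield FASTA lines with each sequence line split into chunks of at most `width`."""
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # drop blank lines
        if line.startswith(">"):
            yield line  # headers are passed through untouched
        else:
            for start in range(0, len(line), width):
                yield line[start:start + width]


# Usage (file names are examples only):
#   with open("myseqs.fa") as src, open("myseqs.wrapped.fa", "w", newline="\n") as dst:
#       for out_line in wrap_fasta(src):
#           dst.write(out_line + "\n")
```

The wrapped file can then be passed to the same makeblastdb command as before. Whether this resolves the original database creation error is not confirmed in the thread.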
