9.8 years ago
dago
I get the following error when running a Perl program:
Use of uninitialized value $Bio::DB::NCBIHelper::HOSTBASE in concatenation (.) or string at /usr/share/perl5/Bio/DB/Query/GenBank.pm line 103.
Use of uninitialized value $Bio::DB::NCBIHelper::HOSTBASE in concatenation (.) or string at /usr/share/perl5/Bio/DB/Query/GenBank.pm line 104.
outDir: Test1/
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: The sequence does not appear to be FASTA format (lacks a descriptor line '>')
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:486
STACK: Bio::SeqIO::fasta::next_seq /usr/share/perl5/Bio/SeqIO/fasta.pm:136
STACK: Guidance::name2codeFastaFrom1 /usr/local/lib/guidance.v1.5/www/Guidance/Guidance.pm:1220
STACK: /usr/local/lib/guidance.v1.5/www/Guidance/guidance.pl:445
However, I am quite sure that all my sequences are in FASTA format. Here is an example:
cat 2312_Ad2_02358.faa -A
>Ad2_02358 Chaperone protein ClpB$
MDFEKYTERARGFIQSAQTYALGQGHQQFTPAHILKVLLDDSEGMSAGLIERAGGRAQDVRLQIETDLAALPKVSGGNGQLYLSPEIARLFEQAEKIAEKAGDSYVTVERLLLALALDKGSQAGKALAQGGVTPSGLNEAINGLRKGRTADSASAENQYDALKKFAQDLTQAARDGKLDPVIGRDEEIRRAIQVLSRRTKNNPVLIGEPGVGKTAIAEGL
What am I missing here?
EDIT
Here is the file with the seqs
Difficult to say what you are missing without seeing the complete file - the file itself, not a copy/paste here.
However, clearly you are missing something :) You may be "quite sure" but the fasta parser is equally sure that at least one sequence is invalid - and in my experience, the parser is generally correct. Convincing yourself that you know better than the error message is a common mistake and it will not lead to solutions.
Agree with you. I added a link to the file containing the seqs, maybe I am missing something there.
If your file comes from a Windows machine, you might use dos2unix on your file to strip any extraneous Windows carriage return characters, which can interfere with parsing on UNIX platforms.
If all your headers have numbers, you can check for a missing '>' in the header with a one-line command or an equivalent command pipe, but without the actual code and an example it's hard to say.
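To make the two checks suggested above concrete, here is a minimal shell sketch. It is not the command originally posted in this thread; the function names are hypothetical, and the digit-based check only makes sense under the stated assumption that every header contains a number (residue letters never do).

```shell
#!/bin/sh
# Hypothetical helpers for sanity-checking a FASTA file; the names are
# illustrative and do not come from the thread or from BioPerl.

# Count lines containing a carriage return (a sign of Windows line endings).
count_cr_lines() {
    cr=$(printf '\r')
    grep -c "$cr" "$1"
}

# Print "line_number: text" for lines that contain a digit but do not start
# with '>'. Assumes every header contains a digit, so such a line is
# probably a header that lost its '>'.
find_headers_missing_gt() {
    awk '/[0-9]/ && !/^>/ { print NR ": " $0 }' "$1"
}

# Succeed only if the first non-empty line of the file starts with '>'.
first_line_is_header() {
    awk 'NF { ok = ($0 ~ /^>/); exit } END { exit ok ? 0 : 1 }' "$1"
}

# Small demo on a throwaway file: the third line is a header without '>'
# and the last line ends with a carriage return.
tmp=$(mktemp)
printf '>seq1 ok\nMDFEK\nseq2 header lost\nMKV\r\n' > "$tmp"
count_cr_lines "$tmp"
find_headers_missing_gt "$tmp"
if first_line_is_header "$tmp"; then echo "first line is a header"; fi
rm -f "$tmp"
```

If all three checks come back clean on the real file, the input itself is probably fine and the problem lies in how the tool reads it.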
@Alex Reynolds thanks, but all my files come from Unix. @mxs The file reported above contains only 6 sequences and I checked them manually. There is always a '>' at the start of each sequence.
Have you tried removing blanks from the header (replacing them with underscores)? Otherwise I see no obvious "mistake".
Thanks! I tried, but same problem. The program I am using creates a folder for the results. If the folder name conflicts with an existing one (e.g. the same outDir name), the program crashes.
Could you maybe explain this dirname conflict a bit please?
Sure. I use guidance.pl and it asks me for an outDir name. If the dir name is the same as an existing one I get the error, if not it runs correctly.
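As a possible workaround for the name clash described above, one could pick an unused directory name before each run. This is only a sketch under that assumption: fresh_outdir is a hypothetical helper, not part of guidance, and it does not explain why an existing folder triggers the error.

```shell
#!/bin/sh
# Hypothetical helper: print a directory name that does not exist yet by
# appending _1, _2, ... to the requested name when it is already taken.
fresh_outdir() {
    base=${1%/}            # drop a trailing slash, e.g. "Test1/" -> "Test1"
    candidate=$base
    n=1
    while [ -e "$candidate" ]; do
        candidate="${base}_${n}"
        n=$((n + 1))
    done
    printf '%s\n' "$candidate"
}

# Demo: choose a name based on the outDir from the question.
out=$(fresh_outdir "Test1/")
echo "would use output directory: $out"
```

The printed name can then be passed to the tool as its output directory, so a rerun never lands on a folder left over from a previous attempt.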
Guidance looks like a really complicated script+package. I'll run a local check on next_seq with your file. If it works, there's something wrong with either how guidance passes parameters or how you're using the tool. In the meantime, could you also update the question with the exact command you're running please? Thank you!
EDIT: I ran a simple Bio::Seq script on it and it works fine. We're probably looking at an error in usage or an untested anomaly in the guidance package.
This is the code that finally worked
However, if I run the following, it processes the first seq and then gives me the error:
There's nothing in $1; $i is the loop variable.
Sorry, there was a typo.
Also,
It works, but if I try to run it again with the outDir Gui already there it gives me the error above.
That's strange, especially since guidance appears to tolerate an existing folder, judging from the brief glance I gave the code. What is the output of:
I agree that guidance is probably the issue. I also tried a simple Bioperl script, no errors with your file.
Someone once told me it's better to use the use warnings; pragma instead of perl -w.
Ref: How to copy all fasta-seqs from fasta-files with the seq-lengths between minlen and maxlen
Some write use warnings FATAL => 'all'; to make the script die on warnings. Seems like a good defensive approach.
The script shouts KMN if it gets a papercut :)
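For anyone curious what the difference looks like in practice, here is a small sketch that runs the same Perl one-liner twice from the shell, once with plain warnings and once with fatal warnings. The snippet and the function names run_plain and run_fatal are made up for illustration; it assumes perl is on the PATH.

```shell
#!/bin/sh
# A tiny Perl snippet that concatenates an undefined variable, which
# triggers a "Use of uninitialized value" warning.
snippet='my $x; my $s = "value: " . $x; print "still running\n";'

# Plain warnings: the warning goes to stderr and the script keeps going.
run_plain() { perl -e "use warnings; $snippet"; }

# Fatal warnings: the same warning becomes an error and the script dies.
run_fatal() { perl -e "use warnings FATAL => 'all'; $snippet"; }

if run_plain; then echo "plain: script finished"; fi
if run_fatal; then echo "fatal: script finished"; else echo "fatal: script died"; fi
```

Note that use warnings is lexically scoped, so it only affects the file or block it appears in, whereas perl -w switches warnings on globally, including inside modules you did not write.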