Dear all,
I am trying to run Genscan on a multi-FASTA file and I am getting this error:
scoring sequence...
Error : calloc failed on CUM_SCORE array
With a single-sequence FASTA file it works fine. I found one link that asks about the same thing. Have you run into this limitation at around 2 MB? How did you solve it?
Genscan does not work well with multi-FASTA input. There is a Perl script by Brian Osborne, run_genscan.pl, which helps. Let me know if you cannot get it from BioPerl/Google cache.
I am not sure what size limit Genscan has for an individual FASTA entry.
for i in {2..x}; do awk -v a="$i" 'BEGIN{RS=">"; tem="tmp"} NR==a{print a"\n"; print ">"$0 >tem; exit}' genome.fas; genscan your.smat tmp >>genscan_sh.out; rm tmp; done
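This is a quick-and-dirty version of my script to run it.
Note: 'x' in the for loop is a placeholder for the number of FASTA entries + 1, because with RS=">" the first awk record is the empty text before the first header.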
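A slightly tidier variant of the one-liner above, as a minimal sketch: it writes each FASTA record to its own file and then runs Genscan once per file, so the entry count does not have to be worked out by hand. The function names, the `seq_0001.fa` naming scheme, and the per-record `.genscan` output files are illustrative assumptions, not part of the original script. The `genscan your.smat file` argument order is taken from the command above.

```shell
# Split a multi-FASTA file into one file per record.
# Output names (seq_0001.fa, seq_0002.fa, ...) are an illustrative choice.
split_fasta() {
    fasta="$1"
    outdir="$2"
    mkdir -p "$outdir"
    awk -v dir="$outdir" '
        # A header line starts a new record: close the previous file, open the next.
        /^>/ { if (out) close(out); n++; out = sprintf("%s/seq_%04d.fa", dir, n) }
        # Every line (header included) goes to the current record file.
        out  { print > out }
    ' "$fasta"
}

# Run genscan once per split record, writing one output file per record.
run_genscan_each() {
    smat="$1"
    outdir="$2"
    for f in "$outdir"/seq_*.fa; do
        [ -e "$f" ] || continue          # no records found: nothing to do
        genscan "$smat" "$f" > "${f%.fa}.genscan"
    done
}
```

Usage would look something like `split_fasta genome.fas split && run_genscan_each your.smat split`. Keeping one output file per record also makes it easier to see which entry triggered an error.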
updated 5.2 years ago by Ram • written 14.0 years ago by Gvj
Did you succeed in converting the Genscan output format to GFF or GTF? I have found a few converters online, but none of them work.
If you have a parser for Fgenesh, please share it as well.
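For the GFF question, here is a rough awk sketch rather than a tested converter. It assumes the usual Genscan exon table, where each feature line starts with a `gene.exon` number followed by a type (Init, Intr, Term, Sngl, Prom, PlyA), the strand, begin, end, and with the score in the last column. It also assumes that minus-strand features are listed with begin greater than end. The function name, the feature type names, and the attribute keys are my own choices, and the output is a simple nine-column GFF-like table with the phase left as `.`.

```shell
# Convert the feature lines of a Genscan report into GFF-like lines.
# $1 = Genscan output file, $2 = sequence name to put in column 1.
genscan_to_gff() {
    awk -v seqid="$2" 'BEGIN { OFS = "\t" }
        # Keep only lines such as " 1.01 Init + 5215 5266 ... 12.64".
        $1 ~ /^[0-9]+\.[0-9]+$/ && $2 ~ /^(Init|Intr|Term|Sngl|Prom|PlyA)$/ {
            split($1, id, ".")
            start = $4 + 0
            stop  = $5 + 0
            # GFF wants start <= end; swap the coordinates if they are reversed.
            if (start > stop) { t = start; start = stop; stop = t }
            type = "exon"
            if ($2 == "Prom") type = "promoter"
            if ($2 == "PlyA") type = "polyA_signal"
            print seqid, "GENSCAN", type, start, stop, $NF, $3, ".", \
                  "gene_id=" id[1] ";genscan_type=" $2
        }' "$1"
}
```

Something like `genscan_to_gff seq_0001.genscan chr1 > seq_0001.gff` would be the intended use. Check a few lines against the original report before trusting the coordinates.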
I also ran into this problem. Genscan indeed cannot handle multi-FASTA input for gene prediction, so a multi-FASTA file has to be split into single-sequence FASTA files. In addition, I tried shortening the input sequence. So far, Genscan works well with the HumanIso.smat parameter file when the length of the input FASTA is 5999940.
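One way to shorten the input, sketched below under some assumptions: cut a single-record FASTA file into consecutive pieces of at most a given length and run Genscan on each piece. The function name, the `_start` suffix in the piece headers, and the 60-column line wrapping are illustrative choices. Note that a gene spanning a cut point will be split, and predicted coordinates are relative to each piece, so the 1-based start offset is kept in the header for mapping back.

```shell
# Cut a single-record FASTA into pieces of at most max_len residues.
# $1 = input FASTA (one record), $2 = max_len, $3 = output file prefix.
chunk_fasta() {
    fasta="$1"
    max_len="$2"
    prefix="$3"
    awk -v max="$max_len" -v prefix="$prefix" '
        # Header line: remember the first word as the sequence ID.
        /^>/ { sub(/^>/, ""); id = $1; next }
        # Sequence line: strip whitespace and append to the full sequence.
             { gsub(/[ \t\r]/, ""); seq = seq $0 }
        END {
            n = 0
            for (start = 1; start <= length(seq); start += max) {
                n++
                out = sprintf("%s_%03d.fa", prefix, n)
                piece = substr(seq, start, max)
                # Record the 1-based offset of this piece in its header.
                printf(">%s_start%d\n", id, start) > out
                for (i = 1; i <= length(piece); i += 60)
                    print substr(piece, i, 60) > out
                close(out)
            }
        }
    ' "$fasta"
}
```

For example, `chunk_fasta big.fa 5999940 piece` would use the length reported above as working with HumanIso.smat. Whether that value is the actual upper bound is not something this sketch establishes.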