Question

Genscan Gene Prediction

1

14.4 years ago

Gvj ▴ 470

Dear all, I am trying to run Genscan on a multi-FASTA file, and I am getting this error:

scoring sequence... Error : calloc failed on CUM_SCORE array

With a single-sequence FASTA file it works fine. I have found one link that also asks about the same thing. Have you run into this 2 MB limitation? How did you solve it?

genome annotation error • 14k views

ADD COMMENT • link updated 11.0 years ago by Biostar 20 • written 14.4 years ago by Gvj ▴ 470

Ram · Answer 1 · 2010-11-22

5

14.4 years ago

Darked89 4.7k

Genscan does not work well with multi-FASTA input. There is a Perl script by Brian Osborne, run_genscan.pl, which helps. Let me know if you cannot get it from BioPerl or the Google cache.

I am not sure what limits Genscan places on the length of an individual FASTA entry.
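If run_genscan.pl is not at hand, the splitting can also be done with plain awk. The following is only a rough sketch, not the BioPerl script: the function names, the `seq_NNNNN.fa` file naming, and the output layout are my own assumptions, and the `genscan <parameter file> <sequence file>` call form is simply the one used elsewhere in this thread.

```shell
# Split a multi-FASTA file into one file per record, in a single awk pass.
# Usage: split_fasta genome.fas outdir
# (function name and seq_NNNNN.fa naming are illustrative assumptions)
split_fasta() {
    local fasta="$1" outdir="$2"
    mkdir -p "$outdir"
    awk -v dir="$outdir" '
        /^>/ {
            if (out) close(out)                       # keep the number of open files low
            out = sprintf("%s/seq_%05d.fa", dir, ++n) # start a new file at each header
        }
        out { print > out }                           # header and sequence lines go to the current file
    ' "$fasta"
}

# Run genscan once per split file and append all predictions to one result file.
# Usage: run_genscan_each your.smat outdir genscan.out
run_genscan_each() {
    local smat="$1" outdir="$2" result="$3" f
    for f in "$outdir"/seq_*.fa; do
        genscan "$smat" "$f" >> "$result"
    done
}
```

Something like `split_fasta genome.fas split && run_genscan_each your.smat split genscan.out` would then process every entry. Reading the input only once matters for genomes with many scaffolds, where rescanning the whole file for each entry becomes slow.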

ADD COMMENT • link 14.4 years ago by Darked89 4.7k

1

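for i in {2..x}; do awk -v a="$i" 'BEGIN{RS=">"; tem="tmp"} NR==a{print a"\n"; print ">"$0 >tem; exit}' genome.fas; genscan your.smat tmp >>genscan_sh.out; rm tmp; done

A dirty version of my script to run it.

Note: 'x' in the for loop is a placeholder for the number of FASTA entries + 1; replace it with the actual number before running. The loop starts at 2 because, with RS=">", awk's first record is the empty text before the first ">".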

ADD REPLY • link updated 5.6 years ago by Ram 45k • written 14.4 years ago by Gvj ▴ 470

0

Did you succeed in converting the Genscan output to GFF and GTF? I have found a few converters on the net, but none of them works. If you have a parser for Fgenesh output, please share it as well.

ADD REPLY • link 14.4 years ago by Gvj ▴ 470

score 1 · Answer 2 · 2013-05-20

I ran into this problem as well. GENSCAN clearly cannot predict genes from multi-FASTA input, so the file should be split into single-sequence FASTA files. In addition, I tried shortening the input sequence. So far, GENSCAN works well with the HumanIso.smat parameter file when the input sequence has a length of 5999940.
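To see in advance which entries might be too long, the length of each record can be listed before running GENSCAN. This is a minimal sketch: the function name is made up, and the default cutoff of 5999940 is only the length that happened to work in my case, not a documented GENSCAN limit, so adjust it as needed.

```shell
# Print "<record id> <TAB> <length> <TAB> <ok|long>" for every record in a FASTA file.
# Usage: fasta_lengths genome.fas [max_len]
# max_len defaults to 5999940, the length reported to work in this thread
# (an observation, not an official limit).
fasta_lengths() {
    local fasta="$1" max_len="${2:-5999940}"
    awk -v max="$max_len" '
        function report() {
            if (seen) printf "%s\t%d\t%s\n", id, len, (len > max ? "long" : "ok")
        }
        /^>/ { report(); seen = 1; id = substr($1, 2); len = 0; next }  # new record: flush the previous one
        { gsub(/[ \t\r]/, ""); len += length($0) }                      # count residues, ignoring whitespace
        END { report() }                                                # flush the last record
    ' "$fasta"
}
```

Entries flagged as "long" are the ones worth cutting into shorter pieces before prediction.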

Hope this information is useful!!