I want to use GeneMark (2.7d) to predict genes in a soup of sequences. However, I don't want to use the general bacterial/archaeal model; I want to create a custom model file instead. This is easy in Glimmer (just select a set of long ORFs to train on, for example), but I cannot find out how to do it with GeneMark. It probably involves the 'probuild' program, but I don't know anything beyond that.
Edit: I downloaded GeneMark from the GeneMarkS Linux64 distribution. After unpacking the archive, there is a program called 'probuild' that is used for building custom models, but the documentation is really poor (or "hidden" somewhere I cannot easily find!). The contents of one of the prebuilt model files look like this:
PHMM 2.5
NAME Aeropyrum_pernix
ORDM 2
ATG_ 0.298
GTG_ 0.279
TTG_ 0.423
CTG_ 0
TAA_ 1
TAG_ 1
TGA_ 1
MINC 40
MAXC 12000
MAXN 12000
NDEC 150
CDEC 300
CDCD 0.0
CD1P 1
CD2P 1
COD1
0.00780 0.00540 0.00895
...Lots of numbers.....
COD2
...Lots of numbers.....
NONC
...Lots of numbers.....
RBSM
0.132 0.167 0.431 0.270
...Some more numbers...
RBSL 34
RBSD
0.016 0.008 0.024 0.032 0.12 0.174 0.128 0.086 0.094 0.08 0.03 0.022 0.012 0.012 0.012 0.006 0.002 0.006 0.01 0.01 0.008 0.012 0.004 0.008 0.008 0.008 0.002 0.012 0.002 0.012 0.004 0.002 0.01 0.01
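For anyone who wants to inspect such a file, here is a minimal Python sketch that reads the layout shown above. It is based only on this excerpt, not on any GeneMark documentation: it assumes that a line of the form `KEY value` is a scalar field and that a line holding only a key (such as `RBSD`) starts a block of numbers. The function name `parse_model` is hypothetical, and the sketch only reads a model file; it does not build one.

```python
# Excerpt of the model file from the post (COD1/COD2/NONC/RBSM omitted).
SAMPLE = """\
PHMM 2.5
NAME Aeropyrum_pernix
ORDM 2
ATG_ 0.298
GTG_ 0.279
TTG_ 0.423
CTG_ 0
RBSL 34
RBSD
0.016 0.008 0.024 0.032 0.12 0.174 0.128 0.086 0.094 0.08 0.03 0.022 0.012 0.012 0.012 0.006 0.002 0.006 0.01 0.01 0.008 0.012 0.004 0.008 0.008 0.008 0.002 0.012 0.002 0.012 0.004 0.002 0.01 0.01
"""


def parse_model(text):
    """Split a model file into scalar fields and numeric blocks.

    Assumption (inferred from the excerpt only): 'KEY value' is a scalar,
    and a bare 'KEY' line opens a block of whitespace-separated numbers.
    """
    scalars, blocks = {}, {}
    current = None  # name of the block currently being filled, if any
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens:
            continue
        head = tokens[0]
        if head[0].isalpha():
            if len(tokens) == 1:
                # Bare key: start a new numeric block.
                current = head
                blocks[current] = []
            else:
                # Key followed by a value: store it as a string.
                scalars[head] = " ".join(tokens[1:])
                current = None
        elif current is not None:
            try:
                values = [float(t) for t in tokens]
            except ValueError:
                continue  # skip placeholders such as "...Lots of numbers....."
            blocks[current].extend(values)
    return scalars, blocks


scalars, blocks = parse_model(SAMPLE)

# Sum of the start-codon weights listed in the header.
start_sum = sum(float(scalars[k]) for k in ("ATG_", "GTG_", "TTG_", "CTG_"))
print("model name:", scalars["NAME"])
print("start codon weights sum to", round(start_sum, 3))
print("RBSL:", scalars["RBSL"], "| values in RBSD:", len(blocks["RBSD"]))
```

In this excerpt the RBSD line holds 34 values and RBSL is 34, so RBSL may simply be the length of the RBSD vector, but that is a guess from this one file.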
Can you post a link to the program, and maybe an example of the file you want to generate?