New to Glimmer: trouble with .icm file
6.8 years ago
oars ▴ 200

I'm brand new to Glimmer and to gene assembly and prediction. I'm having a hiccup with the "build the model" step. I ran the following command:

$ build-icm –r glm_ref.icm < glm_ref.long.seq

At this point I think I'm on track. My understanding is that the model is saved to a binary file named glm_ref.icm and that, given this model, I can then use Glimmer to predict genes in the reference genome.

USAGE:  build-icm [options] output_file < input-file

Read sequences from standard input and output to  output-file
the interpolated context model built from them.
Input also can be piped into the program, e.g.,
  cat abc.in | build-icm xyz.icm
If <output-file> is "-", then output goes to standard output

Options:
 -d <num>
    Set depth of model to <num>
 -F
    Ignore input strings with in-frame stop codons
 -h
    Print this message
 -p <num>
    Set period of model to <num>
 -r
    Use the reverse of input strings to build the model
 -t
    Output model as text (for debugging only)
 -v <num>
    Set verbose level; higher is more diagnostic printouts
 -w <num>
    Set length of model window to <num>
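A possible cause, not confirmed anywhere in this thread: as rendered here, the option in the build-icm command is written with an en dash ("–r") rather than an ASCII hyphen ("-r"). That may just be a formatting artifact of the post, but if the character was pasted into the terminal that way, the program would not see it as the -r option. The sketch below is one way to check a command string for such characters before running it. The function name has_non_ascii is hypothetical, and the en dash is spelled out as its UTF-8 bytes so the snippet does not depend on the editor's encoding.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit 0) if the argument contains any byte
# outside the printable ASCII range (space through tilde).
has_non_ascii() {
    printf '%s' "$1" | LC_ALL=C grep -q '[^ -~]'
}

# The command as rendered in the post; \342\200\223 is the UTF-8 encoding
# of an en dash.
cmd_from_post=$(printf 'build-icm \342\200\223r glm_ref.icm')
# The same command retyped with an ordinary ASCII hyphen.
cmd_retyped='build-icm -r glm_ref.icm'

for cmd in "$cmd_from_post" "$cmd_retyped"; do
    if has_non_ascii "$cmd"; then
        echo "non-ASCII character found: $cmd"
    else
        echo "plain ASCII: $cmd"
    fi
done
```

Retyping the command by hand, instead of pasting it from a PDF or web page, avoids the problem altogether.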

Next, I ran the following command:

$ glimmer3 EC.fasta glm_ref.icm glm_ref

But then I keep receiving the following error:

Starting at Tue Feb  6 19:09:00 2018

ERROR:  Could not open file  glm_ref.icm
  errno = 2

I've looked in my working directory and do not see a glm_ref.icm file. Should I? I've also looked in my Glimmer folder, and although I see a folder named ICM, there is no file named glm_ref.icm.
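For context, errno = 2 is the "No such file or directory" code (ENOENT) on Linux and macOS, so glimmer3 is reporting that glm_ref.icm does not exist at the path it was given. A minimal sketch of a guard that checks for the model file before calling glimmer3 follows. The helper name check_model is hypothetical, and the file names in the commented usage line are the ones from this post.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given file exists and is non-empty.
check_model() {
    if [ -s "$1" ]; then
        echo "ok: $1"
    else
        echo "missing or empty: $1" >&2
        return 1
    fi
}

# Usage with the file names from the post (uncomment to run):
# check_model glm_ref.icm && glimmer3 EC.fasta glm_ref.icm glm_ref
```

If the check fails, the problem lies in the build-icm step rather than in glimmer3, so that is the command to re-run and inspect.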

glimmer gene prediction • 2.5k views
I'm not sure what went wrong, but I repeated my steps on my Mac (I was using a Linux machine before) and it worked. If anyone else who is new to Glimmer has a similar issue: the build-icm command should produce the glm_ref.icm file in your working directory, and the $ glimmer3 EC.fasta glm_ref.icm glm_ref command then reads that file. If successful, you'll get output similar to the following:

Sequence file = EC.fasta
Number of sequences = 1
ICM model file = glm_ref.icm
Excluded regions file = none
List of orfs file = none
Input is NOT separate orfs
Independent (non-coding) scores are used
Circular genome = true
Truncated orfs = false
Minimum gene length = 100 bp
Maximum overlap bases = 30
Threshold score = 30
Use first start codon = false
Start codons = atg,gtg,ttg
Start probs = 0.600,0.300,0.100
Stop codons = taa,tag,tga
GC percentage = 50.8%
Ignore score on orfs longer than 750
Analyzing Sequence #1
Start Find_Orfs
Start Score_Orfs
Start Process_Events
Start Trace_Back
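After a successful run, glimmer3 uses its third argument (glm_ref here) as a tag and writes its results to glm_ref.detail and glm_ref.predict. As a quick sanity check, the sketch below counts the predictions in a .predict file. It assumes the usual Glimmer3 layout, in which each sequence starts with a ">" header line followed by one line per predicted gene; count_predictions is a hypothetical helper name, and blank lines, if any, would be counted too.

```shell
#!/bin/sh
# Hypothetical helper: count the lines in a .predict file that are not
# ">" header lines (assumed to be one line per predicted gene).
count_predictions() {
    # -v inverts the match, -c prints the number of selected lines.
    grep -vc '^>' "$1"
}

# Usage with the tag from the post (uncomment to run):
# count_predictions glm_ref.predict
```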