8.1 years ago
int11ap1
I am annotating genes on an assembly and, for one specific locus, I have the following evidence:
Pp05 blastn expressed_sequence_match 11506941 11508983 2043 + . ID=Pp05:hit:646548:3.2.0.2;Name=asmbl_43336
Pp05 est2genome expressed_sequence_match 11506941 11508983 10215 + . ID=Pp05:hit:656896:3.2.0.2;Name=asmbl_43336
Pp05 cdna2genome expressed_sequence_match 11507201 11509028 9136 + . ID=Pp05:hit:666951:3.6.0.2;Name=asmbl_12674
Pp05 tblastx translated_nucleotide_match 11507201 11509028 3245 - . ID=Pp05:hit:665604:3.6.0.2;Name=asmbl_12674
Pp05 snap match 11507276 11508937 95.824 + . ID=Pp05:hit:683530:4.5.0.2;Name=snap-Pp05-abinit-gene-2.249-mRNA-1
Pp05 protein2genome protein_match 11507330 11508925 1009 + . ID=Pp05:hit:676148:3.10.0.2;Name=sp|Q9FYG4|GLOX1_ARATH
Pp05 blastx protein_match 11507333 11508925 1129 + . ID=Pp05:hit:669247:3.10.0.2;Name=sp|Q9FYG4|GLOX1_ARATH
Pp05 blastx protein_match 11507363 11508925 1009 + . ID=Pp05:hit:669248:3.10.0.2;Name=sp|Q3HRQ2|GLOX_VITPS
Pp05 protein2genome protein_match 11507363 11508772 821 + . ID=Pp05:hit:676149:3.10.0.2;Name=sp|Q3HRQ2|GLOX_VITPS
Why is this not annotated as a gene by MAKER?
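To make the evidence easier to compare, here is a minimal Python sketch that groups rows like the ones above by their source column and lists the coordinates and strand of each hit. The function name `summarize_evidence` is hypothetical, and the sketch assumes whitespace-separated columns as pasted in this post (a real GFF3 file is tab-separated).

```python
from collections import defaultdict


def summarize_evidence(gff_lines):
    """Group GFF3-style evidence rows by source; keep (start, end, strand)."""
    summary = defaultdict(list)
    for line in gff_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # Nine columns: seqid, source, type, start, end, score, strand, phase, attributes
        fields = line.split(None, 8)
        if len(fields) < 9:
            continue
        _seqid, source, _type, start, end, _score, strand, _phase, _attrs = fields
        summary[source].append((int(start), int(end), strand))
    return dict(summary)


# Three of the rows from the post, copied as pasted.
rows = [
    "Pp05 est2genome expressed_sequence_match 11506941 11508983 10215 + . ID=Pp05:hit:656896:3.2.0.2;Name=asmbl_43336",
    "Pp05 tblastx translated_nucleotide_match 11507201 11509028 3245 - . ID=Pp05:hit:665604:3.6.0.2;Name=asmbl_12674",
    "Pp05 snap match 11507276 11508937 95.824 + . ID=Pp05:hit:683530:4.5.0.2;Name=snap-Pp05-abinit-gene-2.249-mRNA-1",
]

evidence = summarize_evidence(rows)
for source, hits in evidence.items():
    print(source, hits)
```

Listing the hits this way shows at a glance that the SNAP prediction lies within the est2genome alignment on the same strand, while the tblastx hit is on the opposite strand.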
EDITED:
In maker_opts.ctl I have:
#-----Genome (these are always required)
genome=/Synology/final_assembly.fasta #genome sequence (fasta file or fasta embeded in GFF3 file)
organism_type=eukaryotic #eukaryotic or prokaryotic. Default is eukaryotic
#-----Re-annotation Using MAKER Derived GFF3
maker_gff= #MAKER derived GFF3 file
est_pass=0 #use ESTs in maker_gff: 1 = yes, 0 = no
altest_pass=0 #use alternate organism ESTs in maker_gff: 1 = yes, 0 = no
protein_pass=0 #use protein alignments in maker_gff: 1 = yes, 0 = no
rm_pass=0 #use repeats in maker_gff: 1 = yes, 0 = no
model_pass=0 #use gene models in maker_gff: 1 = yes, 0 = no
pred_pass=0 #use ab-initio predictions in maker_gff: 1 = yes, 0 = no
other_pass=0 #passthrough anyything else in maker_gff: 1 = yes, 0 = no
#-----EST Evidence (for best results provide a file for at least one)
est=/Synology/104_Lt_2.assemblies.fasta #set of ESTs or assembled mRNA-seq in fasta format
altest=/Synology/104_ppersica_transcripts.assemblies.fasta #EST/cDNA sequence file in fasta format from an alternate organism
est_gff=/Synology/104_Lt_2.pasa_assemblies.renamed.gff3 #aligned ESTs or mRNA-seq from an external GFF3 file
altest_gff=/Synology/104_ppersica_transcripts.pasa_assemblies.renamed.gff3 #aligned ESTs from a closly relate species in GFF3 format
#-----Protein Homology Evidence (for best results provide a file for at least one)
protein=/DATA/SwissProt_2016_09/uniprot_sprot_plants.fasta #protein sequence file in fasta format (i.e. from mutiple organisms)
protein_gff= #aligned protein homology evidence from an external GFF3 file
#-----Repeat Masking (leave values blank to skip repeat masking)
model_org= #select a model organism for RepBase masking in RepeatMasker
rmlib= #provide an organism specific repeat library in fasta format for RepeatMasker
repeat_protein= #provide a fasta file of transposable element proteins for RepeatRunner
rm_gff= #pre-identified repeat elements from an external GFF3 file
prok_rm=0 #forces MAKER to repeatmask prokaryotes (no reason to change this), 1 = yes, 0 = no
softmask=1 #use soft-masking rather than hard-masking in BLAST (i.e. seg and dust filtering)
#-----Gene Prediction
snaphmm=/Synology/snap_3/p_dulcis.hmm #SNAP HMM file
gmhmm= #GeneMark HMM file
augustus_species= #Augustus gene prediction species model
fgenesh_par_file= #FGENESH parameter file
pred_gff= #ab-initio predictions from an external GFF3 file
model_gff= #annotated gene models from an external GFF3 file (annotation pass-through)
run_evm=0 #run EvidenceModeler, 1 = yes, 0 = no
est2genome=0 #infer gene predictions directly from ESTs, 1 = yes, 0 = no
protein2genome=0 #infer predictions from protein homology, 1 = yes, 0 = no
trna=0 #find tRNAs with tRNAscan, 1 = yes, 0 = no
snoscan_rrna= #rRNA file to have Snoscan find snoRNAs
snoscan_meth= #-O-methylation site fileto have Snoscan find snoRNAs
unmask=0 #also run ab-initio prediction programs on unmasked sequence, 1 = yes, 0 = no
#-----Other Annotation Feature Types (features MAKER doesn't recognize)
other_gff= #extra features to pass-through to final MAKER generated GFF3 file
#-----External Application Behavior Options
alt_peptide=C #amino acid used to replace non-standard amino acids in BLAST databases
cpus=40 #max number of cpus to use in BLAST and RepeatMasker (not for MPI, leave 1 when using MPI)
#-----MAKER Behavior Options
max_dna_len=5000000 #length for dividing up contigs into chunks (increases/decreases memory usage)
min_contig=1000 #skip genome contigs below this length (under 10kb are often useless)
pred_flank=200 #flank for extending evidence clusters sent to gene predictors
pred_stats=0 #report AED and QI statistics for all predictions as well as models
AED_threshold=1 #Maximum Annotation Edit Distance allowed (bound by 0 and 1)
min_protein=0 #require at least this many amino acids in predicted proteins
alt_splice=0 #Take extra steps to try and find alternative splicing, 1 = yes, 0 = no
always_complete=0 #extra steps to force start and stop codons, 1 = yes, 0 = no
map_forward=0 #map names and attributes forward from old GFF3 genes, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)
split_hit=10000 #length for the splitting of hits (expected max intron size for evidence alignments)
min_intron=20 #minimum intron length (used for alignment polishing)
single_exon=0 #consider single exon EST evidence when generating annotations, 1 = yes, 0 = no
single_length=250 #min length required for single exon ESTs if 'single_exon is enabled'
correct_est_fusion=0 #limits use of ESTs in annotation to avoid fusion genes
tries=2 #number of times to try a contig if there is a failure for some reason
clean_try=0 #remove all data from previous run before retrying, 1 = yes, 0 = no
clean_up=0 #removes theVoid directory with individual analysis files, 1 = yes, 0 = no
TMP= #specify a directory other than the system default temporary directory for temporary files
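For anyone reading a long control file like this one, the following Python sketch pulls out the handful of options that, going by the comments in the file itself, decide where gene models can come from. The helper `parse_ctl` and the choice of keys are my own assumptions for illustration, not part of MAKER, and the excerpt reproduces only a few lines from above.

```python
def parse_ctl(text):
    """Parse key=value lines of a MAKER-style control file, dropping # comments."""
    opts = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # strip trailing comment
        if "=" not in line:
            continue  # section headers and blank lines
        key, value = line.split("=", 1)
        opts[key.strip()] = value.strip()
    return opts


# Excerpt of the control file shown above.
ctl_text = """\
#-----Gene Prediction
snaphmm=/Synology/snap_3/p_dulcis.hmm #SNAP HMM file
augustus_species= #Augustus gene prediction species model
est2genome=0 #infer gene predictions directly from ESTs, 1 = yes, 0 = no
protein2genome=0 #infer predictions from protein homology, 1 = yes, 0 = no
keep_preds=0 #Concordance threshold to add unsupported gene prediction (bound by 0 and 1)
"""

opts = parse_ctl(ctl_text)
# Keys picked by hand for this illustration (an assumption, not a MAKER list).
for key in ("snaphmm", "augustus_species", "est2genome", "protein2genome", "keep_preds"):
    print(f"{key} = {opts.get(key, '')!r}")
```

With est2genome=0 and protein2genome=0, the alignments are not turned directly into gene models, so in this configuration SNAP is the only predictor that can supply one at this locus.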
Could you show us the maker_opts.ctl file, please?
I have edited my post with the content of maker_opts.ctl.
Do you have any MAKER results in your output? Did the MAKER run finish correctly? You could load the different tracks (est2genome, protein2genome, snap) in a genome browser. That often helps to understand what is going on.
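One way to check whether the run finished for the contig in question is to look at the master datastore index log in the MAKER output directory. The sketch below assumes that log is tab-separated with three columns (contig name, datastore path, status) and that a completed contig ends with the status FINISHED; the example lines and paths are made up for illustration and `last_status` is a hypothetical helper.

```python
def last_status(log_lines):
    """Return the most recent status recorded for each contig."""
    status = {}
    for line in log_lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) < 3:
            continue  # skip malformed or empty lines
        contig, _path, state = parts[0], parts[1], parts[2]
        status[contig] = state  # later lines overwrite earlier ones
    return status


# Hypothetical log content; real paths and contig names will differ.
example = [
    "Pp05\tdatastore/AA/BB/Pp05/\tSTARTED",
    "Pp05\tdatastore/AA/BB/Pp05/\tFINISHED",
    "Pp06\tdatastore/CC/DD/Pp06/\tSTARTED",
]

statuses = last_status(example)
unfinished = sorted(c for c, s in statuses.items() if s != "FINISHED")
print(unfinished)  # prints ['Pp06']
```

To use it on a real run, read the lines from the index log file instead of the `example` list and check that Pp05 is not in the unfinished list.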
It looks like the gene model synthesis didn't work properly. You should relaunch MAKER with the option -t 10, which raises the number of tries per contig above the tries=2 set in your control file, to be sure that everything went well. (Keep everything in your folder as it is when you re-run MAKER.)
I am facing a similar problem. How did you solve yours? I would be grateful if you could share the solution. Thanks.