Maker Gff3 file issues
6.0 years ago
alslonik ▴ 320

Hi community,

This is really a technical question, I hope it is OK to post it here...

I am trying to import the GFF3 file from MAKER into JBrowse to view the annotations. I am using the maker2jbrowse script and keep getting errors. There is no indication that MAKER produced a problematic file; the logs contain no errors. Still, I am getting this output:

GFF3 parse error: some features reference other features that do not exist in the file (or in the same '###' scope).

Head of my gff3 file:

##gff-version 3
Chr6    .   contig  1   41368575    .   .   .   ID=Chr6;Name=Chr6
Chr6    maker   gene    9418414 9419484 .   -   .   ID=maker-Chr6-exonerate_protein2genome-gene-94.9;Name=maker-Chr6-exonerate_protein2genome-gene-94.9
Chr6    maker   mRNA    9418414 9419484 594 -   .   ID=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1;Parent=maker-Chr6-exonerate_protein2genome-gene-94.9;Name=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1;_AED=0.31;_eAED=0.43;_QI=0|0|0|1|0|0|2|0|197
Chr6    maker   exon    9418414 9418727 .   -   .   ID=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1:exon:382;Parent=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1
Chr6    maker   exon    9419205 9419484 .   -   .   ID=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1:exon:381;Parent=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1
Chr6    maker   CDS 9419205 9419484 .   -   0   ID=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1:cds;Parent=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1
Chr6    maker   CDS 9418414 9418727 .   -   2   ID=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1:cds;Parent=maker-Chr6-exonerate_protein2genome-gene-94.9-mRNA-1
Chr6    maker   gene    9469345 9471102 .   -   .   ID=maker-Chr6-exonerate_protein2genome-gene-94.15;Name=maker-Chr6-exonerate_protein2genome-gene-94.15
Chr6    maker   mRNA    9469345 9471102 588 -   .   ID=maker-Chr6-exonerate_protein2genome-gene-94.15-mRNA-1;Parent=maker-Chr6-exonerate_protein2genome-gene-94.15;Name=maker-Chr6-exonerate_protein

The file is 2.1 GB. How do I check the validity of the file and, more importantly, how do I fix it if it is not valid?

THANKS

genome browser genome annotation JBrowse • 6.3k views

Hello alslonik ,

I don't know MAKER and have not worked with JBrowse, but the error message is quite clear to me. In a GFF3 file, the value given in Parent= must match the ID= value of another entry in the file. This is not always the case in your file.
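
As a rough way to locate the broken links described above, a minimal Python sketch could collect every ID= and every Parent= value and report the parents that never appear as an ID. The names `parse_attributes` and `find_orphan_parents` are illustrative, not part of MAKER or JBrowse. The sketch checks the whole file at once and ignores the '###' scoping mentioned in the error message, so it may miss scope-related problems.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Split a GFF3 attribute column into a dict of key -> list of values."""
    attributes = {}
    for pair in column9.strip().split(";"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            attributes[key.strip()] = value.split(",")
    return attributes


def find_orphan_parents(lines):
    """Return {missing_parent_id: [line numbers that reference it]}."""
    seen_ids = set()
    references = defaultdict(list)
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more features
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) != 9:
            continue  # malformed line; not handled in this sketch
        attributes = parse_attributes(columns[8])
        seen_ids.update(attributes.get("ID", []))
        for parent in attributes.get("Parent", []):
            references[parent].append(number)
    # Keep only parents that were never declared as an ID anywhere in the file.
    return {p: nums for p, nums in references.items() if p not in seen_ids}
```

Passing an open file handle (for example `find_orphan_parents(open("my_annotations.gff"))`, where the file name is a placeholder) streams the file line by line, so only the ID and Parent strings are kept in memory.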

fin swimmer


Thanks, finswimmer, I understand what you mean. The question is how do I deal with this? Are there any ways to fix this in a gff3 file? Also, maybe it is a matter of sorting the file correctly? I have never worked with gff3 before, hence the questions...


Is your gff3 file the output of gff3_merge without filtering? It appears that you have filtered to keep only maker as the source (i.e., the 2nd column). Perhaps you need to redo gff3_merge without filtering its output before loading it into JBrowse, as some features of the gff3 file seem to be missing.
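
To see which sources the file actually contains, a short Python sketch could tally columns 2 and 3. `count_sources` is a hypothetical helper name, and the sketch assumes tab-separated columns. An unfiltered merge would normally be expected to list evidence sources in addition to maker, so a tally showing only maker would support the suspicion above.

```python
from collections import Counter


def count_sources(lines):
    """Count feature lines per (source, type) pair, i.e. columns 2 and 3."""
    counts = Counter()
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more features
        if not line or line.startswith("#"):
            continue
        columns = line.split("\t")
        if len(columns) >= 3:
            counts[(columns[1], columns[2])] += 1
    return counts
```

Printing `count_sources(handle).most_common()` for an open file handle gives a quick overview without loading the whole file.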


Not sure that I understand... Yes, I did:

gff3_merge -d logfile

I did not do any filtering while merging.


Hi all, I would really like to know how to deal with the error "GFF3 parse error: some features reference other features that do not exist in the file (or in the same '###' scope)". I am also stuck on the same problem. Is there any solution? Thanks.


Hi, as people answered me below, you have to find the problem in the file, either using the script kindly provided below or manually. I ended up opening the file with R tools and correcting it semi-manually.

6.0 years ago
Juke34 8.9k

I often got this kind of problem when running MAKER on our cluster (LSF + OpenMPI). I never managed to find where the problem comes from. I end up with some parent features missing or with duplicated features. I have developed a library that standardises any kind of GTF/GFF, fixes these kinds of problems and produces a complete GFF3 output.
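
For a quick look at the duplicated features mentioned above, before running a full standardisation, a small Python sketch could flag feature lines that are repeated verbatim. `find_duplicate_lines` is an illustrative name and not part of the library described here. Only exact duplicate lines are flagged, because a repeated ID on different lines can be legitimate in GFF3 (for example the two CDS lines sharing one ID in the question's sample).

```python
import hashlib


def find_duplicate_lines(lines):
    """Return [(first_line_number, duplicate_line_number)] for identical feature lines."""
    first_seen = {}
    duplicates = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more features
        if not line or line.startswith("#"):
            continue
        # Store a 16-byte digest instead of the full line to limit memory use.
        digest = hashlib.blake2b(line.encode("utf-8"), digest_size=16).digest()
        if digest in first_seen:
            duplicates.append((first_seen[digest], number))
        else:
            first_seen[digest] = number
    return duplicates
```

Storing digests rather than whole lines is a design choice for large files such as the 2.1 GB one in the question; for small files a plain set of lines would do the same job.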

Clone this repository and install it: https://github.com/NBISweden/GAAS

Then just run: `gxf_to_gff3.pl --gff input.gff -o output.gff`

You can also add the option `-v 1` to see which problems are corrected.

You can also access it within the AGAT toolkit:

agat_convert_sp_gxf2gxf.pl --gff input.gff -o output.gff

Have a look at the check steps in the output to see which problems have been corrected.


WOW. Thanks, Juke34, I am going to try it. At least I am not the only one!!! And I ran MAKER with Open MPI too... Thank you very much!


Hi, an update. Your script throws me a list of awful errors and no output:

error: 
------------- EXCEPTION: Bio::Root::Exception -------------
MSG: OBO File Format Error - 
Cannot find tag format-version and/ default-namespace . These are required header.

STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:449
STACK: Bio::OntologyIO::obo::_header /usr/share/perl5/Bio/OntologyIO/obo.pm:517
STACK: Bio::OntologyIO::obo::parse /usr/share/perl5/Bio/OntologyIO/obo.pm:208
STACK: BILS::Handler::GXFhandler::try {...}  /home/alex/bin/GAAS/annotation/BILS/Handler/GXFhandler.pm:2892
STACK: Try::Tiny::try /usr/share/perl5/Try/Tiny.pm:92
STACK: BILS::Handler::GXFhandler::_handle_ontology /home/alex/bin/GAAS/annotation/BILS/Handler/GXFhandler.pm:2897
STACK: BILS::Handler::GXFhandler::slurp_gff3_file_JD /home/alex/bin/GAAS/annotation/BILS/Handler/GXFhandler.pm:102
STACK: gxf_to_gff3.pl:76
-----------------------------------------------------------

Let's continue without feature-ontology information.
No data retrieved among the feature-ontology.
=>GFF version parser used: 3

------------- EXCEPTION: Bio::Root::Exception -------------
MSG: [  -   .   ID=Chr7:hsp:23258:3.10.8.137;Parent=Chr7:hit:14758:3.10.8.137;Target=gb|OWM91437.1| 350 575;Gap=M85 I4 M44 I4 M62 I4 M5 D7 M18] does not look like GFF3 to me
STACK: Error::throw
STACK: Bio::Root::Root::throw /usr/share/perl5/Bio/Root/Root.pm:449
STACK: Bio::Tools::GFF::_from_gff3_string /usr/share/perl5/Bio/Tools/GFF.pm:616
STACK: Bio::Tools::GFF::from_gff_string /usr/share/perl5/Bio/Tools/GFF.pm:437
STACK: Bio::Tools::GFF::next_feature /usr/share/perl5/Bio/Tools/GFF.pm:395
STACK: BILS::Handler::GXFhandler::slurp_gff3_file_JD /home/alex/bin/GAAS/annotation/BILS/Handler/GXFhandler.pm:201
STACK: gxf_to_gff3.pl:76

Does this mean that the file is corrupted? What do you think it means about my Maker run? Thanks again...


A line in your file does not have 9 columns (i.e. "- . ID=Chr7:hsp:23258:3.10.8.137;Parent=Chr7:hit:14758:3.10.8.137;Target=gb|OWM91437.1| 350 575;Gap=M85 I4 M44 I4 M62 I4 M5 D7 M18"). You have to fix this line manually. The beginning of the line is probably at the end of the previous line. I have seen this before: it is rare, but occasionally one line is split and written over 2 lines.
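
To find such lines in a large file without scrolling through it, a minimal Python sketch could report every feature line that does not have 9 tab-separated columns, and point out consecutive pairs that would form a valid 9-column line when glued back together. The function names are illustrative, and the sketch assumes the split happened exactly at a line break with nothing lost; the suggested joins should be checked by eye before editing the file.

```python
def find_malformed_lines(lines):
    """Yield (line_number, column_count, text) for feature lines lacking 9 columns."""
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if line.startswith("##FASTA"):
            break  # sequence data follows; no more features
        if not line or line.startswith("#"):
            continue
        count = len(line.split("\t"))
        if count != 9:
            yield number, count, line


def find_split_pairs(lines):
    """Return [(line_a, line_b, joined_text)] for adjacent fragments that rejoin to 9 columns."""
    bad = list(find_malformed_lines(lines))
    pairs = []
    for (n1, _, text1), (n2, _, text2) in zip(bad, bad[1:]):
        if n2 == n1 + 1 and len((text1 + text2).split("\t")) == 9:
            pairs.append((n1, n2, text1 + text2))
    return pairs
```

Running `find_split_pairs` on an open file handle lists candidate repairs; any malformed line that is not part of a pair needs a closer manual look.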


agat_convert_sp_gxf2gxf.pl is no longer available in the list of AGAT tools. I generated a GFF3 file with MAKER, and my GFF files lack the ##gff-version 3 line at the top. I have tried to add it manually, but gt stat from GenomeTools keeps failing. Is there a way to use one of the AGAT tools to add this information to my MAKER GFF file?


This script still exists: https://github.com/NBISweden/AGAT/tree/master/bin It was only missing from really old versions of AGAT. You should update your AGAT version.
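
If only the missing `##gff-version 3` line needs adding, a tiny Python sketch along these lines could do it while streaming the file. `ensure_gff3_header` is a hypothetical helper, not an AGAT function. Since adding the header by hand reportedly did not satisfy gt stat, the file may well have other problems that this does not address.

```python
def ensure_gff3_header(lines):
    """Yield the lines, inserting '##gff-version 3' first if it is not already line one."""
    iterator = iter(lines)
    first = next(iterator, None)
    if first is None or not first.startswith("##gff-version"):
        yield "##gff-version 3\n"
    if first is not None:
        yield first
    yield from iterator
```

The result can be written out with `writelines`, for example `out.writelines(ensure_gff3_header(handle))` with both files opened beforehand.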
