My ultimate goal is to use the genewise option in PoPoolation to evaluate allele frequency variation at mitochondrial SNPs between two divergent populations. To do that, I first need to convert a GFF file to a GTF file.
The original GFF3 file comes from MITOS; sometimes I import it into Geneious and export it again. I have spent days trying different ways of annotating the GFF3 file so that PoPoolation will read it, with lines for parents included or omitted, and I have tried several GFF converters. Nothing seems to work.
I have been using the GenomeTools website to validate the file, and it does find some errors. It looks like the command-line version of GenomeTools would work a lot better, but it's hard for me to figure out how to install it properly on macOS. I used the Biostars installation instructions to get conda installed a few weeks ago and have been able to install other programs that way, but I just get errors when I try to install GenomeTools.
All I really need is a GFF that converts to a GTF that PoPoolation can read, so if there is another route to that goal without GenomeTools, I would be happy to take it. Here is the output of the conda command:
$ conda install genometools
Collecting package metadata (current_repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): done
Solving environment: failed with initial frozen solve. Retrying with flexible solve.
Solving environment: \
Found conflicts! Looking for incompatible packages.
This can take several minutes. Press CTRL-C to abort.
failed
UnsatisfiableError: The following specifications were found
to be incompatible with the existing python installation in your environment:
Specifications:
- genometools -> python=2.7
Your python: python=3.7
If python is on the left-most side of the chain, that's the version you've asked for.
When python appears to the right, that indicates that the thing on the left is somehow
not available for the python version you are constrained to. Note that conda will not
change your python version to a different minor version unless you explicitly specify
that.
(base)
In the end, I worked with AGAT, gffread, and Kent's tools to get the gtf. I had to simplify the gff to get one of them to work properly, and once I had a version that would run, I was able to add complexity back to the gff and then convert to gtf.
I downloaded a GFF file from a close relative of my study organism, uploaded by a friend to GenBank.
I manually edited the MITOS output GFF in BBEdit to match the format of the GenBank GFF. Then I deleted the duplicate records for the RNA molecules, while keeping the 'gene' and 'CDS' records for the protein-coding sequences.
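The manual clean-up described above could also be scripted. The following is only a minimal sketch, not the poster's actual procedure: it assumes that "duplicate" means a later record with the same seqid, start, end, and strand as an earlier one, and that 'gene' and 'CDS' records are never dropped. The function name `drop_duplicate_spans`, the `always_keep` parameter, and the example records are all hypothetical.

```python
def drop_duplicate_spans(gff_lines, always_keep=("gene", "CDS")):
    """Drop later records that repeat an already-seen span.

    Assumption: two records are duplicates when seqid, start, end, and
    strand all match. Types listed in always_keep are never dropped.
    Order matters: the first record seen for a span is always kept.
    """
    seen_spans = set()
    kept = []
    for line in gff_lines:
        # Pass comments, directives, and blank lines through unchanged.
        if not line.strip() or line.startswith("#"):
            kept.append(line)
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            # Leave malformed lines in place for a validator to report.
            kept.append(line)
            continue
        seqid, _source, ftype, start, end, _score, strand = cols[:7]
        span = (seqid, start, end, strand)
        if span in seen_spans and ftype not in always_keep:
            continue
        seen_spans.add(span)
        kept.append(line)
    return kept


# Hypothetical records, invented for illustration only.
example = [
    "##gff-version 3\n",
    "mt\tmitos\tgene\t1\t70\t.\t+\t.\tID=gene_trnF\n",
    "mt\tmitos\ttRNA\t1\t70\t.\t+\t.\tID=trnF;Parent=gene_trnF\n",
    "mt\tmitos\tgene\t100\t1050\t.\t+\t.\tID=gene_nad1\n",
    "mt\tmitos\tCDS\t100\t1050\t.\t+\t0\tID=cds_nad1;Parent=gene_nad1\n",
]
cleaned = drop_duplicate_spans(example)
kept_types = [l.split("\t")[2] for l in cleaned if not l.startswith("#")]
print(kept_types)
```

Dropping records this way can leave Parent attributes pointing at IDs that no longer exist, so the result still needs to be checked with a validator before conversion.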
I repeatedly checked my progress with the GFF validator website (http://genometools.org/cgi-bin/gff3validator.cgi) and kept editing and simplifying until my GFF passed the test with the green text 'Validation successful.'
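Between rounds of editing, a quick local check can catch the most common structural mistakes before going back to the validator. The sketch below is not a replacement for the GenomeTools validator and covers only a few rules I am sure of from the GFF3 format: the version directive on the first line, nine tab-separated columns, start not greater than end, a legal strand value, and every Parent pointing at an existing ID. The function name `check_gff3` and the example lines are hypothetical.

```python
def check_gff3(lines):
    """Return a list of human-readable problems found in GFF3 lines."""
    problems = []
    ids = set()
    parents = []  # (line number, parent ID) pairs, resolved at the end

    if not lines or not lines[0].startswith("##gff-version 3"):
        problems.append("line 1: missing '##gff-version 3' directive")

    for n, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            problems.append(
                f"line {n}: expected 9 tab-separated columns, found {len(cols)}"
            )
            continue
        start, end, strand = cols[3], cols[4], cols[6]
        if not (start.isdigit() and end.isdigit()):
            problems.append(f"line {n}: start and end must be integers")
        elif int(start) > int(end):
            problems.append(f"line {n}: start {start} is greater than end {end}")
        if strand not in {"+", "-", ".", "?"}:
            problems.append(f"line {n}: invalid strand '{strand}'")
        # Collect ID and Parent values from the attribute column.
        for field in cols[8].split(";"):
            key, _, value = field.partition("=")
            if key == "ID":
                ids.add(value)
            elif key == "Parent":
                parents.extend((n, p) for p in value.split(","))

    for n, parent in parents:
        if parent not in ids:
            problems.append(f"line {n}: Parent={parent} has no matching ID")
    return problems


# Hypothetical records: the CDS refers to a parent that was deleted.
broken = [
    "##gff-version 3\n",
    "mt\tmitos\tgene\t100\t1050\t.\t+\t.\tID=gene_nad1\n",
    "mt\tmitos\tCDS\t100\t1050\t.\t+\t0\tID=cds_nad1;Parent=mrna_nad1\n",
]
for problem in check_gff3(broken):
    print(problem)
```

A file that passes this check can still fail the full validator, which checks many more rules.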
At that point, I could run Kent's tools (UCSC) to get a GTF that I could import into PoPoolation.
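For anyone retracing this route, the Kent-tools conversion is normally two steps: GFF3 to genePred with gff3ToGenePred, then genePred to GTF with genePredToGtf in its "file" mode. The sketch below only builds and prints those two command lines; it may not match the exact invocation used in this thread, and the file names and helper name `kent_gff3_to_gtf_commands` are hypothetical.

```python
import shlex


def kent_gff3_to_gtf_commands(gff3_path, gtf_path, genepred_path="tmp.genePred"):
    """Build the two UCSC (Kent) commands for a GFF3 -> genePred -> GTF route.

    The returned lists can be passed to subprocess.run(cmd, check=True)
    once both tools are installed and on PATH.
    """
    return [
        # Step 1: GFF3 to genePred.
        ["gff3ToGenePred", gff3_path, genepred_path],
        # Step 2: genePred to GTF; "file" tells genePredToGtf to read a
        # genePred file instead of a database table.
        ["genePredToGtf", "file", genepred_path, gtf_path],
    ]


# Hypothetical file names, for illustration only.
commands = kent_gff3_to_gtf_commands("mito.gff3", "mito.gtf")
for cmd in commands:
    print(shlex.join(cmd))
```

Keeping the intermediate genePred file makes it easier to see which of the two steps rejects a given GFF3 edit.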
And then I needed to edit the GFF slightly, and now I am stuck again. I can get the genePred file and even a GTF file, but PoPoolation won't read it. I tried to make the changes in a way that would avoid introducing a problem, but failed.
Use the AGAT toolkit to do the conversion, as shown in this answer: "converting .gff file to .gtf".
When installing bioinformatics tools with conda, do not install them in the base environment. Create a separate environment for each tool; environments prevent this sort of conflict between incompatible program versions.
$ conda create -n genometools genometools
$ conda activate genometools
The first command creates an environment called genometools and installs genometools in it. The second command activates this environment for use.
This is probably the easiest solution because it will also install python-2.7+ for you.
Thank you, I got it installed that way.
Try gffread from conda.
Feel free to post the steps and commands you used as an answer; it may help people in a similar situation.