I am looking for the gff file for the chloroplast of Tetradesmus obliquus strain DOE0152z. Why is it so hard to find on NCBI? Can somebody point me in the right direction please?
The GFF3 file format represents the annotation of a genome, such as where the genes, transcripts, exons, etc. are located. For this particular organism, there are two genome assemblies available at NCBI: GCA_900108755.1 and GCA_002149895.1. The submitters of these assemblies did not provide any annotation, so there is no data from which to create a GFF3 file. RefSeq genome assemblies (with the GCF_ accession prefix) are always annotated, but unfortunately there is no RefSeq genome assembly for this organism. An external organization or lab may have annotated this genome, in which case you could download the GFF3 file from them, but I do not know of any.
The previous comments are right that an assembly has to meet certain criteria before complementary data such as a GFF file is available.
BUT
I think you should check the snpEff documentation (https://pcingola.github.io/SnpEff/se_introduction/), because the input files are not regular GFF files, but actually VCF and/or BED.
https://www.ncbi.nlm.nih.gov/nuccore/NEDT01000001.1
I don't see how I can get the gff file
GFF files may only be available if the genome was annotated by NCBI.
That said, this genome is for the organism you mention above but is likely not the same strain. Only GenBank-format annotation seems to be available for this genome. You could convert that to a GFF file.
This is the same strain. How can I convert it to a gff file?
Are there any tools for converting GenBank format to GFF3 format?
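For what it's worth, here is a rough sketch of how such a conversion could look in Python. It assumes Biopython is installed (it is imported only inside the conversion function), and the names `gff3_line`, `genbank_to_gff3` and the file names in the usage note are made up for illustration. It writes one line per feature using the overall start and end, so spliced features are not split into their parts, and it copies GenBank feature types as-is, which a strict GFF3 validator may not accept. Treat it as a starting point, not a finished converter.

```python
from urllib.parse import quote


def gff3_line(seqid, ftype, start, end, strand, qualifiers, source="GenBank"):
    """Format one GFF3 line.

    start is 0-based and end is exclusive (Biopython convention);
    GFF3 coordinates are 1-based and inclusive, hence start + 1.
    qualifiers maps each key to a list of values.
    """
    strand_char = {1: "+", -1: "-"}.get(strand, ".")
    # Column 9: key=value pairs; reserved characters (; = , % ...) are
    # percent-encoded, spaces are left as they are.
    attrs = ";".join(
        f"{key}={','.join(quote(str(v), safe=' ') for v in values)}"
        for key, values in qualifiers.items()
    )
    # Simplification: phase is derived from /codon_start for CDS features only.
    if ftype == "CDS":
        phase = str(int(qualifiers.get("codon_start", ["1"])[0]) - 1)
    else:
        phase = "."
    columns = [seqid, source, ftype, str(start + 1), str(end),
               ".", strand_char, phase, attrs or "."]
    return "\t".join(columns)


def genbank_to_gff3(genbank_path, gff3_path):
    """Write every feature of a GenBank file (except 'source') as a GFF3 line."""
    from Bio import SeqIO  # Biopython; assumed to be installed

    with open(gff3_path, "w") as out:
        out.write("##gff-version 3\n")
        for record in SeqIO.parse(genbank_path, "genbank"):
            for feature in record.features:
                loc = feature.location
                if feature.type == "source" or loc is None:
                    continue
                # Drop the long protein translation to keep lines readable.
                quals = {k: v for k, v in feature.qualifiers.items()
                         if k != "translation"}
                out.write(gff3_line(record.id, feature.type, int(loc.start),
                                    int(loc.end), loc.strand, quals) + "\n")
```

You would then call something like `genbank_to_gff3("chloroplast.gb", "chloroplast.gff3")` on the GenBank record you downloaded, and check a few lines of the output by eye against the original record before using it.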
If those are the genomes you are interested in, neither of them was annotated by the genome sequence submitter. In that situation, what do you expect to see in the GFF3 file? Unfortunately, there is no RefSeq genome assembly for this organism, so RefSeq did not annotate this genome.
I have some variants I want to annotate using snpEff in the chloroplast genome. I need a gff file.