What is wrong with NCBI's gff file?
6.3 years ago
rimgubaev ▴ 340

This is part of the GFF file for the Nicotiana tabacum (tobacco) genome assembly. I find it confusing that the gene, the mRNA, the first exon and the first CDS all start at the same position (2309). Any ideas?

NW_015787321.1  Gnomon  gene    2309    5631    .   +   .   ID=gene43;Dbxref=GeneID:107775605;Name=LOC107775605;gbkey=Gene;gene=LOC107775605;gene_biotype=protein_coding
NW_015787321.1  Gnomon  mRNA    2309    5631    .   +   .   ID=rna56;Parent=gene43;Dbxref=GeneID:107775605,Genbank:XM_016595671.1;Name=XM_016595671.1;gbkey=mRNA;gene=LOC107775605
NW_015787321.1  Gnomon  exon    2309    3923    .   +   .   ID=id448;Parent=rna56;Dbxref=GeneID:107775605,Genbank:XM_016595671.1;gbkey=mRNA;gene=LOC107775605
NW_015787321.1  Gnomon  exon    3967    5133    .   +   .   ID=id449;Parent=rna56;Dbxref=GeneID:107775605,Genbank:XM_016595671.1;gbkey=mRNA;gene=LOC107775605
NW_015787321.1  Gnomon  exon    5273    5631    .   +   .   ID=id450;Parent=rna56;Dbxref=GeneID:107775605,Genbank:XM_016595671.1;gbkey=mRNA;gene=LOC107775605
NW_015787321.1  Gnomon  CDS     2309    3923    .   +   0   ID=cds47;Parent=rna56;Dbxref=GeneID:107775605,Genbank:XP_016451157.1;Name=XP_016451157.1;gbkey=CDS;gene=LOC107775605
NW_015787321.1  Gnomon  CDS     3967    5133    .   +   2   ID=cds47;Parent=rna56;Dbxref=GeneID:107775605,Genbank:XP_016451157.1;Name=XP_016451157.1;gbkey=CDS;gene=LOC107775605
NW_015787321.1  Gnomon  CDS     5273    5631    .   +   2   ID=cds47;Parent=rna56;Dbxref=GeneID:107775605,Genbank:XP_016451157.1;Name=XP_016451157.1;gbkey=CDS;gene=LOC107775605

I'd recommend you add (to your post by editing it) the common name of the organism as well as why you think what you're seeing is weird.

6.3 years ago

It is normal for the gene, the mRNA and the first exon to start at the same coordinate. The first CDS starting at that coordinate as well suggests that the 5' UTR wasn't annotated for this gene. This is probably because the gene was computationally predicted rather than annotated with biological evidence (e.g. RNA-seq).
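If you want to check this across a whole annotation file, here is a minimal Python sketch that counts, for each transcript, how many exonic bases lie outside the CDS on either side. The function names `parse_attributes` and `utr_lengths` are made up for this illustration, and the `EXAMPLE` text is the record from the question with the attribute column trimmed to `ID` and `Parent`. It assumes one-based inclusive coordinates and splits on any whitespace so that it also works on text pasted from a forum; it ignores genes without a CDS and does not validate the file.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Turn 'ID=rna56;Parent=gene43' into {'ID': 'rna56', 'Parent': 'gene43'}."""
    pairs = (item.split("=", 1) for item in column9.strip().split(";") if "=" in item)
    return {key: value for key, value in pairs}


def utr_lengths(gff_text):
    """Return {transcript_id: (five_prime_utr_bp, three_prime_utr_bp)}.

    Lengths are exonic bases before/after the CDS, swapped for '-' strand.
    """
    exons = defaultdict(list)
    cds = defaultdict(list)
    strands = {}
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments/directives
        fields = line.split(None, 8)  # 9 GFF columns, any whitespace
        if len(fields) < 9 or fields[2] not in ("exon", "CDS"):
            continue
        start, end, strand = int(fields[3]), int(fields[4]), fields[6]
        # A feature may list several comma-separated parents.
        for parent in parse_attributes(fields[8]).get("Parent", "").split(","):
            if not parent:
                continue
            target = exons if fields[2] == "exon" else cds
            target[parent].append((start, end))
            strands[parent] = strand

    result = {}
    for transcript, cds_parts in cds.items():
        cds_start = min(s for s, _ in cds_parts)
        cds_end = max(e for _, e in cds_parts)
        # Exonic bases at lower coordinates than the CDS.
        left = sum(max(0, min(e, cds_start - 1) - s + 1) for s, e in exons[transcript])
        # Exonic bases at higher coordinates than the CDS.
        right = sum(max(0, e - max(s, cds_end + 1) + 1) for s, e in exons[transcript])
        result[transcript] = (right, left) if strands[transcript] == "-" else (left, right)
    return result


# Record from the question, attributes trimmed for brevity.
EXAMPLE = """\
NW_015787321.1  Gnomon  mRNA    2309    5631    .   +   .   ID=rna56;Parent=gene43
NW_015787321.1  Gnomon  exon    2309    3923    .   +   .   ID=id448;Parent=rna56
NW_015787321.1  Gnomon  exon    3967    5133    .   +   .   ID=id449;Parent=rna56
NW_015787321.1  Gnomon  exon    5273    5631    .   +   .   ID=id450;Parent=rna56
NW_015787321.1  Gnomon  CDS     2309    3923    .   +   0   ID=cds47;Parent=rna56
NW_015787321.1  Gnomon  CDS     3967    5133    .   +   2   ID=cds47;Parent=rna56
NW_015787321.1  Gnomon  CDS     5273    5631    .   +   2   ID=cds47;Parent=rna56
"""

print(utr_lengths(EXAMPLE))  # {'rna56': (0, 0)}
```

For this record both values are zero: the last CDS segment also ends at 5631, the same coordinate as the last exon, so no 3' UTR is annotated either.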


Now it's clear, thanks!
