What is the difference between the mRNA and transcript record types in the GFF3 (and GTF) format?
We have some gene models with mRNA records, some with transcript records, and some with both mRNA and transcript records for the same gene.
The transcript and mRNA record types seem to fulfill the same function: grouping the exons/CDS/UTRs that belong together in one transcript of a gene.
But why then is mRNA used sometimes, transcript at other times, and sometimes a combination of both?
Is there some additional meaning to using one of these two types rather than the other, or to using both together?
See also the record types in these two example GFF3 files.
GFF file 1
cut -f 3 gene_model1.gff3 | sort -u
CDS
exon
five_prime_UTR
gene
##gff-version 3
mRNA
three_prime_UTR
transcript
GFF file 2
cut -f 3 gene_model2.gff3 | sort -u
###
CDS
exon
five_prime_UTR
gene
##gff-version 3
mRNA
three_prime_UTR
tRNA
My guess is that an mRNA has a CDS, whereas a transcript doesn't have to have one. Some genes have both coding and non-coding transcripts, so some of those transcripts will be called mRNA and some transcript.
This indeed seems to be the case in the 'GFF file 2' gene model. I need to check whether this holds consistently across all the gene models we have from diverse sources (I'm afraid it doesn't).
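One way to run that check across files is to count, for each transcript-like record type, how many records are named as Parent by at least one CDS. The following is only a rough sketch: the function names are made up for illustration, it assumes GFF3-style attributes (ID=...;Parent=...) in column 9, and it would not work on GTF, which writes attributes differently.

```python
from collections import defaultdict


def parse_attributes(column9):
    """Turn a GFF3 attribute column ('ID=x;Parent=y') into a dict."""
    attrs = {}
    for field in column9.strip().split(";"):
        if "=" in field:
            key, value = field.split("=", 1)
            attrs[key.strip()] = value.strip()
    return attrs


def coding_status_by_type(lines, transcript_types=("mRNA", "transcript")):
    """Count, per record type, how many records have at least one CDS child."""
    transcripts = {}  # record ID -> record type (column 3)
    has_cds = set()   # IDs that some CDS record lists as its Parent
    for line in lines:
        # Skip blank lines, comments, '##' directives and '###' separators.
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9:
            continue  # not a regular 9-column feature line
        ftype, attrs = cols[2], parse_attributes(cols[8])
        if ftype in transcript_types and "ID" in attrs:
            transcripts[attrs["ID"]] = ftype
        elif ftype == "CDS":
            # A feature may list several parents, separated by commas.
            has_cds.update(attrs.get("Parent", "").split(","))

    counts = defaultdict(lambda: {"with_cds": 0, "without_cds": 0})
    for record_id, ftype in transcripts.items():
        key = "with_cds" if record_id in has_cds else "without_cds"
        counts[ftype][key] += 1
    return {ftype: dict(c) for ftype, c in counts.items()}
```

Passing an open file handle for gene_model1.gff3 to coding_status_by_type returns one small dict per record type. If the guess above holds for a file, mRNA should show without_cds equal to 0, and any non-coding transcripts should appear under transcript. Other transcript-like types (tRNA, for instance) can be added through the transcript_types argument.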
Do you have an example of a gene model that has both mRNA and transcript entries?