If one of a gene's isoforms (transcripts) is a lncRNA, can we claim that the gene itself is a lncRNA gene?
In the GENCODE lncRNA GTF file, every gene has a row (the first row for that gene) whose transcript_id is equal to the gene_id. What does this row represent?
I’d appreciate your advice.
Thank you very much in advance!
Indeed, many genes have non-coding transcripts. Whether these are biological noise or serve a function is an open question. The converse also holds: in rare cases, genes annotated as non-coding turn out to produce peptides. There are published papers on this.
Thank you so much for your advice!
So you mean that if a gene is annotated as a lncRNA gene, all of its isoforms (transcripts) have to be lncRNAs, right?
As for the GENCODE lncRNA GTF, the file description is: “Long non-coding RNA gene annotation: It contains the comprehensive gene annotation of lncRNA genes on the reference chromosomes.” So can I safely assume that every gene_id (or gene_name) that appears in this GTF file belongs to a lncRNA gene? In other words, it is not possible for a protein-coding gene to appear in this GTF file just because it has lncRNA isoforms (transcripts), right?
Thank you very much!
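Both questions can also be checked directly against the file. Below is a minimal sketch, not a definitive tool: it assumes the attribute column uses the `gene_type` key as GENCODE GTFs normally do (other sources may use `gene_biotype` instead), and the names `parse_attributes` and `summarize_gtf` are made up for this example. It counts how many distinct genes carry each `gene_type`, and records the feature type (column 3) of every row whose transcript_id equals its gene_id.

```python
import re
from collections import Counter

# Matches key "value" pairs in the GTF attribute column (column 9).
# Unquoted attributes (e.g. numeric ones) are ignored by this pattern.
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def parse_attributes(field):
    """Return the quoted attributes as a dict (last value wins for repeated keys)."""
    return dict(ATTR_RE.findall(field))


def summarize_gtf(lines):
    """Summarize gene types and the rows where transcript_id == gene_id.

    Returns two Counters:
      1. gene_type -> number of distinct gene_ids with that type
      2. feature type (column 3) -> number of rows whose transcript_id
         is identical to the gene_id
    """
    gene_types = {}               # gene_id -> gene_type (first one seen)
    self_id_features = Counter()  # feature types of the "self-named" rows
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header comments and blank lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9:
            continue  # not a well-formed GTF record
        attrs = parse_attributes(cols[8])
        gene_id = attrs.get("gene_id")
        if gene_id is None:
            continue
        # Assumption: the biotype is stored under "gene_type".
        gene_types.setdefault(gene_id, attrs.get("gene_type", "unknown"))
        if attrs.get("transcript_id") == gene_id:
            self_id_features[cols[2]] += 1
    return Counter(gene_types.values()), self_id_features
```

To run it on the real file, pass an open text handle, for example `gzip.open(path, "rt")` for a gzipped GTF, where `path` is wherever you saved the download. If a protein-coding gene were included in the file, it would show up as a `protein_coding` entry in the first counter, and the second counter shows which feature type the rows with transcript_id equal to gene_id have.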