While annotating small RNA data using a GTF file, we came across a few entries that have a miRNA gene name but a gene type of lincRNA. Can anyone clarify this, please?
Also, I extracted the region in question from the GTF file. I have pasted it below.
chr5 HAVANA gene 148809849 148812397 . + . gene_id "ENSG00000269936.2"; transcript_id "ENSG00000269936.2"; gene_type "lincRNA"; gene_status "NOVEL"; gene_name "MIR145"; transcript_type "lincRNA"; transcript_status "NOVEL"; transcript_name "MIR145"; level 2; havana_gene "OTTHUMG00000184101.1";
chr5 HAVANA transcript 148809849 148812397 . + . gene_id "ENSG00000269936.2"; transcript_id "ENST00000602315.1"; gene_type "lincRNA"; gene_status "NOVEL"; gene_name "MIR145"; transcript_type "lincRNA"; transcript_status "KNOWN"; transcript_name "MIR145-001"; level 2; havana_gene "OTTHUMG00000184101.1"; havana_transcript "OTTHUMT00000468029.1";
chr5 HAVANA exon 148809849 148812397 . + . gene_id "ENSG00000269936.2"; transcript_id "ENST00000602315.1"; gene_type "lincRNA"; gene_status "NOVEL"; gene_name "MIR145"; transcript_type "lincRNA"; transcript_status "KNOWN"; transcript_name "MIR145-001"; exon_number 1; exon_id "ENSE00003379897.1"; level 2; havana_gene "OTTHUMG00000184101.1"; havana_transcript "OTTHUMT00000468029.1";
chr5 ENSEMBL transcript 148810209 148810296 . + . gene_id "ENSG00000269936.2"; transcript_id "ENST00000384967.1"; gene_type "lincRNA"; gene_status "NOVEL"; gene_name "MIR145"; transcript_type "miRNA"; transcript_status "KNOWN"; transcript_name "MIR145-201"; level 3; tag "basic"; havana_gene "OTTHUMG00000184101.1";
chr5 ENSEMBL exon 148810209 148810296 . + . gene_id "ENSG00000269936.2"; transcript_id "ENST00000384967.1"; gene_type "lincRNA"; gene_status "NOVEL"; gene_name "MIR145"; transcript_type "miRNA"; transcript_status "KNOWN"; transcript_name "MIR145-201"; exon_number 1; exon_id "ENSE00001499974.1"; level 3; tag "basic"; havana_gene "OTTHUMG00000184101.1";
As we can see, all of these GTF lines have gene_type "lincRNA" and gene_name "MIR145".
I also checked by entering the coordinates in UCSC: nothing is visible when the lincRNA track is enabled, but we can see mir145 when the sno/miRNA track is enabled.
Basically, when we count gene features using HTSeq, MIR145 is reported as lincRNA, so its reads are counted as lincRNA instead of miRNA.
Should we consider it a lincRNA or a miRNA?
That's precisely the reason I posted this question. When we count using HTSeq and the GENCODE v19 GTF file, we get mir145 as lincRNA, because the gene type in the GTF file is lincRNA. So I am wondering whether it is in fact a lincRNA or a miRNA.
From what I understand, the gene is classified as a lincRNA because it has a long primary transcript, but its transcripts include both the miRNA (transcript_type "miRNA") and the long precursor (transcript_type "lincRNA").
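If it helps, here is a minimal Python sketch for listing gene_type next to transcript_type for each transcript record, so the mismatch is easy to spot. It assumes the file is tab-separated as in a standard GTF. The two sample records are abbreviated from the excerpt in the question, and the function names are just illustrative.

```python
import re

# Two transcript records abbreviated from the excerpt in the question.
# Assumption: columns are tab-separated, as in a standard GTF file.
GTF_LINES = [
    "\t".join([
        "chr5", "HAVANA", "transcript", "148809849", "148812397", ".", "+", ".",
        'gene_id "ENSG00000269936.2"; transcript_id "ENST00000602315.1"; '
        'gene_type "lincRNA"; gene_name "MIR145"; transcript_type "lincRNA"; '
        'transcript_name "MIR145-001"; level 2;',
    ]),
    "\t".join([
        "chr5", "ENSEMBL", "transcript", "148810209", "148810296", ".", "+", ".",
        'gene_id "ENSG00000269936.2"; transcript_id "ENST00000384967.1"; '
        'gene_type "lincRNA"; gene_name "MIR145"; transcript_type "miRNA"; '
        'transcript_name "MIR145-201"; level 3;',
    ]),
]

# Matches quoted attributes such as: gene_type "lincRNA"
# Unquoted attributes (e.g. level 2) are deliberately ignored here.
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def parse_attributes(field):
    """Return the quoted key/value pairs of a GTF attribute column as a dict."""
    return dict(ATTR_RE.findall(field))


def transcript_biotypes(lines):
    """Collect (gene_name, transcript_id, gene_type, transcript_type) per transcript."""
    rows = []
    for line in lines:
        if line.startswith("#"):
            continue  # skip header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != "transcript":
            continue  # only transcript-level records carry the comparison we want
        attrs = parse_attributes(cols[8])
        rows.append((
            attrs.get("gene_name"),
            attrs.get("transcript_id"),
            attrs.get("gene_type"),
            attrs.get("transcript_type"),
        ))
    return rows


if __name__ == "__main__":
    for name, tid, gene_type, tx_type in transcript_biotypes(GTF_LINES):
        flag = "" if gene_type == tx_type else "  <-- differs from gene_type"
        print(f"{name}\t{tid}\tgene={gene_type}\ttranscript={tx_type}{flag}")
```

On the sample records this flags ENST00000384967.1, whose transcript_type is miRNA while the gene_type is lincRNA. To run it on a real annotation you would pass the lines of the opened GTF file to `transcript_biotypes` instead of `GTF_LINES`.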