Do you know what criteria Ensembl uses to classify a gene as misc_RNA?
I couldn't find an answer on the Ensembl page describing non-coding RNA.
A few examples of such genes, retrieved from BioMart:
- http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000207157;r=13:23726725-23726825;t=ENST00000384428
- http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000242037;r=13:95351479-95351756;t=ENST00000470538
- http://www.ensembl.org/Homo_sapiens/Gene/Summary?g=ENSG00000223298;r=13:95963084-95963209;t=ENST00000411366
These examples appear to be pseudogenes. Why aren't they assigned the "pseudogene" gene type? What makes them misc_RNA?
Thank you!
There is some cryptic circularity going on here.
For example, I can't trace the provenance of RNY3P4 to anything. It appears to be a RefSeq prediction of something (http://www.ncbi.nlm.nih.gov/gene?cmd=Retrieve&dopt=full_report&list_uids=100873808) and points back to HGNC, but I wasn't aware they did predictions of pseudogenes (they might annotate them), so where did this come from?
But ENSG00000207157.1 says "No overlapping RefSeq" and appears to clip the RefSeq down from 301 to a 101 exon. Is that on the basis of an Rfam model?
But nothing from Havana/Vega in this location?
We got it from an Rfam record. And no, there is no manual annotation on these guys.
We're into serious "what is a gene" territory here... I might pose it as a general question. It's getting crucial as more equivocal automated and manual annotations keep stacking up.
Thanks for your reply! What is the difference between "type" and "locus_tag"? I am looking for all rRNA sequences in a GenBank file. I am a little confused about why many 5S_rRNA features are labeled as "misc_RNA". Such as:
Thanks in advance.
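On "type" versus "locus_tag": in a GenBank feature table the type is the feature key (rRNA, misc_RNA, CDS, ...), while locus_tag is a qualifier on that feature that carries an identifier for the locus. One way to collect rRNAs, including those filed under misc_RNA, is to check the feature key first and then fall back on the /product and /note text. Below is a rough sketch of that idea using only the Python standard library. The sample record, the locus tags, and the helper names parse_features and looks_like_rrna are all made up for illustration, and the parser assumes the standard column layout with single-line qualifiers, so treat it as a starting point rather than a complete GenBank reader.

```python
import re

# Hypothetical feature-table excerpt, invented for illustration only.
SAMPLE = """\
FEATURES             Location/Qualifiers
     rRNA            100..1640
                     /locus_tag="ABC_r01"
                     /product="16S ribosomal RNA"
     misc_RNA        2000..2115
                     /locus_tag="ABC_r02"
                     /note="5S_rRNA"
     CDS             3000..3500
                     /locus_tag="ABC_0001"
                     /product="hypothetical protein"
ORIGIN
"""

# Feature keys start in column 6; qualifiers start in column 22.
FEATURE_LINE = re.compile(r"^ {5}(\S+)\s+(\S+)$")
QUALIFIER_LINE = re.compile(r'^ {21}/(\w+)="?(.*?)"?$')


def parse_features(text):
    """Return a list of dicts with 'type', 'location' and 'qualifiers'."""
    features = []
    in_table = False
    for line in text.splitlines():
        if line.startswith("FEATURES"):
            in_table = True
            continue
        if not in_table:
            continue
        if line and not line.startswith(" "):
            break  # reached the next top-level section, e.g. ORIGIN
        feature = FEATURE_LINE.match(line)
        if feature:
            features.append(
                {"type": feature.group(1),
                 "location": feature.group(2),
                 "qualifiers": {}}
            )
            continue
        qualifier = QUALIFIER_LINE.match(line)
        if qualifier and features:
            features[-1]["qualifiers"][qualifier.group(1)] = qualifier.group(2)
    return features


def looks_like_rrna(feature):
    """True for rRNA features, or misc_RNA whose product/note mentions rRNA."""
    if feature["type"] == "rRNA":
        return True
    if feature["type"] == "misc_RNA":
        quals = feature["qualifiers"]
        text = " ".join(quals.get(key, "") for key in ("product", "note", "gene"))
        return re.search(r"rRNA|ribosomal RNA", text, re.IGNORECASE) is not None
    return False


hits = [f for f in parse_features(SAMPLE) if looks_like_rrna(f)]
for f in hits:
    print(f["type"], f["qualifiers"].get("locus_tag"), f["location"])
```

On a real file you would read the record text from disk instead of SAMPLE, and a dedicated parser is the safer choice once qualifiers wrap over several lines.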