7.4 years ago
noeD
Hello!
I need a human GFF3 file with annotations for small non-coding RNAs (e.g. miRNA, piRNA, lncRNA, siRNA...).
I have found this GTF file: ftp://ftp.ensembl.org/pub/current_gtf/homo_sapiens/ (Homo_sapiens.GRCh38.89.chr.gtf), which I am sure contains both miRNA and lncRNA. I looked through the gene_name values but didn't find piRNA, siRNA...
Are these small RNAs in that GTF file? Where can I find the list of gene_name values for the small RNAs reported in the GTF file?
Moreover, how can I extract only the small RNAs from that file so that I can use them to annotate my alignment?
Thank you in advance
Best
Have you looked at these threads: "Finding gff3 or gtf files for small non-coding RNA" and "GTF/GFF for non-coding RNA"?
Thank you for your help. I have seen that link, but I didn't find out how to extract only the small RNAs from that list...
That file has miRNA and lncRNA but does not have piRNA or siRNA. You can get the piRNA from RNAcentral. I don't see any coordinates for piRNA or any siRNA at RNAcentral for human.
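To see which RNA classes the GTF actually contains and to pull out only the small RNA records, something along these lines may help. It is a minimal sketch, assuming an Ensembl-style GTF in which each record carries a `gene_biotype "..."` attribute in the ninth column. The function names and the biotype names in `SMALL_RNA_BIOTYPES` are illustrative assumptions, not something taken from the file; run the counting step first and adjust the set to the labels your release really uses.

```python
import re
from collections import Counter

# Ensembl-style GTF attribute, e.g.: gene_biotype "miRNA";
BIOTYPE_RE = re.compile(r'gene_biotype "([^"]+)"')

# Assumed biotype labels; check them against count_gene_biotypes() output.
SMALL_RNA_BIOTYPES = {"miRNA", "snRNA", "snoRNA"}


def gene_biotype(line):
    """Return the gene_biotype of a GTF record, or None for headers/malformed lines."""
    if line.startswith("#"):
        return None
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return None
    match = BIOTYPE_RE.search(fields[8])
    return match.group(1) if match else None


def count_gene_biotypes(lines):
    """Count how many 'gene' records carry each gene_biotype."""
    counts = Counter()
    for line in lines:
        fields = line.split("\t")
        if len(fields) == 9 and fields[2] == "gene":
            biotype = gene_biotype(line)
            if biotype is not None:
                counts[biotype] += 1
    return counts


def filter_by_biotype(lines, wanted):
    """Yield unchanged every record (gene, transcript, exon) whose gene_biotype is in `wanted`."""
    for line in lines:
        if gene_biotype(line) in wanted:
            yield line
```

For example, opening the decompressed file with `open("Homo_sapiens.GRCh38.89.chr.gtf")`, passing the handle to `count_gene_biotypes` lists the classes present, and writing the lines yielded by `filter_by_biotype(handle, SMALL_RNA_BIOTYPES)` to a new file gives a small-RNA-only GTF for annotation.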
Thank you. I have another question: are the miRNAs reported in the GFF3 file predicted or validated?
This page describes the annotation procedure for non-coding RNAs at Ensembl. snRNAs are included in that GTF file.
Thank you. I am wondering whether it would be better to align against a miRNA reference (from miRBase) first, and then take the unaligned reads, align them against the genome, and annotate them with that GFF file. If I instead align against the genome and then annotate the aligned reads with that GFF3, I would be relying on predicted miRNAs. Do you have any suggestions? Thank you in advance!