I would like to detect all possible types of non-coding RNA (microRNA, snoRNA, tRNA, etc.) in the sequence of my eukaryotic organism.
Are there any annotation platforms or tools that will help me with this task?
Have you aligned your reads to the reference genome? With a SAM or BAM file you will have genomic coordinates for each sequencing read. Then you can compare these coordinates with those found in a GTF file for your organism. You can filter the GTF file for only ncRNA features:
awk '$3=="ncRNA"' organism.gtf > ncRNA.gtf
You can then compare the filtered GTF file with your BAM file using a tool like bedtools intersect.
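A rough sketch of the filter-then-intersect steps described above. The file names (toy_organism.gtf, reads.bam) and the toy records are placeholders made up for illustration, not anything from the original post. The `-c` option of bedtools intersect reports, for each feature in `-a`, how many entries in `-b` overlap it; the intersect step is skipped if bedtools or the BAM file is missing.

```shell
# Hypothetical file names; point these at your own data.
GTF=toy_organism.gtf
BAM=reads.bam

# Toy annotation (made-up records) so the filter step runs as-is.
printf 'chr1\tsrc\tncRNA\t100\t200\t.\t+\t.\tgene_id "g1";\n'  > "$GTF"
printf 'chr1\tsrc\tCDS\t300\t400\t.\t-\t.\tgene_id "g2";\n'   >> "$GTF"

# Keep only records whose feature type (column 3) is exactly "ncRNA".
awk -F'\t' '$3=="ncRNA"' "$GTF" > ncRNA.gtf

# Count alignments overlapping each ncRNA feature; the count is
# appended as an extra last column. Skipped when bedtools or the
# BAM file is not available.
if command -v bedtools >/dev/null 2>&1 && [ -f "$BAM" ]; then
    bedtools intersect -a ncRNA.gtf -b "$BAM" -c > ncRNA_read_counts.txt
fi
```

Whether counts per feature are what you want depends on the question; `-u` or `-wa -wb` would instead report the overlapping records themselves.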
First, Infernal could be a starting point. Second, do you have an annotation of protein-coding genes for your organism? That together with some RNAseq data can also give you some hints about potential ncRNAs.
Are you sure that all kinds of non-coding RNA will be caught by the library prep technique used?
It is unclear which data you have, please elaborate.
I have the genome sequence.
goodez, your method sounds rather simple. I have been asked to identify all ncRNA as well, but my output GTF is empty after running
awk '$3=="ncRNA"' organism.gtf > ncRNA.gtf
There don't seem to be any occurrences of ncRNA in my original GTF. Any suggestions?
What does your GTF look like? Do you see anything if you run
grep "ncRNA" organism.gtf
Hi genomax, thanks for replying. Yes, I get hits for grep "ncRNA"; it looks like I need to grep for lincRNA instead. Thanks for the help!
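A likely reason the awk filter comes back empty while grep finds hits: `$3=="ncRNA"` requires an exact match in column 3, whereas grep matches the substring anywhere on the line (including inside "lincRNA"). In many GTF files the RNA class is stored in the attribute column (column 9) rather than in column 3. A minimal sketch for checking what a file actually contains, assuming an Ensembl-style `gene_biotype` attribute (GENCODE files use `gene_type` instead); the toy file and its records are made up for illustration:

```shell
# Toy GTF (hypothetical records) so the commands run as-is;
# replace toy.gtf with your own annotation file.
printf 'chr1\tsrc\tgene\t100\t200\t.\t+\t.\tgene_id "g1"; gene_biotype "lincRNA";\n'         > toy.gtf
printf 'chr1\tsrc\tgene\t300\t400\t.\t-\t.\tgene_id "g2"; gene_biotype "protein_coding";\n' >> toy.gtf
printf 'chr1\tsrc\texon\t100\t200\t.\t+\t.\tgene_id "g1"; gene_biotype "lincRNA";\n'        >> toy.gtf

# 1) Which feature types occur in column 3, and how often?
grep -v '^#' toy.gtf | cut -f3 | sort | uniq -c

# 2) Which biotypes occur in the attribute column?
grep -v '^#' toy.gtf | grep -o 'gene_biotype "[^"]*"' | sort | uniq -c

# 3) Keep only the records carrying the biotype of interest.
grep 'gene_biotype "lincRNA"' toy.gtf > lincRNA.gtf
```

Steps 1 and 2 show which labels are available to filter on before you commit to a pattern, and step 3 can be repeated for other classes (for example miRNA or snoRNA) if your annotation uses those labels.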