6.5 years ago
Sam
Dear All
Could you suggest an appropriate pipeline to predict and identify lncRNAs from RNA-seq data in plants?
Thanks
Just some ideas:
If your data is not ribo-depleted, first filter the reads against an rRNA database, e.g. with SortMeRNA.
If you have a reference genome and annotation: map the reads against the genome, remove all reads that hit known protein-coding genes, and assemble the remaining reads.
If you don't have a reference genome/annotation: assemble the reads de novo, then remove all transcripts with ORFs that yield good matches to known proteins, e.g. by blastx against nr.
For identification, you could run Infernal (e.g. cmscan with the Rfam covariance models) on the set of remaining transcripts.
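As a rough illustration of the blastx filtering step in the no-reference case, here is a minimal Python sketch. It assumes the blastx results were written in BLAST tabular format (`-outfmt 6`, with the e-value in the 11th column). The function name `transcripts_without_protein_hits`, the `1e-5` e-value cutoff and the toy transcript IDs are made up for the example and are not part of any tool; pick a cutoff that suits your data.

```python
import csv

def transcripts_without_protein_hits(transcript_ids, blast_tabular, max_evalue=1e-5):
    """Return the transcript IDs that have no blastx hit at or below max_evalue.

    blast_tabular is an iterable of lines in BLAST tabular format (-outfmt 6):
    qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    The function name and the default cutoff are assumptions for this sketch.
    """
    with_hits = set()
    for row in csv.reader(blast_tabular, delimiter="\t"):
        # Skip blank lines and comment lines (e.g. from -outfmt 7).
        if not row or row[0].startswith("#"):
            continue
        qseqid, evalue = row[0], float(row[10])
        if evalue <= max_evalue:
            with_hits.add(qseqid)
    # Keep the input order; transcripts with a good protein match are dropped.
    return [t for t in transcript_ids if t not in with_hits]

# Toy data: tx1 has a strong protein hit, tx3 only a weak one, tx2 none.
example_hits = [
    "tx1\tprotA\t85.0\t120\t18\t0\t1\t360\t1\t120\t1e-50\t200",
    "tx3\tprotB\t30.0\t40\t28\t0\t10\t129\t5\t44\t2.5\t25.0",
]
candidates = transcripts_without_protein_hits(["tx1", "tx2", "tx3"], example_hits)
print(candidates)  # ['tx2', 'tx3']
```

The transcripts that survive this filter are only candidates; they would still go through the identification step mentioned above.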