Hi, I need to label some small RNA sequences that I know are rRNA fragments.
I know that in mRNA pipelines these reads are discarded by aligning to the human genome and filtering out multimapped reads, but I need to try to pin down their origin (which subunit, and maybe a label if it's an interesting segment).
I tried looking around for papers/pipelines for rRNA to see how they handle the labeling, but most mRNA-focused ones just filter these reads out, and I'm finding it quite challenging to get into the topic. I know I'll have to build my own thing anyway due to how our database works, so I'd like to understand the pipeline rather than just using someone else's.
I have my sequences aligned to GRCh38, so I can map them if I get an rRNA GTF/BED file, but I'm not sure where to get one. I can also get a FASTA file from NCBI and align to it directly, but then I don't get annotations.
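To make the first option concrete, this is roughly the lookup I have in mind once I have an annotation file. It is only a minimal sketch in plain Python: the function names `parse_bed` and `label_read` are my own, it assumes a 4-column BED (chrom, start, end, name), and the coordinates and feature names in the example are made up, not real rRNA loci.

```python
from collections import defaultdict


def parse_bed(lines):
    """Parse BED lines (chrom, start, end, name) into per-chromosome feature lists."""
    features = defaultdict(list)
    for line in lines:
        # Skip blank lines and the optional BED header lines.
        if not line.strip() or line.startswith(("#", "track", "browser")):
            continue
        chrom, start, end, name = line.rstrip("\n").split("\t")[:4]
        features[chrom].append((int(start), int(end), name))
    for chrom in features:
        features[chrom].sort()
    return features


def label_read(features, chrom, start, end):
    """Return the names of all features overlapping the half-open interval [start, end)."""
    return [
        name
        for f_start, f_end, name in features.get(chrom, [])
        if f_start < end and start < f_end
    ]


# Placeholder annotation: invented coordinates and names, for illustration only.
example_bed = [
    "chrX\t100\t2000\trRNA_18S_example\n",
    "chrX\t3000\t3160\trRNA_5.8S_example\n",
]

features = parse_bed(example_bed)
print(label_read(features, "chrX", 150, 180))
```

A linear scan per read is fine for a small rRNA annotation; for a full-genome annotation I assume I would want an interval tree or an existing overlap tool instead.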
Any advice on how to tackle this? Specific resources and tools are welcome.
Thanks!