Hi, I have small RNA-seq results for a human sample. I mostly care about seeing what the top sequences are and trying to classify them (tRNA, rRNA, mRNA, lncRNA, etc.).
I'm running into an issue when mapping straight to the human genome (HISAT2): most reads are multi-mapped to highly conserved regions, so the gene/region counts (htseq-count) are wildly off compared with the top 100 "overrepresented sequences" I get from the FastQC report.
I need to keep the original sequence information at hand, so after mapping to a genome or RNA database I need to look up what each sequence in my "top 100" matched to. Is there a way for me to run these queries against my BAM file? Or maybe a better way of doing this?
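To make the goal concrete, here is a rough sketch of the kind of lookup I have in mind, not something I'm sure is the right approach. It assumes the alignments are available as SAM text (for example, the output of `samtools view` on the BAM), that a read matches a "top 100" entry only if the full sequences are identical, and that reads aligned to the reverse strand are stored reverse-complemented, so they are flipped back before comparing. The function names are made up for illustration.

```python
from collections import defaultdict

# Translation table used to complement bases (N stays N).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def lookup_sequences(sam_lines, queries):
    """Map each query sequence to the places where identical reads aligned.

    sam_lines: iterable of SAM-format text lines (header lines are skipped).
    queries:   iterable of read sequences, e.g. the FastQC "top 100".
    Returns a dict: sequence -> set of (reference, 1-based position, strand).
    """
    wanted = set(queries)
    hits = defaultdict(set)
    for line in sam_lines:
        if line.startswith("@"):
            continue  # header line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        flag = int(fields[1])
        rname, pos, seq = fields[2], int(fields[3]), fields[9]
        if flag & 0x4 or seq == "*":
            continue  # unmapped read, or sequence not stored in this record
        reverse = bool(flag & 0x10)
        # Recover the read as it appeared in the FASTQ file.
        original = reverse_complement(seq) if reverse else seq
        if original in wanted:
            hits[original].add((rname, pos, "-" if reverse else "+"))
    return hits
```

Each multi-mapped sequence would then come back with every location it was reported at, which I could intersect with an annotation to see whether it looks like tRNA, rRNA, and so on. This only sees the alignments the mapper actually wrote out, so it would miss locations beyond whatever limit on reported alignments was in effect.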
Thanks!