Small RNA is a large world :) so it would be nice to know a little more about the goals of the project.
Regardless, in the past I tested piPipes, which is focused on piRNAs but will also give you results for other classes of small RNAs. It takes some time to set up for a species that is not included in their bundle, but it might be worth it, especially if you are starting this analysis with little experience. It will output a ton of results, but you can explore those at your leisure. The caveat is that you need to clean up the reads before using piPipes, by which I mean removing UMIs (if you used them) and trimming adapters. For the trimming I use cutadapt, but other tools are available. I am not using piPipes myself, just because my project required something a little more customized.
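To make "clean up the reads" concrete, here is a toy Python sketch of the two operations on a single read. It is not a replacement for cutadapt or a UMI tool: it only does exact matching, and the function names, the 4 nt UMI at the 5' end, and the adapter sequence are all assumptions for illustration, so substitute whatever your library prep actually uses.

```python
def extract_umi(seq, qual, umi_len):
    """Split a fixed-length UMI off the 5' end; returns (umi, seq, qual)."""
    return seq[:umi_len], seq[umi_len:], qual[umi_len:]


def trim_3prime_adapter(seq, qual, adapter, min_overlap=3):
    """Remove a 3' adapter by exact matching.

    Handles a full adapter anywhere in the read, or a partial adapter
    prefix (at least min_overlap bases) hanging off the end of the read.
    """
    pos = seq.find(adapter)
    if pos == -1:
        # Try the longest adapter prefix first, down to min_overlap bases.
        for k in range(min(len(adapter), len(seq)) - 1, min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                pos = len(seq) - k
                break
    if pos == -1:
        return seq, qual  # no adapter found, read is returned unchanged
    return seq[:pos], qual[:pos]


# Hypothetical read layout: UMI + small RNA insert + 3' adapter.
umi_len = 4                          # assumed UMI length
adapter = "TGGAATTCTCGGGTGCCAAGG"    # example adapter, use your kit's sequence
insert = "TGAGGTAGTAGGTTGTATAGTT"
read = "ACGT" + insert + adapter
qual = "I" * len(read)

umi, seq, q = extract_umi(read, qual, umi_len)
seq, q = trim_3prime_adapter(seq, q, adapter)
print(umi, seq, len(seq) == len(q))
```

Real trimmers also tolerate mismatches and filter by length and quality, which is why I would still use a dedicated tool on actual data.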
Now straight to your questions.
My question is I should keep the mapped reads and excluding unmapped reads or I should keep unmapped reads and excluding mapped reads for next step analysis.
It really depends on how you are assigning reads to features. If you are mapping directly to the genome and then intersecting with a list of features, discard the unmapped reads and keep only the mapped ones (this also reduces the size of the BAM file). If, on the other hand, you are going with a stratified approach, first mapping to, say, rRNA/tRNA, then to miRNA, (..), and only then to the genome, both mapped and unmapped reads need to be kept at each step. For instance, in the first step you keep the reads mapped to rRNA and use the unmapped ones as input for the alignment to miRNA.
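The stratified logic can be sketched in a few lines of Python. This is only a toy under loud assumptions: `stratified_assign` and `exact_substring` are names I made up, the sequences are invented, and exact substring matching stands in for a real aligner. The point is just the bookkeeping: the unmapped reads of one tier become the input of the next.

```python
def stratified_assign(reads, tiers, mapper):
    """Assign each read to the first tier it maps to.

    reads  : dict of read_id -> sequence
    tiers  : ordered list of (tier_name, list_of_reference_sequences)
    mapper : callable(sequence, reference) -> bool, stand-in for an aligner
    """
    assigned = {}
    remaining = dict(reads)
    for name, reference in tiers:
        # Reads mapping to this tier are kept here ...
        mapped = {rid for rid, seq in remaining.items() if mapper(seq, reference)}
        assigned[name] = mapped
        # ... and only the unmapped ones move on to the next tier.
        remaining = {rid: seq for rid, seq in remaining.items() if rid not in mapped}
    assigned["unmapped"] = set(remaining)
    return assigned


def exact_substring(seq, reference):
    """Toy 'aligner': a read maps if it is an exact substring of any reference."""
    return any(seq in ref_seq for ref_seq in reference)


# Invented toy references, ordered from most to least specific.
tiers = [
    ("rRNA", ["GGGTTTAAACCCGGGTTTAAACCC"]),
    ("miRNA", ["TGAGGTAGTAGGTTGTATAGTT"]),
    ("genome", ["AAAATGAGGTAGTAGGTTGTATAGTTCCCCGATTACAGATTACAGG"]),
]
reads = {
    "r1": "TTTAAACCCGGG",
    "r2": "TGAGGTAGTAGGTTGTATAGTT",  # also present in the genome
    "r3": "GATTACAGATTACA",
    "r4": "NNNNNNNN",
}

result = stratified_assign(reads, tiers, exact_substring)
for tier, ids in result.items():
    print(tier, sorted(ids))
```

Note that r2 would also map to the genome, but it never gets there because the miRNA tier claims it first. That is the whole reason for ordering the tiers.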
Other question is about MiRNA, rRNA, and piRNA database. I donot know I should use which database is correct for sorghum.
I have never worked with plants, so I am of little help here. However, miRBase seems like a good place to start for miRNAs. No idea about piRNAs.
Bonus answer
What you should also be thinking about, especially when working with piRNAs, is whether reads map to unique or multiple locations in the genome. There are no right answers for this, but there are some wrong ones. Avoid keeping all of the multiple locations for a single read, and if you do keep them, please proceed with care in the downstream analysis, since the results might be overestimated.
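To show why keeping every location inflates the numbers, here is a small Python sketch comparing three counting modes on invented data. The function name, the mode names, and the cluster labels are my own for illustration. Fractional counting (each alignment weighted by 1/n) is just one common compromise, not the right answer.

```python
from collections import defaultdict


def count_per_feature(alignments, mode="unique"):
    """Count reads per feature from (read_id, feature) pairs, one per alignment.

    mode 'unique'     : only reads with a single alignment are counted
    mode 'fractional' : each of a read's n alignments contributes 1/n
    mode 'all'        : every alignment contributes 1 (inflates totals)
    """
    if mode not in ("unique", "fractional", "all"):
        raise ValueError(f"unknown mode: {mode}")
    hits = defaultdict(list)
    for read_id, feature in alignments:
        hits[read_id].append(feature)

    counts = defaultdict(float)
    for features in hits.values():
        n = len(features)
        if mode == "unique":
            if n == 1:
                counts[features[0]] += 1.0
        else:
            weight = 1.0 / n if mode == "fractional" else 1.0
            for feature in features:
                counts[feature] += weight
    return dict(counts)


# Three reads: r1 maps once, r2 to two loci, r3 to four loci.
alignments = [
    ("r1", "clusterA"),
    ("r2", "clusterA"), ("r2", "clusterB"),
    ("r3", "clusterB"), ("r3", "clusterB"), ("r3", "clusterC"), ("r3", "clusterC"),
]

for mode in ("unique", "fractional", "all"):
    counts = count_per_feature(alignments, mode)
    print(mode, counts, "total =", sum(counts.values()))
```

With three reads in the input, the "all" mode reports a total of seven, while the fractional total stays at three. That inflation is what I mean by overestimated results.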