My goal is to a) find novel and known miRNAs from the data and b) perform DGE analysis.
I therefore aligned my miRNA reads (after adapter trimming and removing reads shorter than 18 bp) to the genome of interest, but the mapping percentage is extremely low, around 2%. I used miRDeep2's mapper.pl to align, which in turn uses bowtie. I checked whether any of its parameters could be relaxed, but they seem reasonable. I proceeded with the next step, miRDeep2.pl, but the number of known miRNAs reported is either 1 or 0, depending on the sample.
This happened for all samples in the same run, as well as for all samples from another run from a different species.
I would like to know the possible reasons for the low mapping percentage, and whether I can still use these reads to find known/novel miRNAs with miRDeep2. The individual base qualities are very good, so I don't think mismatches are the reason.
UPDATE: I went ahead anyway and used miRDeep2 to look for known/novel miRNAs. The result.html file does not report any known miRNA as present in the data, but read counts are generated under the names of the known mature miRNAs from miRBase that I supplied. Now I'm thoroughly confused. Is it advisable to continue with these read counts for DGE?
Thanks.
What aligner are you using? Is it doing ungapped alignments as would be needed for reads ~18 bp?
mapper.pl from miRDeep2 uses bowtie (1). Seed length is 18 bp, and it allows no mismatches in the seed. It allows 2 mismatches after the seed and also limits the number of mappings reported to 5. Other options are -a, --best and --strata.
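To make the policy described above concrete, here is a toy Python check of "no mismatches in the first 18 bases, at most 2 afterwards". It is only a sketch of the rule as stated in this post, not bowtie's actual algorithm, and the function name and test sequence are made up for illustration.

```python
def passes_seed_policy(read, ref_window, seed_len=18, max_tail_mismatches=2):
    """Toy version of the stated policy (hypothetical helper, not bowtie code).

    The read is compared base by base against an equally long reference
    window (ungapped). Any mismatch inside the seed rejects the alignment;
    mismatches after the seed are tolerated up to max_tail_mismatches.
    """
    if len(read) != len(ref_window):
        raise ValueError("read and reference window must have equal length")
    mismatch_positions = [
        i for i, (a, b) in enumerate(zip(read.upper(), ref_window.upper()))
        if a != b
    ]
    seed_mismatches = sum(1 for i in mismatch_positions if i < seed_len)
    tail_mismatches = len(mismatch_positions) - seed_mismatches
    return seed_mismatches == 0 and tail_mismatches <= max_tail_mismatches


if __name__ == "__main__":
    ref = "TGAGGTAGTAGGTTGTATAGTT"  # 22 nt example sequence
    print(passes_seed_policy(ref, ref))              # exact match
    print(passes_seed_policy("A" + ref[1:], ref))    # mismatch in the seed
```

One consequence of such a rule is that a single wrong base at the 5' end (for example from leftover adapter or a UMI that was not removed) is enough to lose the read, which is why trimming matters so much here.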
If you are willing to try a different aligner then you could try using bbmap.sh with these parameters (ambig=all vslow perfectmode maxsites=1000). Sounds like you have followed instructions from miRNA kit (if you used one) to pre-process your data. This is generally important for miRNA data.
So you think the problem is with the aligner and not the data? I will try bbmap.sh as you suggested, but I honestly regarded bowtie to be a good aligner to use for miRNA reads. I just followed miRDeep's documentation which also seems perfectly reasonable, so I just assumed the data I have is bad. Thanks!
It is certainly possible that you have bad/less than optimal data. If you get a similar result with bbmap that would bolster that conclusion.
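To compare the two aligners on the same footing, one option is to compute the mapping percentage directly from each SAM output. Below is a minimal Python sketch, assuming single-end reads and plain-text SAM. The function name is hypothetical. It counts each read name once, because settings that report all ambiguous hits can produce several lines per read.

```python
def mapping_rate(sam_lines):
    """Percentage of distinct read names with at least one mapped record.

    Assumes single-end data; header lines (starting with '@') are skipped.
    A record counts as mapped when SAM flag bit 0x4 (unmapped) is not set.
    """
    seen = set()
    mapped = set()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue
        fields = line.rstrip("\n").split("\t")
        name, flag = fields[0], int(fields[1])
        seen.add(name)
        if not flag & 0x4:
            mapped.add(name)
    if not seen:
        return 0.0
    return 100.0 * len(mapped) / len(seen)


if __name__ == "__main__":
    import sys
    # Usage (file name is a placeholder): python mapping_rate.py mapped.sam
    with open(sys.argv[1]) as handle:
        print(f"{mapping_rate(handle):.2f}% of reads mapped")
```

If both aligners give roughly the same low percentage, that points at the data or the pre-processing rather than at the aligner settings.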
Yes, this was the case! Thanks!
Sheesh, it was just bad data all along. But now I know how to arrive at that conclusion!
Hi, I was doing the same analysis and I got 0, 0, 0.03 and 0.03 mapped reads. None of the reads were mapped at all! I wonder whether bbmap.sh instead of bowtie increased your mapping quality or not. Thanks!
Have you properly trimmed your reads? Most miRNA kits have specific instructions for this step.
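A quick sanity check on trimming is the read length distribution after adapter removal: mature miRNAs are around 22 nt, so a properly trimmed library usually shows a clear peak there. Here is a minimal Python sketch, assuming uncompressed FASTQ with exactly four lines per record. The function names and the 18-25 nt window are illustrative choices, not part of any kit's instructions.

```python
from collections import Counter


def length_histogram(fastq_lines):
    """Count read lengths, assuming strict 4-line FASTQ records."""
    counts = Counter()
    for i, line in enumerate(fastq_lines):
        if i % 4 == 1:  # the second line of each record is the sequence
            counts[len(line.strip())] += 1
    return counts


def fraction_in_range(counts, lo=18, hi=25):
    """Fraction of reads whose length falls in [lo, hi] (assumed window)."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    in_range = sum(n for length, n in counts.items() if lo <= length <= hi)
    return in_range / total


if __name__ == "__main__":
    import sys
    # Usage (file name is a placeholder): python read_lengths.py trimmed.fastq
    with open(sys.argv[1]) as handle:
        hist = length_histogram(handle)
    for length in sorted(hist):
        print(length, hist[length])
    print("fraction in 18-25 nt:", round(fraction_in_range(hist), 3))
```

If most reads are still at the full sequencing length, or pile up at very short lengths, the adapter sequence or trimming settings are the first thing to revisit.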
Hi!
And the answer is NO! There was some contamination I removed using the BBTools suite, but the mapping percentage was nearly the same. I used the exact same command genomax suggested.