Hi all,
This is my first time analysing miRNA data. I have some miRNA data (species: Bos taurus, single-end, read length 75 bp). I double-checked with the sequencing facility, and they said I should trim the adapter from the Illumina HiSeq 2000 miRNA protocol at the 3' end. I tried to trim the adapter using this command:
trim_galore -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC --stringency 6 001.fastq
Then I mapped 001_trimmed.fastq using hisat2 and got only a 44.23% overall alignment rate. I checked the read length distribution after trimming, and the peak is at 44 bp.
I have no idea why the mapping rate is so low. Could anyone please help me with this?
Many thanks~
Isn't 44 bp a bit long for miRNAs? I would expect them to be more in the 20-29 nt range. It sounds like there is still something else in your reads that isn't RNA. Does your protocol have some adaptor other than the sequencing adaptor? Also, you should make sure to throw out anything that doesn't contain the adaptor. With 75 bp reads, every read that contains an miRNA is long enough to run into the 3' adaptor, so it should contain the adaptor.
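To make the adaptor check concrete, here is a minimal Python sketch. It assumes an uncompressed, four-lines-per-record FASTQ file and exact matching of the first few adaptor bases; real trimmers tolerate mismatches, so treat the numbers as a rough sanity check only. The function names and the min_match parameter are made up for illustration, and the adaptor string is whatever your library prep actually used.

```python
from collections import Counter
from typing import Iterable, Iterator, Tuple


def read_fastq(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (header, sequence, quality) from 4-line FASTQ records."""
    it = iter(lines)
    for header in it:
        if not header.strip():
            continue  # skip blank lines, e.g. at the end of the file
        seq = next(it).rstrip("\n")
        next(it)  # '+' separator line
        qual = next(it).rstrip("\n")
        yield header.rstrip("\n"), seq, qual


def insert_lengths(lines: Iterable[str], adapter: str, min_match: int = 10):
    """Tally insert lengths for reads that contain the adaptor.

    Returns (Counter mapping insert length -> read count,
             number of reads with no adaptor found).
    Assumption: the adaptor is detected by an exact match of its
    first `min_match` bases; no mismatches are allowed.
    """
    seed = adapter[:min_match]
    lengths = Counter()
    no_adapter = 0
    for _, seq, _ in read_fastq(lines):
        pos = seq.find(seed)
        if pos == -1:
            no_adapter += 1  # candidate for discarding
        else:
            lengths[pos] += 1  # bases before the adaptor = insert length
    return lengths, no_adapter
```

You could call it as insert_lengths(open("001.fastq"), "AGATCGGAAGAGCACACGTCTGAACTCCAGTCAC") on the untrimmed file. If most reads land in the no-adaptor count, or the length peak is far from about 22 nt, that would suggest the adaptor sequence being trimmed is not the one actually present in the library.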
The Illumina prep cannot enrich for miRNAs per se; it simply targets (roughly) the 20-40 nt range in the gel, and you get what you get. miRNAs will be in there, but they are not guaranteed to be the dominant species.
It's been a while since I last did miRNA analysis, but when I did, the size distribution was pretty tight around 22 nt. Maybe we didn't use the Illumina prep. This was the size distribution on the last analysis I did.
In fact, I highly recommend the SequenceImp miRNA processing pipeline. Its suite of QC plots would be very helpful in solving problems like this.