Hi,
I am attempting to analyse some small RNA sequencing data produced using an Illumina TruSeq Small RNA Library Preparation Kit. The RNA was isolated from sheep serum. Pre-sequencing QC was fine and post-sequencing QC looked good too, apart from an unusual sequence length distribution. We were mostly expecting miRNAs, and so peaks at ~20-25 nt, but instead we have peaks at 44 nt and 60-62 nt.
Additionally, when miRDeep2 was used to map these reads to the Ovis aries genome, we got extremely low mapping rates (below 0.1%). I also tried BLASTing some of the sequences and got no results. Is this kind of sequence length distribution indicative of any particular issue? We thought it might be poor trimming, but there appears to be no adapter contamination. Any advice is much appreciated!
Many thanks, CW.
Hello,
Does this kit include enrichment steps and add a unique molecular identifier (UMI) to the fragments, or something similar?
Hi Olli,
We didn't use the kit ourselves; the library prep was done by the university core genomics unit. I don't believe UMIs were added, but I will ask the core genomics unit to confirm.
Did you first look for the TruSeq small RNA adapter sequence (TGGAATTCTCGGGTGCCAAGG)? Any reads that do not contain this adapter are likely not usable. Once you have found the reads that contain it, you will need to trim the adapter and then align the remaining sequence using an ungapped alignment.
Did you resolve your problem? I have the same profile and I don't know if the length corresponds to miRNA + the adapter sequence. Does anybody know?
The answer is: probably. Check which kit was used for the library prep and then trim the data accordingly.
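For anyone wanting a quick sanity check before running a proper trimmer, here is a minimal Python sketch of the adapter check and trim suggested above. It assumes the 3' adapter is the TruSeq small RNA sequence quoted in this thread and that reads are already in memory as plain strings. The function names and the `min_overlap` parameter are made up for illustration, and it only does exact matching, so a dedicated tool such as cutadapt (which tolerates mismatches) is the better choice for real data.

```python
from collections import Counter

# 3' adapter quoted earlier in the thread (assumed to be the one used in the prep)
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"


def trim_adapter(read, adapter=ADAPTER, min_overlap=10):
    """Return the insert that precedes the adapter, or None if no adapter is found.

    Exact matching only: first look for the full adapter anywhere in the read,
    then for a partial adapter (at least min_overlap bases) at the 3' end.
    """
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # The read may end before the adapter is complete, so test adapter prefixes.
    for k in range(len(adapter) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    return None


def insert_length_distribution(reads):
    """Count insert lengths after trimming and the number of reads lacking the adapter."""
    lengths = Counter()
    no_adapter = 0
    for read in reads:
        insert = trim_adapter(read)
        if insert is None:
            no_adapter += 1
        else:
            lengths[len(insert)] += 1
    return lengths, no_adapter


if __name__ == "__main__":
    # Synthetic reads for illustration only, not real data.
    fake_insert = "ACGTACGTACGTACGTACGTAC"  # 22 nt placeholder insert
    reads = [
        fake_insert + ADAPTER + "ATCT",  # full adapter present
        fake_insert + ADAPTER[:12],      # adapter truncated by read length
        "ACGT" * 10,                     # no adapter at all
    ]
    lengths, no_adapter = insert_length_distribution(reads)
    print(dict(lengths), "reads without adapter:", no_adapter)
```

The adapter is 21 nt long, so if the 44 nt peak were insert plus the complete adapter, the insert would be 44 - 21 = 23 nt, which is in the expected miRNA range. That is only a hypothesis to test: the trimmed length distribution and the fraction of reads without the adapter will show whether it holds.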