Hi everyone,
I'm pretty new to miRNA analysis. The data I have comes from a SMARTer small RNA Seq kit for Illumina (50 cycles).
The first step in the library prep is to polyadenylate the input RNA, and then adapters are added. The Sequencing guidelines section says that a significant proportion of reads will end with stretches of As (up to 15 As), which should be trimmed prior to mapping. So if I'm right, I have to remove the poly(A) tails from the sequences, and any adapters (or pieces of them) will automatically be removed too, as they come "after" the poly(A) tails. Also, the kit instructions specify that the first three nucleotides of the first sequencing read are derived from the template-switching oligo, so these three nucleotides must be trimmed prior to mapping. I used cutadapt and did this:
I cut the first three nucleotides from each read. I used the -A option to trim poly(A) tails. Before that, I searched miRBase for microRNAs containing stretches of As, to be sure I wasn't trimming sequences that could map to microRNAs. Then I trimmed every sequence containing a stretch of 7 As (or more).
The next step was to discard sequences shorter than 16 nucleotides and then trim every remaining sequence to a maximum length of 28 nucleotides.
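To make the steps concrete, here is a rough Python sketch of the logic described above. It is not the cutadapt command I ran, just an illustration. The function and constant names are made up, and it assumes the poly(A) tail starts at the first stretch of 7 or more As found after the first three bases have been cut.

```python
import re

# Values taken from the description above; the names themselves are hypothetical.
TSO_BASES = 3   # leading bases derived from the template-switching oligo
MIN_POLYA = 7   # a stretch of this many As (or more) is treated as the poly(A) tail
MIN_LEN = 16    # reads shorter than this after trimming are discarded
MAX_LEN = 28    # longer reads are truncated to this length

# Matches the first stretch of MIN_POLYA or more consecutive As.
POLYA_RE = re.compile("A{%d,}" % MIN_POLYA)


def trim_read(seq):
    """Return the trimmed sequence, or None if the read should be discarded."""
    # 1. Drop the first three nucleotides.
    seq = seq.upper()[TSO_BASES:]
    # 2. Cut at the poly(A) stretch; anything after it (adapter) goes too.
    match = POLYA_RE.search(seq)
    if match:
        seq = seq[:match.start()]
    # 3. Discard reads that are now too short.
    if len(seq) < MIN_LEN:
        return None
    # 4. Truncate whatever is left to at most MAX_LEN nucleotides.
    return seq[:MAX_LEN]
```

For example, a read made of three leading bases, a 22-nt insert, 15 As and some adapter sequence should come back as just the 22-nt insert, while a read whose insert is only 10 nt long is dropped.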
How badly am I doing this? I don't understand exactly how the alignment algorithms used by bowtie (for example) work, or how these extra As or extra nucleotides (after the microRNA sequence) can affect miRNA mapping.
Thanks a lot for your answers.
Hi, I have analyzed quite a few CLIP-seq datasets, but not small RNAs. I would suggest using STAR. It is fast, robust and works even without trimming. Here are a few things you could try: with trimmed or untrimmed reads, map straight to the transcriptome/ncRNAs, or to the genome with a GTF annotation. Then compare the statistics in the Log.final.out files.
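For the comparison, something along these lines might help. It is only a minimal sketch: it assumes the usual Log.final.out layout, where each statistic sits on its own line as "name | value", and the function names and the run labels in the usage note are made up for illustration.

```python
def parse_star_log(text):
    """Parse the 'name | value' lines of a STAR Log.final.out into a dict."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers and blank lines carry no value
        key, value = line.split("|", 1)
        stats[key.strip()] = value.strip()
    return stats


def compare_runs(logs, keys):
    """Collect selected statistics across runs.

    logs: {run_name: contents of that run's Log.final.out}
    keys: statistic names to pull out
    Returns {key: {run_name: value or None if missing}}.
    """
    parsed = {name: parse_star_log(text) for name, text in logs.items()}
    return {k: {name: p.get(k) for name, p in parsed.items()} for k in keys}
```

You would read each file with open(path).read(), pass the contents in under labels such as "trimmed_genome" or "untrimmed_ncRNA" (hypothetical names), and ask for lines like "Uniquely mapped reads %" and "% of reads unmapped: too short" to see how trimming changes the mapping rate.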
By the way, I'll be doing the library prep on my own and still haven't decided which kit to use. The SMARTer small RNA Seq Kit looks promising. Could we have a private discussion about this?