Dear all,
I would appreciate some input from your experience in the field of NGS.
We have a SOLiD sequencer, which produces color-space reads. We prepared libraries following the manufacturer's protocol and obtained miR-Seq libraries. To analyze these reads, I first trimmed the adapter from the 3' end and then aligned the reads to the genome. Next, to check library quality, I counted the reads falling within the genomic coordinates of each Ensembl biotype and plotted the percentages in a pie chart (see the attached figure). The results are very disappointing: each library contains only a few miRNA reads, while other transcripts (such as protein_coding genes) are present in large amounts.
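The biotype QC step described above can be sketched roughly as follows. This is a minimal Python sketch, assuming each aligned read has already been assigned exactly one Ensembl biotype label; the function name `biotype_percentages` and the example labels are hypothetical and are not the actual pipeline or data.

```python
from collections import Counter


def biotype_percentages(read_biotypes):
    """Return {biotype: percent of reads}, given one biotype label per aligned read."""
    counts = Counter(read_biotypes)
    total = sum(counts.values())
    if total == 0:
        # No aligned reads: nothing to report
        return {}
    return {biotype: 100.0 * n / total for biotype, n in counts.items()}


if __name__ == "__main__":
    # Made-up labels for illustration only, not real data
    example = ["protein_coding"] * 6 + ["rRNA"] * 3 + ["miRNA"] * 1
    percentages = biotype_percentages(example)
    # Print biotypes from most to least abundant
    for biotype, pct in sorted(percentages.items(), key=lambda kv: -kv[1]):
        print(f"{biotype}\t{pct:.1f}%")
```

The resulting dictionary is what would be passed to a plotting function to draw the pie chart.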
So I need to figure out why and how this happened:
- Because miRNAs are so short, was the aligner unable to place the reads in miRNA genes, mapping them instead to other parts of the genome such as protein_coding or non-coding RNA species?
- Is this really contamination in my libraries? In other words, was my library preparation poor?
- What should be done with the reads that mapped to other transcripts?
Note: I quantified miRNAs using the miRBase gff3 file, and most miRNAs had zero or nearly zero reads.
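The miRBase quantification mentioned in the note can be sketched along these lines. This is a simplified Python sketch, not the actual tool used: it assumes reads are available as hypothetical `(chrom, start, end, strand)` tuples with 1-based inclusive coordinates, and it keeps only gff3 rows whose feature type is `miRNA`. The function names and the example record are invented for illustration.

```python
def parse_mirna_features(gff3_lines, feature_type="miRNA"):
    """Collect (chrom, start, end, strand, name) for rows of the given feature type."""
    features = []
    for line in gff3_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and gff3 header/comment lines
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 9 or cols[2] != feature_type:
            continue
        # Column 9 holds key=value pairs separated by ';'
        attrs = dict(kv.split("=", 1) for kv in cols[8].split(";") if "=" in kv)
        name = attrs.get("Name", attrs.get("ID", "unknown"))
        features.append((cols[0], int(cols[3]), int(cols[4]), cols[6], name))
    return features


def count_reads_per_mirna(reads, features):
    """Count reads overlapping each feature on the same chromosome and strand."""
    counts = {feature[4]: 0 for feature in features}
    for chrom, start, end, strand in reads:
        for f_chrom, f_start, f_end, f_strand, name in features:
            same_locus = chrom == f_chrom and strand == f_strand
            if same_locus and start <= f_end and end >= f_start:
                counts[name] += 1
    return counts


if __name__ == "__main__":
    # Invented gff3 record and reads, for illustration only
    gff3 = [
        "##gff-version 3",
        "\t".join(["chr1", ".", "miRNA", "100", "121", ".", "+", ".",
                   "ID=example_id;Name=example-miR-5p"]),
    ]
    reads = [("chr1", 98, 119, "+"), ("chr1", 100, 121, "-"), ("1", 100, 121, "+")]
    print(count_reads_per_mirna(reads, parse_mirna_features(gff3)))
```

Note that this comparison requires identical chromosome names and strands on both sides, so a read on `1` is not counted against a feature on `chr1`.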
Please find the attached figure here: https://www.dropbox.com/s/cp1f90nwemmnnnr/pie.png?dl=0
Please see my answer below. I am new to Biostars and finding my way around!
Yes, we size-selected the libraries (35 bp long) and also checked the RNA quality before making them. All of these prerequisite steps were done following the manufacturer's protocol, and my colleague is confident that the RNA quality was good enough to capture small RNAs.