Hi, I received miRNA RNA-seq reads from a company and have the following questions about the sequences. According to the company, the following sequencing kit was used: https://www.neb.com/faqs/2012/07/30/which-is-the-sequence-of-the-final-pcr-product
Example of the sequences:
Overrepresented sequences
Sequence                          Count    Percentage          Possible Source
(empty)                           2733553  24.346957954015203  No Hit
GGCTGGTCCGATGGTAGTGGGTTATCAGAACT  1389076  12.372094108631375  No Hit
TCCTGTACTGAGCTGCCCCGA             498271   4.437954225400096   No Hit
What is the difference between "adapter" and "index" sequences? How do I trim them using Trimmomatic or cutadapt? The index sequences differ between samples. However, a few reads show no index sequence at all. Shouldn't it be there?
What are the empty sequences shown under "Overrepresented sequences"? How do I remove them?
How do I make sure that I remove the necessary adapter and index sequences using cutadapt or Trimmomatic? The reads are 37 bp or longer, and after removing the adapter and index the remaining sequences were still longer than 25 bp. For miRNA, the sequences should be short, if I am not wrong, so I think I am doing something wrong.
Take a look at the Trimmomatic manual, chapter "ILLUMINACLIP":
http://www.usadellab.org/cms/uploads/supplementary/Trimmomatic/TrimmomaticManual_V0.32.pdf
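To make the idea of 3' adapter clipping concrete, here is a minimal Python sketch. It is only an illustration of the principle, not what Trimmomatic or cutadapt do internally (both also tolerate mismatches and use quality scores). The adapter sequence, the `min_overlap` value, and the 15 to 30 nt length window are my assumptions; check the kit documentation for the real adapter. The function names are made up for this example.

```python
# ASSUMPTION: 3' adapter sequence; replace it with the one documented for your kit.
ADAPTER = "AGATCGGAAGAGCACACGTCT"


def trim_3prime_adapter(read, adapter=ADAPTER, min_overlap=5):
    """Remove the adapter, and everything after it, from the 3' end of a read.

    Exact matching only; real trimmers also allow mismatches.
    """
    # Case 1: the complete adapter occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Case 2: the read ends with a prefix of the adapter (adapter cut off
    # by the read length). Try the longest possible prefix first.
    for k in range(len(adapter) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]
    # No adapter found: return the read unchanged.
    return read


def keep_read(trimmed, min_len=15, max_len=30):
    """Keep only reads whose trimmed length is plausible for a miRNA (assumed window)."""
    return min_len <= len(trimmed) <= max_len


if __name__ == "__main__":
    insert = "TCCTGTACTGAGCTGCCCCGA"            # 21 nt sequence from the question
    read = insert + ADAPTER[:16]                # 37 nt read: insert + partial adapter
    adapter_only = ADAPTER + "GAACTC"           # read without any insert

    for r in (read, adapter_only):
        t = trim_3prime_adapter(r)
        print(repr(t), len(t), "keep" if keep_read(t) else "discard")
```

A read that consists of adapter only is trimmed to length 0, which is one plausible source of empty sequences after trimming; a minimum-length filter discards such reads. In cutadapt, the corresponding options are `-a` for the 3' adapter and `-m` for the minimum length. If the reads are still longer than 25 bp after trimming, it is worth checking that the adapter sequence supplied to the tool really matches the kit.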