microRNA-seq sequence length distribution after trimming
5.9 years ago
szabo.marton ▴ 10

Hi! I'm doing a microRNA-seq analysis. After trimming, the sequence length distribution panel in FastQC shows two peaks, at 24 nt and 36 nt. Here is my trimming command: cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCA -e 0.1 -O 5 -o H28_trim_test3.fastq H28.fastq

After alignment with miRDeep2 (using the mapper.pl and quantifier.pl modules), I got a mapped/unmapped ratio of 0.62/0.38.

I'm wondering whether this alignment ratio is acceptable for mammalian microRNAs, or whether I should try to improve it. Also, if I can somehow get rid of the 36 nt peak, will that improve the mapping ratio? I think so, but I haven't figured out how yet.

alignment rna-seq • 1.6k views

This alignment rate is common. As far as I remember, yes, the 24 nt reads map at a higher rate than the 36 nt reads, and the composition differs as well: the 24 nt reads correspond mainly to miRNA, while the 36 nt reads correspond to piRNA.

You could filter out the longer reads from the sam / bam file (possibly using samtools view and then awk, there are solutions to similar problems here on BioStars), or you could filter the fastq and map all over again. But I wouldn't worry about them.

P.S.: probably reformat.sh from the BBMap/BBTools package can filter by length both sam / bam and fastq.
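If you go the "filter the fastq and map all over again" route, a minimal sketch in plain Python could look like the one below. It is not taken from any of the tools mentioned here; the function name and the 18-30 nt window are assumptions for illustration, so adjust them to your data. It expects plain 4-line FASTQ records.

```python
def filter_fastq_by_length(in_handle, out_handle, min_len=18, max_len=30):
    """Copy only FASTQ records whose read length is within [min_len, max_len].

    Assumes plain 4-line FASTQ records (no wrapped sequence lines).
    The default window is an illustrative assumption, not a recommendation.
    Returns a (kept, total) tuple of record counts.
    """
    kept = total = 0
    while True:
        header = in_handle.readline()
        if not header.strip():
            break  # end of file (or a trailing blank line)
        seq = in_handle.readline()
        plus = in_handle.readline()
        qual = in_handle.readline()
        total += 1
        if min_len <= len(seq.rstrip("\n")) <= max_len:
            out_handle.write(header + seq + plus + qual)
            kept += 1
    return kept, total


if __name__ == "__main__":
    # Input name is from the question; the output name is hypothetical.
    with open("H28_trim_test3.fastq") as fin, \
            open("H28_trim_len_filtered.fastq", "w") as fout:
        kept, total = filter_fastq_by_length(fin, fout)
        print(f"kept {kept} of {total} reads")
```

Filtering before mapping also lets you compare the mapped fraction with and without the 36 nt reads.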


Thanks for your advice! I'll look into that in more detail. It would be nice if we had sequenced piRNAs as well, but in my case the read length is only 36 nt, so I guess the chance of getting full, or at least reliable, piRNA sequences is very low. Also, the lab performed a size selection for 24 nt before sequencing.


What is the read length, which kit was used, and are you sure this adapter is correct? If I am not mistaken, this is the standard TruSeq adapter, not the standard smallRNA adapter. Please give some details.


Thanks for your answer! The read length was 36 nt. I was told to use Trimmomatic to remove the adapter sequences, because it ships with them, but it did nothing (maybe I did something wrong). So I searched for the TruSeq adapters and found that adapter sequence. As for the kit, I don't know exactly which one was used, but the sequencing was performed on an Illumina platform with a MiSeq reagent kit, and NEBNext products were used for the RNA preparation.


smallRNA preps generally require special handling of the data (mainly trimming of the adapter). Check the documentation for the exact kit used for instructions on how to handle the data.


I checked the kit: we used NEBNext Adaptors and Primers for Illumina (NEB #E7300). Here is the documentation with the adapter sequences: https://international.neb.com/-/media/catalog/datacards-or-manuals/manuale7300_e7330_e7560_e7580.pdf#page=24 Anyway, I'm still confused about which sequence I should trim.


Depending on how this kit works (e.g. direct adapter ligation to RNA), you may want to retain only those reads that contain the adapter noted by @ATPoint and then trim the adapter off to get your RNA. Those would be the reads of interest. You can do this easily with bbduk.sh from BBMap.
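To make the "keep only reads with the adapter, then trim it" idea concrete, here is a rough sketch in plain Python. It is not how bbduk.sh or cutadapt work internally: it uses exact matching only (no mismatches), and the function name and the minimum overlap of 5 are assumptions for illustration.

```python
def trim_3prime_adapter(seq, qual, adapter, min_overlap=5):
    """Trim a 3' adapter from one read using exact matching only.

    Returns (trimmed_seq, trimmed_qual) if the adapter, or a prefix of it
    at least min_overlap long at the very end of the read, is found.
    Returns None if no adapter is found, so the caller can drop the read.
    """
    # Case 1: the complete adapter occurs somewhere in the read.
    pos = seq.find(adapter)
    if pos == -1:
        # Case 2: the read ends part-way through the adapter.
        # Try the longest possible adapter prefix first.
        for k in range(min(len(adapter), len(seq)), min_overlap - 1, -1):
            if seq.endswith(adapter[:k]):
                pos = len(seq) - k
                break
    if pos == -1:
        return None
    return seq[:pos], qual[:pos]


# Illustrative 36 nt read: a 22 nt insert followed by 14 nt of adapter.
adapter = "TGGAATTCTCGGGTGCCAAGG"
read = "TAGCTTATCAGACTGATGTTGA" + adapter[:14]
print(trim_3prime_adapter(read, "I" * len(read), adapter))
```

With 36 nt reads, most inserts leave only part of the adapter in the read, which is why the partial match at the 3' end matters more than the full-length match.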


As this is smallRNA, I would go for the smallRNA adapter sequence. Use TGGAATTCTCGGGTGCCAAGG, then check the adapter content with FastQC.
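If it is unclear which adapter is actually in the reads, one quick check besides FastQC is to count how many raw reads contain the start of each candidate. The sketch below is only an illustration: the helper names and the 12 nt prefix length are assumptions, and matching is exact. The two adapter sequences are the ones quoted in this thread.

```python
def read_sequences(path, limit=100000):
    """Yield up to `limit` sequences from a plain 4-line FASTQ file."""
    with open(path) as handle:
        for i, line in enumerate(handle):
            if i // 4 >= limit:
                break
            if i % 4 == 1:  # second line of each record is the sequence
                yield line.strip()


def adapter_hit_fraction(reads, adapter, prefix_len=12):
    """Fraction of reads containing the first prefix_len bases of adapter."""
    prefix = adapter[:prefix_len]
    reads = list(reads)
    if not reads:
        return 0.0
    hits = sum(1 for read in reads if prefix in read)
    return hits / len(reads)


if __name__ == "__main__":
    candidates = {
        "TruSeq": "AGATCGGAAGAGCACACGTCTGAACTCCAGTCA",
        "smallRNA": "TGGAATTCTCGGGTGCCAAGG",
    }
    reads = list(read_sequences("H28.fastq"))  # untrimmed file from the question
    for name, adapter in candidates.items():
        print(name, round(adapter_hit_fraction(reads, adapter), 3))
```

Note that with 36 nt reads a 12 nt prefix can only be seen in full when the insert is 24 nt or shorter, so the fractions are a lower bound. The candidate with the clearly higher fraction is the one to pass to the trimmer.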
