8.0 years ago
Calvin
Dear everyone,
I have a question about my small RNA data.
The raw reads are all 51 bp long.
I used cutadapt to trim one of my small RNA samples with the known 3' adapter:
cutadapt -a TGGAATTCTCGGGTGCCAAGG -o [Output] [Input]
The result shows that only around 41% of the reads contain the adapter. What about the rest of the reads? Can we still consider them small RNAs, given that they might be longer than the read length?
Regards
First off, +1 for L.
So - can you clarify your experiment? How long are your reads, what platform did you use, are they paired or single-ended, and what size range are you expecting?
Hi Brian
My reads are all 51 nucleotides long. They are from Illumina, but I am not sure whether it was a MiSeq or a HiSeq. They are all single-ended. I'm expecting to see roughly 22 to 32 nt, for miRNAs and piRNAs respectively.
You may want to increase sensitivity and see if that increases the trim rate substantially. You can also map the reads and see what happens... if a lot of them don't map, or map with a bunch of mismatches/clipping on the 5' end, that indicates adapters are still present. Otherwise... well, they'd probably be RNAs longer than 51 bp :)
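As a rough cross-check on the trim rate, here is a minimal Python sketch that looks for the adapter from the cutadapt command above, either in full or as a partial prefix at the 3' end of the read, and tallies the implied insert lengths. It uses exact matching only, so it is stricter than cutadapt (which tolerates mismatches), and the function names and the `min_overlap` parameter are made up for illustration.

```python
from collections import Counter

# 3' adapter taken from the cutadapt command in the question
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"


def find_adapter_start(read, adapter=ADAPTER, min_overlap=3):
    """Return the index where the adapter starts in the read, or None.

    Exact matching only; this is an assumption of the sketch, not how
    cutadapt itself scores matches.
    """
    # Case 1: the full adapter occurs somewhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return pos
    # Case 2: the read ends with a prefix of the adapter (partial adapter).
    # Try the longest prefix first so the earliest start position wins.
    for k in range(min(len(adapter) - 1, len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return len(read) - k
    return None


def insert_length_summary(reads, adapter=ADAPTER, min_overlap=3):
    """Return (histogram of insert lengths, number of reads with no adapter)."""
    histogram = Counter()
    untrimmed = 0
    for read in reads:
        start = find_adapter_start(read, adapter, min_overlap)
        if start is None:
            untrimmed += 1
        else:
            histogram[start] += 1
    return histogram, untrimmed
```

If the histogram peaks around the expected 22 to 32 nt and the untrimmed reads show no trace of the adapter even with a small `min_overlap`, that would be consistent with the untrimmed reads being inserts longer than the read length rather than missed adapters.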
Hi Brian
After I mapped the adapter-untrimmed reads (51 bp) with bowtie2, only 31% aligned. For the adapter-trimmed reads (0-51 bp long), only 33.49% aligned. Is that weird?
I guess you need to find out what the other 60+% of the reads are. Is the reference complete and unmasked? If so, they are likely contamination. But it sounds like (31% of) your 51bp reads are probably RNAs longer than 51bp; you might be getting a slightly lower alignment rate than trimmed reads because some of them have untrimmed adapter sequence.
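One simple way to start finding out what the unaligned reads are is to collapse them to unique sequences and look at the most abundant ones, which can then be searched against a database by hand. The sketch below assumes a plain 4-line-per-record FASTQ; the helper names are hypothetical.

```python
from collections import Counter


def read_fastq_sequences(lines):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:  # second line of every record is the sequence
            yield line.strip()


def top_sequences(lines, n=10):
    """Return the n most frequent sequences as (sequence, count, fraction)."""
    counts = Counter(read_fastq_sequences(lines))
    total = sum(counts.values())
    return [(seq, c, c / total) for seq, c in counts.most_common(n)]
```

A usage example, assuming the unaligned reads were written to a file named `unaligned.fastq` (a placeholder name; bowtie2's `--un` option can produce such a file):

```python
with open("unaligned.fastq") as handle:
    for seq, count, fraction in top_sequences(handle, n=20):
        print(f"{count}\t{fraction:.3%}\t{seq}")
```

A handful of sequences dominating the list would point toward a specific contaminant (for example rRNA or adapter dimers), whereas a flat distribution would suggest something more diffuse.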
Hi Brian
It turns out that those 60% of the reads are contamination. However, since I have four replicates of this sample with less contamination, I guess it won't affect my downstream analysis much.