Hi Everyone,
I am working with small RNA-seq data sets from ENCODE, and I used cutadapt to trim the adapters as well as low-quality bases. The file I am working with uses Illumina 1.5 quality encoding. After running cutadapt, I still have reads with N's at their ends. Since I want to run bowtie with 0 mismatches allowed after cutadapt, these reads are thrown out, which I think is not right. How can I trim those N's from the read ends in my cutadapt step?
I am working with this sample
https://www.encodeproject.org/experiments/ENCSR000CUU/
and the command I used for cutadapt was
cutadapt -q 15 -b AAAAAAAAAAAA -m 17 input.fq > output.fq
Of course, I could use bowtie2 in local alignment mode after cutadapt, so that the end bases are soft-clipped and the reads still align. But since I set the quality cutoff to 15, I think the N's should have been trimmed from the ends. How should I deal with this issue? Otherwise I might lose a significant number of reads.
Regards
Varun
If the N's always fall within a certain number of bases of the read end and occur at very high frequency, it might be a good idea to just apply a fixed trim to every single read.
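To make the two options concrete, here is a rough Python sketch of a fixed 3' trim and of stripping only the trailing N's, with the quality string kept in sync. The function names and the `n_bases` parameter are made up for illustration, and the 17-base minimum just mirrors the `-m 17` from the command above. It assumes plain 4-line FASTQ records. Before writing anything custom, it is probably worth checking `cutadapt --help` on your installed version: as far as I know, recent releases have `--trim-n` for N's at read ends, `-u` with a negative value for a fixed 3' trim, and `--quality-base=64` for Illumina 1.5 qualities, which may be why `-q 15` left the N's in place.

```python
def fixed_trim(seq, qual, n_bases):
    """Remove a fixed number of bases from the 3' end of a read."""
    if n_bases <= 0:
        return seq, qual
    return seq[:-n_bases], qual[:-n_bases]


def trim_trailing_n(seq, qual):
    """Strip N/n bases from the 3' end, keeping qualities the same length."""
    trimmed = seq.rstrip("Nn")
    return trimmed, qual[:len(trimmed)]


def process_fastq(lines, min_length=17, n_bases=0):
    """Yield (header, seq, plus, qual) tuples after trimming.

    `lines` is any iterable of FASTQ lines (4 lines per record).
    Reads shorter than `min_length` after trimming are dropped.
    """
    # Group the line stream into consecutive chunks of four lines.
    records = zip(*[iter(lines)] * 4)
    for header, seq, plus, qual in records:
        seq = seq.rstrip("\r\n")
        qual = qual.rstrip("\r\n")
        seq, qual = fixed_trim(seq, qual, n_bases)
        seq, qual = trim_trailing_n(seq, qual)
        if len(seq) >= min_length:
            yield header.rstrip("\r\n"), seq, plus.rstrip("\r\n"), qual
```

You would feed it an open file handle, e.g. `for rec in process_fastq(open("output.fq")): print("\n".join(rec))`, where `output.fq` is the cutadapt output from the question. Stripping only trailing N's keeps more sequence than a fixed trim when the N's do not appear in every read.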