Removing low quality ends from a fastq file
9.2 years ago
Varun Gupta ★ 1.3k

Hi Everyone,

I am working with small RNA-seq data sets from ENCODE, and I used cutadapt to trim adapters as well as low-quality bases. The file I am working with uses Illumina 1.5 quality encoding. After running cutadapt, I still have reads with N's at their ends. Since I then want to run bowtie with 0 mismatches allowed, these reads are thrown out, which I think is not right. How can I trim those N's from the read ends in my cutadapt step?

I am working with this sample

https://www.encodeproject.org/experiments/ENCSR000CUU/

and the command I used for cutadapt was

cutadapt -q 15 -b AAAAAAAAAAAA -m 17 input.fq > output.fq

Of course, I could use bowtie2 in local alignment mode after cutadapt so that the end bases are soft-clipped and the reads still align, but since I set the quality cutoff to 15, I think the N's should have been trimmed from the ends. How should I deal with this issue, since I might lose a significant number of reads?

Regards
Varun
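
For illustration, the trimming asked about above can be sketched in plain Python as a post-processing step on the cutadapt output. This is only a minimal sketch, not cutadapt's own behaviour: the function names `trim_n_ends` and `trim_fastq_records` are hypothetical, the input is assumed to be a plain four-line-per-record FASTQ, and the minimum length of 17 simply mirrors the `-m 17` in the command above. Two things are worth checking first in the cutadapt documentation for your installed version: newer releases describe a `--trim-n` option for exactly this purpose, and because Illumina 1.5 files use a quality offset of 64 while cutadapt assumes 33 by default, `-q 15` may not trim as intended unless `--quality-base=64` is given.

```python
def trim_n_ends(seq, qual):
    """Strip N bases from both ends of a read, keeping qualities aligned."""
    start = 0
    end = len(seq)
    # Advance past leading Ns.
    while start < end and seq[start] in "Nn":
        start += 1
    # Step back over trailing Ns.
    while end > start and seq[end - 1] in "Nn":
        end -= 1
    return seq[start:end], qual[start:end]


def trim_fastq_records(lines, min_length=17):
    """Yield N-trimmed FASTQ records as (header, seq, plus, qual) tuples.

    Assumes exactly four lines per record; reads shorter than
    min_length after trimming are dropped.
    """
    records = [line.rstrip("\n") for line in lines]
    for i in range(0, len(records) - 3, 4):
        header, seq, plus, qual = records[i:i + 4]
        seq, qual = trim_n_ends(seq, qual)
        if len(seq) >= min_length:
            yield header, seq, plus, qual
```

Internal N's are left untouched here, so a read with an N in the middle would still fail a 0-mismatch alignment; only the ends are cleaned up.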

RNA-Seq trimming • 2.9k views

If the Ns are always within a certain number of bases of the end of the read, and occur with very high frequency, it might be a good idea to just do a fixed trim on every single read.
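
A rough way to decide whether a fixed trim is justified, and to apply it, might look like the sketch below. The helper names `fraction_with_terminal_n` and `fixed_trim` and the window size are hypothetical choices for illustration only. cutadapt itself documents a `-u`/`--cut` option that, given a negative value, removes that many bases from the end of every read; check the documentation for your installed version.

```python
def fraction_with_terminal_n(seqs, window=3):
    """Return the fraction of reads with at least one N in the last `window` bases."""
    if window <= 0:
        raise ValueError("window must be positive")
    seqs = list(seqs)
    if not seqs:
        return 0.0
    hits = sum(1 for s in seqs if "N" in s[-window:].upper())
    return hits / len(seqs)


def fixed_trim(seq, qual, n_bases=3):
    """Remove the last n_bases from a read and its quality string."""
    if n_bases <= 0:
        return seq, qual
    return seq[:-n_bases], qual[:-n_bases]
```

If the fraction is close to 1 for a small window, trimming that many bases from every read is probably simpler than per-read N trimming; if it is low, a fixed trim would discard good bases from most reads.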
