Question

Trimming read length & quality in a FASTQ file

9.9 years ago

ChIP ▴ 600

Hi all,

I have a slightly unusual computational problem: I need to compare a single-end RNA-seq sample with a paired-end RNA-seq sample. My approach is to take the forward reads from the paired-end sample, trim that FASTQ file by read length and quality score, map the reads with BWA, and then do the usual RPKM estimation.

But how can I trim the read length and quality score?

Are there any one-liners in Perl or awk to do this, or is there something in Picard tools?

Please share your experience and knowledge.

Thank you

RNA-Seq next-gen • 9.3k views

updated 2.1 years ago by Ram 45k • written 9.9 years ago by ChIP ▴ 600

https://github.com/lh3/seqtk

http://www.usadellab.org/cms/?page=trimmomatic

If both reads of each pair are in the same file, first separate them and then trim the forward reads.
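The separation step can be sketched in Python as follows. This is only an illustration under stated assumptions: the file is a 4-line-per-record FASTQ that is strictly interleaved (forward read, then its mate, and so on), and the function names `read_fastq` and `split_interleaved` are hypothetical, not part of seqtk or Trimmomatic.

```python
from itertools import islice


def read_fastq(lines):
    """Yield (header, seq, plus, qual) tuples from an iterable of FASTQ lines.

    Assumes the simple 4-lines-per-record layout (no wrapped sequences).
    """
    it = iter(lines)
    while True:
        record = [line.rstrip("\n") for line in islice(it, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(record)


def split_interleaved(lines):
    """Split interleaved records (R1, R2, R1, R2, ...) into two lists.

    Assumption: records alternate strictly between forward and reverse reads.
    """
    forward, reverse = [], []
    for i, rec in enumerate(read_fastq(lines)):
        # Even-indexed records are forward reads, odd-indexed are their mates.
        (forward if i % 2 == 0 else reverse).append(rec)
    if len(forward) != len(reverse):
        raise ValueError("unpaired read at end of interleaved input")
    return forward, reverse
```

For a real file you would pass an open file handle, for example `split_interleaved(open("sample.fastq"))`, where the file name is a placeholder. The lists are held in memory, so for large files it would be better to write each record out as it is read.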

Suggestion: If your goal is to compare the end results (RPKM) of these two sets of files, I don't think there is any point in considering only the forward reads. You can compare a fragment library with a paired-end library directly and look at their complexity, contamination, etc. Also, please use a splice-aware aligner such as TopHat or STAR if you want to count the reads spanning exons.

updated 2.7 years ago by Ram 45k • written 9.9 years ago by Ashutosh Pandey 12k

If the pairs are in separate files, you could just work on each file and use substr() in awk when NR%2==0 (the sequence and quality lines), no?
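The same idea (cut the sequence and quality lines to a fixed length) can be sketched in Python, with a simple quality step added. This is a rough illustration, not what any of the tools mentioned in this thread do internally: it assumes 4-line FASTQ records and Phred+33 quality encoding, and the names `trim_record` and `trim_fastq` and the defaults `max_len=50` and `min_q=20` are made up for the example. The quality step only removes trailing bases below the cutoff from the 3' end.

```python
def trim_record(header, seq, plus, qual, max_len=50, min_q=20, offset=33):
    """Clip one read to max_len, then drop low-quality bases from the 3' end.

    Assumption: qualities are Phred+33 encoded (offset=33).
    """
    # Hard clip: the equivalent of substr() on the sequence and quality lines.
    seq, qual = seq[:max_len], qual[:max_len]
    # Walk back from the 3' end while the base quality is below min_q.
    end = len(qual)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return header, seq[:end], plus, qual[:end]


def trim_fastq(lines, max_len=50, min_q=20, offset=33):
    """Yield trimmed FASTQ lines (without newlines) from an iterable of lines.

    Records are read four lines at a time; an incomplete trailing record
    is silently ignored.
    """
    it = (line.rstrip("\n") for line in lines)
    for header, seq, plus, qual in zip(it, it, it, it):
        yield from trim_record(header, seq, plus, qual, max_len, min_q, offset)
```

A read whose bases all fall below the cutoff comes out with an empty sequence, so you may want to filter out very short reads before mapping.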

2.1 years ago by Ram 45k

Ram · Answer 1 · 2015-05-02

9.9 years ago

gufernandez10 ▴ 10

Hi, I'm doing something similar. I'm using cutadapt to trim reads; it's a good option and easy to use.

updated 2.1 years ago by Ram 45k • written 9.9 years ago by gufernandez10 ▴ 10