Filtering FASTQ (by quality score and length): optimal criteria?
9.2 years ago

What are the optimal criteria for filtering FASTQ files of RNA-seq data by quality score and length in Galaxy?

Specifically: the minimum and maximum read size, the minimum and maximum quality, and the maximum number of bases allowed outside the quality range.

My datasets are from human samples, sequenced on a HiSeq 2000 in a paired-end experiment (two separate files per sample).
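
To make the criteria named above concrete, here is a minimal Python sketch of how such a filter could decide whether to keep a read. The function names, the default values, and the Phred+33 offset are assumptions for illustration only; this is not Galaxy's implementation, and the numbers are placeholders rather than recommended thresholds.

```python
def passes_filter(quality_line, min_size=0, max_size=None,
                  min_quality=0, max_quality=None,
                  max_bases_outside_range=0, phred_offset=33):
    """Return True if a read meets the length and quality criteria.

    quality_line is the fourth line of a FASTQ record. The Phred+33
    offset is an assumption; older Illumina pipelines used Phred+64.
    """
    scores = [ord(ch) - phred_offset for ch in quality_line]
    length = len(scores)

    # Length criteria: minimum and maximum read size.
    if length < min_size:
        return False
    if max_size is not None and length > max_size:
        return False

    # Count bases whose quality falls outside [min_quality, max_quality].
    outside = sum(
        1 for q in scores
        if q < min_quality or (max_quality is not None and q > max_quality)
    )
    return outside <= max_bases_outside_range


def keep_pair(quality_1, quality_2, **criteria):
    """Hypothetical helper: keep a read pair only if both mates pass,
    so the two paired-end files stay synchronized."""
    return (passes_filter(quality_1, **criteria)
            and passes_filter(quality_2, **criteria))
```

For example, `passes_filter("IIII####", min_size=5, min_quality=20, max_bases_outside_range=4)` keeps the read, while lowering `max_bases_outside_range` to 3 rejects it.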

RNA-Seq galaxy • 4.0k views
9.2 years ago

Assuming that you are using an alignment-based workflow (and not de novo assembly), we generally do not filter reads at all, and we trim read tails only very lightly. The alignment process itself is a great filter.


Thank you a lot.

9.2 years ago
Ian 6.1k

As a core facility, we generally run our sequences through Trimmomatic to remove adapter sequence (the most important step, to avoid mapping errors) and to trim reads where a sliding 4-nucleotide window has a mean quality score below 20.
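
The sliding-window rule described above can be sketched in a few lines of Python. This is a simplified illustration, not Trimmomatic's actual code: it assumes Phred+33 quality encoding, cuts the read at the start of the first window whose mean quality drops below the threshold, and does no adapter removal. The function names are made up for the example.

```python
def phred33_scores(quality_line):
    """Convert a FASTQ quality string to integer scores (assumes Phred+33)."""
    return [ord(ch) - 33 for ch in quality_line]


def sliding_window_trim(sequence, quality_line, window=4, min_mean_quality=20):
    """Trim a read at the first window whose mean quality is too low.

    Scans from the 5' end. Reads shorter than the window are returned
    unchanged, because no full window can be evaluated.
    """
    scores = phred33_scores(quality_line)
    if len(sequence) != len(scores):
        raise ValueError("sequence and quality line differ in length")

    keep = len(scores)
    for start in range(len(scores) - window + 1):
        window_scores = scores[start:start + window]
        if sum(window_scores) / window < min_mean_quality:
            keep = start  # cut at the start of the failing window
            break
    return sequence[:keep], quality_line[:keep]
```

With the defaults (window of 4, mean quality 20), `sliding_window_trim("ACGTACGTAC", "IIIIII####")` returns `("ACGTA", "IIIII")`: the window starting at the sixth base is the first whose mean falls below 20, so the read is cut there.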
