Question

Is trimming necessary for RNAseq?

1

4.1 years ago

Pac314 ▴ 10

I am performing quality control on some reads that have good FastQC metrics apart from duplication levels and per base sequence content. Is it necessary to trim these reads before alignment and downstream analyses such as DE analysis and variant calling? I have read conflicting advice about trimming before RNA-seq analysis and am not sure what to do.

RNA-Seq trimming fastqc • 5.6k views

ADD COMMENT • link updated 14 months ago by ATpoint 86k • written 4.1 years ago by Pac314 ▴ 10

3

Most aligners will soft-clip reads, excluding from the alignment any sequence at the read ends that does not match the reference, so trimming is not strictly necessary. If you are going to do de novo assembly, then you should scan/trim the data to remove these extraneous sequences.
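To see how much sequence an aligner actually soft-clipped, one option is to inspect the CIGAR strings in the alignment output. The following is a minimal sketch, assuming standard SAM CIGAR strings where `S` marks soft-clipped bases; the function name `soft_clipped_bases` is made up for illustration.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clipped_bases(cigar):
    """Return (leading, trailing) soft-clipped base counts for a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) can sit outside soft clips, so ignore them here.
    core = [(n, op) for n, op in ops if op != "H"]
    leading = core[0][0] if core and core[0][1] == "S" else 0
    trailing = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return leading, trailing

print(soft_clipped_bases("5S90M6S"))  # (5, 6)
print(soft_clipped_bases("101M"))     # (0, 0)
```

Consistently large soft clips at the 3' end across many reads would suggest adapter read-through that the aligner is absorbing for you.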

ADD REPLY • link 4.1 years ago by GenoMax 148k

0

Ok, as I am not performing a de novo assembly, I will perform an alignment using the untrimmed reads first.

ADD REPLY • link 4.1 years ago by Pac314 ▴ 10

1

I don't perform trimming at all, but I do filter out genes with really low read counts. Validating the results also helps a lot! You can also run some tests and compare the results from trimmed and untrimmed data; this will give you a really good idea of what to do.
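One simple way to compare the two runs is to look at how much the lists of significant genes overlap. This is a minimal sketch, assuming you already have one list of DE gene IDs from the trimmed run and one from the untrimmed run; the function names and the gene IDs in the example are hypothetical.

```python
def jaccard(a, b):
    """Jaccard index of two collections: shared items / all items."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # two empty lists agree trivially
    return len(a & b) / len(a | b)

def compare_de_calls(trimmed, untrimmed):
    """Summarize agreement between DE gene lists from two runs."""
    t, u = set(trimmed), set(untrimmed)
    return {
        "shared": sorted(t & u),
        "trimmed_only": sorted(t - u),
        "untrimmed_only": sorted(u - t),
        "jaccard": jaccard(t, u),
    }

# Hypothetical gene IDs, for illustration only.
summary = compare_de_calls(["geneA", "geneB", "geneC"],
                           ["geneB", "geneC", "geneD"])
print(summary["jaccard"])  # 0.5
```

A Jaccard index close to 1 would indicate that trimming made little difference to the DE calls for that dataset.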

ADD REPLY • link 4.1 years ago by brunobsouzaa ▴ 840

0

I have just checked, and a couple of my FastQC files have a warning for per tile sequence quality, along with sequence duplication and per base sequence content. Only a few tiles are yellow; is it safe to ignore these warnings? When I run Trimmomatic on these reads, per tile sequence quality passes, but then I get warnings for sequence length distribution and sequence duplication levels.

ADD REPLY • link 4.1 years ago by Pac314 ▴ 10

1

Please check these informative blog posts by authors of FastQC:

https://sequencing.qcfail.com/articles/libraries-can-contain-technical-duplication/
https://sequencing.qcfail.com/articles/positional-sequence-bias-in-random-primed-libraries/

ADD REPLY • link 4.1 years ago by GenoMax 148k

0

Thank you for sharing these articles, they are really helpful!

ADD REPLY • link 4.1 years ago by Pac314 ▴ 10

0

How do you filter out genes with low read counts before DE analysis?

ADD REPLY • link 14 months ago by xiaoleiusc ▴ 140

0

See: https://support.bioconductor.org/p/65256/
https://support.bioconductor.org/p/110833/

ADD REPLY • link 14 months ago by GenoMax 148k

0

I like filterByExpr from edgeR for most analyses.
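To illustrate the general idea of low-count filtering, here is a simplified sketch that keeps a gene only if its counts-per-million (CPM) reaches a cutoff in a minimum number of samples. This is not edgeR's algorithm: filterByExpr derives its cutoff from the library sizes and the experimental design, which this sketch does not reproduce. The function name, the defaults `min_cpm=1.0` and `min_samples=2`, and the example counts are all assumptions for illustration.

```python
def filter_low_counts(counts, min_cpm=1.0, min_samples=2):
    """Keep genes whose CPM is >= min_cpm in at least min_samples samples.

    counts maps gene ID -> list of raw counts, one entry per sample.
    """
    if not counts:
        return {}
    n_samples = len(next(iter(counts.values())))
    # Library size = total raw counts per sample.
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    kept = {}
    for gene, row in counts.items():
        cpm = [c / lib * 1e6 if lib else 0.0 for c, lib in zip(row, lib_sizes)]
        if sum(v >= min_cpm for v in cpm) >= min_samples:
            kept[gene] = row
    return kept

# Made-up counts for three genes across three samples.
example = {
    "g1": [500, 600, 550],
    "g2": [0, 1, 0],
    "g3": [499500, 399399, 449450],
}
print(sorted(filter_low_counts(example)))  # ['g1', 'g3']
```

For a real analysis, the edgeR function itself is the better choice, since it accounts for group sizes when deciding how many samples must pass the cutoff.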

ADD REPLY • link 14 months ago by ATpoint 86k

1

I personally use salmon for alignment/quantification, and I have generally found it beneficial to trim adapters to maximize the alignment rate, but only if you do in fact have adapter contamination according to FastQC. If there is none, or it is low (like 1% or so), then no, trimming is probably not necessary or beneficial.
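This rule of thumb could be checked automatically from FastQC's text report. The sketch below assumes the usual layout of `fastqc_data.txt`, where the adapter section starts with a `>>Adapter Content` line, ends with `>>END_MODULE`, and holds tab-separated percentages per read position. The function names are hypothetical, the 1% threshold is taken from the rough figure above, and the sample text is made up.

```python
def max_adapter_percent(fastqc_data_text):
    """Return the highest adapter percentage in the Adapter Content module."""
    in_module = False
    best = 0.0
    for line in fastqc_data_text.splitlines():
        if line.startswith(">>Adapter Content"):
            in_module = True
            continue
        if not in_module:
            continue
        if line.startswith(">>END_MODULE"):
            break
        if line.startswith("#") or not line.strip():
            continue  # skip the column header and blank lines
        # First field is the position (or position range); the rest are percentages.
        values = [float(v) for v in line.split("\t")[1:]]
        if values:
            best = max(best, max(values))
    return best

def should_trim(fastqc_data_text, threshold=1.0):
    """Suggest trimming only if adapter content exceeds the threshold (in %)."""
    return max_adapter_percent(fastqc_data_text) > threshold

# Made-up excerpt in the assumed layout.
SAMPLE = "\n".join([
    ">>Adapter Content\twarn",
    "#Position\tIllumina Universal Adapter\tNextera Transposase Sequence",
    "1\t0.0\t0.0",
    "90-94\t6.2\t0.1",
    ">>END_MODULE",
])
print(max_adapter_percent(SAMPLE), should_trim(SAMPLE))  # 6.2 True
```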

ADD REPLY • link 4.1 years ago by ATpoint 86k

0

Is adapter contamination inferred from the presence of over-represented sequences? I don't have any over-represented sequences.

ADD REPLY • link 4.1 years ago by Pac314 ▴ 10

0

I am not sure about the exact procedure internally, but there is an "Adapter Content" module that lights up if you have contamination above a certain percentage.

ADD REPLY • link 4.1 years ago by ATpoint 86k

0

FastQC has two files containing adapter and contaminant sequences: adapter_list.txt and contaminant_list.txt. You can find them in ~/[YOUR_WORKSPACE]/FastQC/Configuration.
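If you want to see which sequences FastQC screens for, a small parser is enough. This is a sketch only, assuming each non-comment line holds a name and a sequence separated by one or more tabs, and that lines starting with `#` are comments; check your own copy of the files to confirm. The function name and the example entry are hypothetical.

```python
def parse_sequence_list(text):
    """Parse name/sequence pairs from an adapter or contaminant list."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        # Names and sequences may be separated by several tabs.
        fields = [f.strip() for f in line.split("\t") if f.strip()]
        if len(fields) < 2:
            continue  # not a name/sequence pair
        entries[fields[0]] = fields[-1]
    return entries

# Hypothetical entry, not copied from the real file.
example = "# comment line\nExample Adapter\t\tACGTACGT\n"
print(parse_sequence_list(example))  # {'Example Adapter': 'ACGTACGT'}
```

To use it on the real files, read the file contents into a string first and pass that string in.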

ADD REPLY • link 4.1 years ago by gabrielafg ▴ 60