Why did the sequence length distribution fail in FastQC after adapter trimming?
4.3 years ago
newbie ▴ 130

Dear all,

I downloaded some published raw data (FASTQ files). An initial QC run showed adapter content in both the forward and reverse reads.

Below are the FastQC results for the forward and reverse reads before adapter trimming:

[image: FastQC report before adapter trimming]

To remove the adapter content, I used cutadapt as follows:

cutadapt -a AGATCGGAAGAGCACACGTCTGAACTCCAGTCA -A AGATCGGAAGAGCGTCGTGTAGGGAAAGAGTGT -o tr_sample_R1.fastq.gz -p tr_sample_R2.fastq.gz sample_R1.fastq.gz sample_R2.fastq.gz

After adapter trimming, FastQC shows the following:

[image: FastQC report after adapter trimming]

So, I have some questions:

1) Before adapter trimming, the sequence length distribution looked fine, but after adapter trimming FastQC flags it. Why is that?

2) I see some bias in the first 10-15 bases. What should I do about that? Is it really a problem?

3) Why does the GC content have multiple peaks?

Please help me clear up these doubts. Thanks in advance.

RNA-Seq fastqc qualitycontrol adaptertrimming • 2.2k views
4.3 years ago

I don't think any of this is a problem. You didn't really even have to trim adapters.

4.3 years ago
Aspire ▴ 370

You can read about the bias in the first bases here: https://sequencing.qcfail.com/articles/positional-sequence-bias-in-random-primed-libraries/

As to the sequence length distribution, just think of what cutting adapters means... Are reads expected to be of the same length, once you cut adapters, or not?
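To make that concrete, here is a rough Python sketch of what 3' adapter removal does to read lengths. It is a toy under stated assumptions, not cutadapt's actual algorithm: it uses exact string matching only (cutadapt tolerates mismatches), the insert sequences are made up, and the names `trim_3prime_adapter` and `simulate_read` are hypothetical helpers. `ADAPTER` is just the prefix shared by the two adapter sequences in the question.

```python
from collections import Counter

# Prefix shared by the -a and -A adapters in the question.
ADAPTER = "AGATCGGAAGAGC"


def trim_3prime_adapter(read, adapter=ADAPTER, min_overlap=3):
    """Remove a 3' adapter and everything after it (exact matches only)."""
    # Case 1: the full adapter occurs somewhere in the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]
    # Case 2: only the start of the adapter fits before the read ends.
    for k in range(len(adapter) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    # Case 3: no adapter found, the read is left untouched.
    return read


def simulate_read(insert, read_len=20, adapter=ADAPTER):
    """Toy sequencer: read the insert, run into the adapter, stop at read_len."""
    return (insert + adapter)[:read_len]


# Made-up fragment sequence; inserts of different lengths are cut from it.
base = "TTCCA" * 6
inserts = [base[:n] for n in (25, 15, 10, 8)]

reads = [simulate_read(ins) for ins in inserts]
trimmed = [trim_3prime_adapter(r) for r in reads]

print("lengths before trimming:", Counter(len(r) for r in reads))
print("lengths after trimming: ", Counter(len(t) for t in trimmed))
```

Every raw read has the same length because the sequencer runs a fixed number of cycles. Once the adapter part is cut off, each read is only as long as its insert, so a spread of lengths after trimming is the expected outcome rather than a sign that something went wrong.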
