Fastq deduplication levels
19 months ago

Hi Guys,

I was checking the quality control metrics of a sample (DNA FASTQ, paired-end, sequenced on an Illumina platform) using FastQC. I noticed that FastQC put a red cross on the "Sequence Duplication Levels" panel, which reports "percent of seqs remaining if deduplicated 35.06%". I have attached the graph here. Could you please help me interpret it? Do I need to remove the duplicated sequences?

[Image: FastQC report]

Thanks, Shivangi

Sequence levels deduplication • 900 views
19 months ago

The answer to this question is far more complicated than one would assume. I have tried twice to explain it, and I still don't quite remember what it really means unless I re-read what I wrote:

Revisiting the FastQC read duplication report

So What Does The Sequence Duplication Rate Really Mean In A Fastqc Report

19 months ago
GenoMax 147k

Please check this blog post from the authors of FastQC: https://sequencing.qcfail.com/articles/libraries-can-contain-technical-duplication/

What is the intended application for your data? If the genome you are working with has a lot of repetitive sequence, then this may be a perfectly normal result.
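
To make the "percent of seqs remaining if deduplicated" figure more concrete, here is a minimal Python sketch of the underlying idea: the number of distinct sequences divided by the total number of sequences. This is a simplification and not FastQC's actual code. FastQC estimates the value from a subset of the reads, so its number will generally differ from an exact count. The function names and the toy reads are made up for illustration, and the FASTQ reader assumes plain, uncompressed, four-line records.

```python
from collections import Counter


def percent_remaining_if_deduplicated(sequences):
    """Return distinct / total * 100 for an iterable of read sequences.

    Exact-match counting only; this is an illustrative simplification,
    not the sampling-based estimate that FastQC reports.
    """
    counts = Counter(sequences)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no sequences given")
    return 100.0 * len(counts) / total


def read_fastq_sequences(path):
    """Yield the sequence line of each record in an uncompressed FASTQ file.

    Assumes strict four-line records (header, sequence, '+', qualities).
    """
    with open(path) as handle:
        for line_number, line in enumerate(handle):
            if line_number % 4 == 1:
                yield line.strip()


# Toy example: "ACGT" occurs three times, the other two reads once each.
reads = ["ACGT", "ACGT", "ACGT", "TTGA", "GGCC"]
print(percent_remaining_if_deduplicated(reads))  # 3 distinct / 5 total = 60.0
```

On a real file you could call `percent_remaining_if_deduplicated(read_fastq_sequences("sample_R1.fastq"))`, where the file name is a placeholder. A value of 35.06% would mean that roughly one read in three carries a sequence not already seen. Whether that reflects technical duplication or just the biology of the library is what the linked posts discuss.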
