Are these reads all PCR duplicates?
8.7 years ago

Hi,

I did a sequencing run in which I enriched specific DNA regions, so I expect to have a lot of PCR duplicates. The figure below shows an IGV screenshot of one region. Almost all reads are identical, but several of them have mismatches (see arrows); fewer than 1% of the reads are affected. Can I consider these reads PCR duplicates, or are they genuinely different DNA fragments?

[Figure: IGV screenshot of the enriched region, with arrows marking the mismatched bases]

Thanks

dna-seq • 2.3k views

I think that checking the Phred score for those bases that are different can give you some insights. But in general I don't think there's an easy way to know if two reads are PCR duplicates.
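To make the Phred check concrete, here is a minimal sketch that lists each mismatched base together with its quality and the implied error probability (10^(-Q/10)). It assumes Phred+33 quality encoding and a gap-free alignment; the function names and the toy read are made up for illustration and do not come from any particular library.

```python
def phred_to_error_prob(q):
    """Convert a Phred score Q into an error probability, 10^(-Q/10)."""
    return 10 ** (-q / 10)


def decode_quality(qual_string, offset=33):
    """Decode an ASCII quality string (Phred+33 assumed) into integer scores."""
    return [ord(c) - offset for c in qual_string]


def mismatches_with_quality(read_seq, ref_seq, qual_string):
    """Return (position, read_base, ref_base, phred, error_prob) per mismatch.

    Assumes read and reference are already aligned without indels,
    so position i in the read corresponds to position i in ref_seq.
    """
    quals = decode_quality(qual_string)
    hits = []
    for i, (r, g, q) in enumerate(zip(read_seq, ref_seq, quals)):
        if r != g:
            hits.append((i, r, g, q, phred_to_error_prob(q)))
    return hits


# Toy example (hypothetical data): one mismatch with a low-quality base call.
for pos, r, g, q, p in mismatches_with_quality("ACGTTGCA", "ACGTAGCA", "IIII5III"):
    print(f"pos={pos} read={r} ref={g} Q={q} P(error)={p:.4f}")
```

A mismatch backed by a low score is easy to explain as a sequencing error; a mismatch at Q30 or above corresponds to an error probability of about 1 in 1000, so it is harder to dismiss on quality alone.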


They seem to have Phred scores between 15 and 20, but several of them have good Phred scores (>30).


What sequencer was used? Do the differences occur in a homopolymer region?


We used a MiSeq, and it's not a homopolymer region.


Which polymerase enzyme was used in PCR?

8.7 years ago
surendra ▴ 30

Hi,

You can use the MarkDuplicates tool from Picard to identify PCR duplicates:

http://broadinstitute.github.io/picard/command-line-overview.html#Overview
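For intuition about what position-based duplicate marking does, here is a deliberately simplified sketch. It is not Picard's implementation: it just groups reads that share chromosome, strand, start and mate start, keeps the one with the highest summed base quality, and flags the rest. The dictionary keys (`name`, `chrom`, `start`, `strand`, `mate_start`, `qual_sum`) are assumptions made for this example.

```python
from collections import defaultdict


def mark_duplicates(reads):
    """Return the set of read names flagged as duplicates.

    Reads with identical (chrom, start, strand, mate_start) form one group;
    the read with the highest qual_sum is kept, the others are flagged.
    """
    groups = defaultdict(list)
    for read in reads:
        key = (read["chrom"], read["start"], read["strand"], read["mate_start"])
        groups[key].append(read)

    duplicates = set()
    for group in groups.values():
        best = max(group, key=lambda r: r["qual_sum"])
        for read in group:
            if read is not best:
                duplicates.add(read["name"])
    return duplicates


# Toy data (hypothetical): four reads share coordinates, one does not.
reads = [
    {"name": "r1", "chrom": "chr1", "start": 100, "strand": "+", "mate_start": 250, "qual_sum": 900},
    {"name": "r2", "chrom": "chr1", "start": 100, "strand": "+", "mate_start": 250, "qual_sum": 950},
    {"name": "r3", "chrom": "chr1", "start": 100, "strand": "+", "mate_start": 250, "qual_sum": 870},
    {"name": "r4", "chrom": "chr1", "start": 100, "strand": "+", "mate_start": 250, "qual_sum": 910},
    {"name": "r5", "chrom": "chr1", "start": 130, "strand": "+", "mate_start": 280, "qual_sum": 880},
]
print(sorted(mark_duplicates(reads)))
```

Note that the grouping looks only at coordinates, never at the sequence itself, so a read with a mismatch is still grouped with its neighbours. It also means that when every read in a region starts at the same position, almost all of them end up flagged.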


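It looks like HaloPlex data, and the last thing you want to do is run MarkDuplicates on it: in amplicon-style data the reads start and end at the same positions by design, so a position-based duplicate caller would flag nearly all of them. This is a terrible idea.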

8.7 years ago
Jenez ▴ 540

Those differences could easily have arisen during sequencing of the fragments: no sequencing machine is flawless, and each will produce some erroneous reads.

8.7 years ago
stolarek.ir ▴ 700

Judging from this picture, it looks like sequencing error (more or less random between sequences). However:

In aDNA (ancient DNA) we observe many fixed errors in a portion of the reads. A possible explanation is that mismatches found at exactly the same position in probable duplicate reads come not from sequencing error but either from polymerase error during PCR or from sample contamination. To test whether the polymerase is really responsible, you can analyze the data cycle by cycle.
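One way to sketch such an analysis, assuming "cycle" here means the sequencing cycle (the position of the base within the read as it was sequenced): tally mismatches both per cycle and per reference position. Mismatches that pile up at one reference position across reads look like a fixed error (polymerase or contamination), whereas mismatches scattered over positions look more like sequencing noise. The input format `(ref_start, seq, is_reverse)`, the gap-free alignment and the toy data are all assumptions for illustration.

```python
from collections import Counter


def tally_mismatches(reads, reference):
    """Count mismatches per sequencing cycle and per reference position.

    reads: iterable of (ref_start, seq, is_reverse), with seq given in
    reference orientation (as stored in a BAM file) and aligned without indels.
    """
    by_cycle = Counter()
    by_ref_pos = Counter()
    for ref_start, seq, is_reverse in reads:
        for i, base in enumerate(seq):
            ref_pos = ref_start + i
            if base != reference[ref_pos]:
                # For reverse-strand reads the last stored base was sequenced first.
                cycle = len(seq) - 1 - i if is_reverse else i
                by_cycle[cycle] += 1
                by_ref_pos[ref_pos] += 1
    return by_cycle, by_ref_pos


# Toy data (hypothetical): three reads share a mismatch at reference position 7.
reference = "ACGTACGTACGT"
reads = [
    (0, "ACGTACGA", False),
    (2, "GTACGAAC", False),
    (4, "ACGAACGT", True),
    (0, "ACCTACGT", False),
]
by_cycle, by_ref_pos = tally_mismatches(reads, reference)
print("per cycle:", dict(by_cycle))
print("per reference position:", dict(by_ref_pos))
```

In amplicon data where all reads start at the same coordinate, cycle and reference position coincide, so this split is only informative when read starts vary or when both strands are covered.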
