Doubt on removing duplicates on amplicon sequencing data
6.5 years ago
Picasa ▴ 650

Hi,

I am looking to make a simple SNP analysis.

I have several individuals for which we targeted specific markers, so my reads come from amplicon sequencing. My questions are:

1) Do I have to remove duplicates? From what I understand, tools like Picard look for reads with the same 5' position, but by definition amplicon sequencing reads all start at the same position.

2) If not, how should I treat these data? If an error is propagated during PCR, it will end up as a bad call.

Edit: 3) There are two types of duplicates, optical and PCR. In that case, do I have to remove only the optical duplicates? If yes, do you know how? It seems that Picard does not separate optical and PCR duplicates.

Thanks for your help.

duplicates amplicon • 4.6k views
6.5 years ago
  1. No, for the reasons you listed.
  2. Correct, that's the downside of amplicons (unless you put UMIs on your PCR primers).
  3. Removing optical duplicates can be done with clumpify from BBTools. However, this doesn't work that well for amplicons unless you spiked in a lot of PhiX or had a very large number of amplicons on the same lane. Otherwise you end up removing too many reads (not that this ends up being a huge problem).
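To illustrate the UMI idea from point 2, here is a minimal Python sketch of collapsing reads that share an amplicon and a UMI into one consensus sequence by per-base majority vote. The function name `umi_consensus`, the `(amplicon_id, umi, sequence)` tuple layout, and the `min_family_size` threshold are all made up for illustration and are not from any particular tool. It also assumes the reads in a family have equal length and that the UMI has already been extracted from each read. An error introduced in the very first PCR cycle can still dominate a family, so this reduces PCR errors rather than eliminating them.

```python
from collections import Counter, defaultdict


def umi_consensus(reads, min_family_size=2):
    """Collapse reads sharing (amplicon, UMI) into one consensus sequence.

    reads: iterable of (amplicon_id, umi, sequence) tuples (hypothetical layout).
    Families smaller than min_family_size are dropped because a single read
    cannot be error-corrected.
    """
    # Group reads into families that derive from the same tagged molecule.
    families = defaultdict(list)
    for amplicon_id, umi, seq in reads:
        families[(amplicon_id, umi)].append(seq)

    consensus = {}
    for key, seqs in families.items():
        if len(seqs) < min_family_size:
            continue
        # Majority vote per column; zip() stops at the shortest read,
        # so equal-length reads are assumed.
        consensus[key] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus


reads = [
    ("ampA", "ACGT", "TTGACCA"),
    ("ampA", "ACGT", "TTGACCA"),
    ("ampA", "ACGT", "TTGTCCA"),  # one copy carries a PCR/sequencing error
    ("ampA", "GGCA", "TTGACCA"),
    ("ampA", "GGCA", "TTGACCA"),
    ("ampA", "TTTT", "TTGACCA"),  # singleton family, dropped
]
result = umi_consensus(reads)
print(result)
```

The erroneous `T` in the third read is outvoted by the other two members of its family, so it never reaches the variant caller. Without UMIs there is no way to form such families for amplicon data, because every read starts at the same position.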

Thanks Devon, I have edited my post with another question. Maybe you have not seen it:

3) There are two types of duplicates, optical and PCR. In that case, do I have to remove only the optical duplicates? If yes, do you know how? It seems that Picard does not separate optical and PCR duplicates.


I just edited my response accordingly.


Would you have to remove duplicates when comparing the abundance of two transcript isoforms of a gene?


It depends on how badly affected they are. In general, if the transcripts are expressed highly enough, you will start getting false-positive duplicates, so it's best to avoid removing duplicates unless you really need to.
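A rough back-of-the-envelope sketch of why high expression produces false-positive duplicates: if a transcript offers only a limited number of possible read start positions, independent molecules will land on the same start by chance once coverage gets high. The Python below assumes single-end reads, uniformly random start positions, and position-only duplicate marking. These are simplifying assumptions for illustration, and the function name and the example numbers are made up.

```python
def expected_false_duplicates(n_reads, n_positions):
    """Expected number of reads flagged as duplicates purely by chance.

    Assumes every read comes from a distinct original molecule and that
    start positions are drawn uniformly from n_positions possibilities.
    """
    # Expected number of distinct start positions hit by n_reads reads.
    distinct = n_positions * (1 - (1 - 1 / n_positions) ** n_reads)
    # Position-based deduplication keeps one read per start position.
    return n_reads - distinct


n_positions = 1000  # hypothetical transcript with 1000 possible read starts
low_fraction = expected_false_duplicates(100, n_positions) / 100
high_fraction = expected_false_duplicates(10_000, n_positions) / 10_000

print(f"low expression:  {low_fraction:.1%} of reads falsely flagged")
print(f"high expression: {high_fraction:.1%} of reads falsely flagged")
```

Under these assumptions a lowly expressed transcript loses only a small share of its reads, while a highly expressed one loses most of them, which would distort an abundance comparison between isoforms. Paired-end data helps because both fragment ends must match, which greatly increases the number of distinguishable positions.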
