Marking duplicates in BAM files sorted by coordinate
3.2 years ago
priya.bmg ▴ 60

Hello

As mentioned in the documentation (https://gatk.broadinstitute.org/hc/en-us/articles/360037224932?page=1#comment_4406762304155), it is ideal to submit queryname-sorted BAM files. Would it be a computationally intensive process to submit coordinate-sorted BAM files instead?

First, I sorted the unmapped and mapped BAM files by queryname, merged them, and then sorted the merged file by coordinate. Can these coordinate-sorted merged BAM files be used to mark duplicates with Spark? Also, should I subsequently run SetNmMdAndUqTags before running BQSR? Please advise.

Thanks
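The order of steps asked about above can be sketched as three GATK command lines. This is only an illustration under assumptions: it takes "mark duplicates by Spark" to mean GATK's MarkDuplicatesSpark, it only builds and prints the commands rather than running them, and all file names and the `build_pipeline` helper are hypothetical placeholders.

```python
import shlex


def build_pipeline(merged_bam, reference, known_sites, prefix="sample"):
    """Return the GATK command lines as argument lists (nothing is executed).

    All output file names are hypothetical and derived from `prefix`.
    """
    marked = f"{prefix}.markdup.bam"
    tagged = f"{prefix}.markdup.tagged.bam"
    recal_table = f"{prefix}.recal.table"
    return [
        # 1. Mark duplicates on the merged BAM.
        ["gatk", "MarkDuplicatesSpark", "-I", merged_bam, "-O", marked],
        # 2. Recompute the NM, MD and UQ tags against the reference.
        ["gatk", "SetNmMdAndUqTags", "-R", reference, "-I", marked, "-O", tagged],
        # 3. First BQSR step: build the recalibration table.
        ["gatk", "BaseRecalibrator", "-R", reference, "-I", tagged,
         "--known-sites", known_sites, "-O", recal_table],
    ]


if __name__ == "__main__":
    for cmd in build_pipeline("merged.sorted.bam", "ref.fasta", "known_sites.vcf.gz"):
        print(shlex.join(cmd))
```

Each step consumes the output of the previous one, so the tag-fixing step sits between duplicate marking and BQSR as described in the question.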

Spark duplicates Mark • 745 views

From your link:

This can result in the tool being up to 2x slower processing under some circumstances.

That is what it says there, so the extra cost is probably negligible unless you need your results yesterday.
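If you are unsure which sort order a given BAM actually declares, the `SO` field of the `@HD` header line records it. A minimal sketch, assuming you have already dumped the header to text (for example with `samtools view -H file.bam`); the `sort_order` helper is a hypothetical name, not part of any tool:

```python
def sort_order(header_text):
    """Return the SO value from the @HD line of a SAM/BAM header.

    Falls back to "unknown" (the default in the SAM specification)
    when there is no @HD line or no SO field.
    """
    for line in header_text.splitlines():
        if line.startswith("@HD"):
            # Header fields are tab-separated TAG:VALUE pairs.
            for field in line.split("\t")[1:]:
                if field.startswith("SO:"):
                    return field[3:]
    return "unknown"


if __name__ == "__main__":
    example = "@HD\tVN:1.6\tSO:coordinate\n@SQ\tSN:chr1\tLN:248956422"
    print(sort_order(example))  # prints "coordinate"
```

A result of `coordinate` means the slower path quoted above may apply; `queryname` is the order the documentation describes as ideal.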
