9.2 years ago
billzt
Hi all. I have Illumina paired-end read mapping results in BAM format. I use Picard to remove PCR duplicate reads:
java \
-jar picard.jar \
MarkDuplicates \
INPUT=a.sortpos.bam \
OUTPUT=a.rmdup.bam \
METRICS_FILE=a.rmdup.log \
REMOVE_DUPLICATES=true \
MAX_FILE_HANDLES_FOR_READ_ENDS_MAP=1000
Then I use bamToFastq (from bedtools) to extract the reads from the output file "a.rmdup.bam". However, bamToFastq warns about a lot of unpaired reads. My raw input reads were all properly paired, so it must be Picard's duplicate removal that leaves reads unpaired; that is, it might remove only one read but leave its mate.
How can I deal with this problem?
What exactly is the problem you want to solve? Do you want Picard's duplicate removal to remove both the duplicate read and its mate?
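One way to check whether reads in "a.rmdup.bam" have really lost their mates is to count the primary records per read name. The sketch below is only an illustration and is not part of Picard or bedtools: find_orphans is a hypothetical helper name, and it assumes SAM text arrives on standard input (for example from samtools view). Keep in mind that bamToFastq expects the two mates to sit next to each other in the BAM, so a position-sorted file may produce the same warnings even when no mate is missing.

```shell
# Hypothetical helper: reads SAM records on stdin and prints the names of
# paired reads that do not have exactly two primary records.
find_orphans() {
  awk -F'\t' '
    /^@/ { next }                      # skip SAM header lines
    {
      flag = $2 + 0
      if (int(flag / 256) % 2) next    # skip secondary alignments (0x100)
      if (int(flag / 2048) % 2) next   # skip supplementary alignments (0x800)
      if (flag % 2 == 0) next          # skip reads not flagged as paired (0x1)
      count[$1]++                      # tally primary records per read name
    }
    END { for (name in count) if (count[name] != 2) print name }
  ' | sort
}

# Example usage, assuming samtools is installed:
#   samtools view -h a.rmdup.bam | find_orphans | head
```

If this prints nothing, every paired read still has its mate, and the warnings are more likely caused by sort order than by duplicate removal. In that case, sorting the BAM by read name before running bamToFastq is worth trying.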