6.9 years ago
ahmad mousavi
Hi
I have run Bismark on RRBS data, but I don't think it handles the duplicates problem.
Does anybody know how to deal with duplicates when using the Bismark aligner?
I found deduplicate_bismark_alignment_output.pl which works as a post alignment step for DNA methylation analysis in Bismark pipeline.
Thanks
Do you mean PCR duplicates? You can find them with Picard.
You can also use clumpify (A: Introducing Clumpify: Create 30% Smaller, Faster Gzipped Fastq Files ) to identify duplicates without doing alignments.
In the documentation it is stated:
deduplicate_bismark --bam [options] <filenames>
"This command will deduplicate the Bismark alignment BAM file and remove all reads but one which align to the very same position and in the same orientation. This step is recommended for whole-genome bisulfite samples, but should not be used for reduced representation libraries such as RRBS, amplicon or target enrichment libraries."