Remove duplicate read names from a BAM file
5.8 years ago

Hello, I have a problem: when using Picard MarkDuplicates I get the error 'Value was put into PairInfoMap more than once'. I grep out the read causing the issue because it (for some reason) has a duplicate read name. When I rerun Picard MarkDuplicates I still get the same error, this time because of a different duplicate read name.

I have analysed countless paired-end datasets and have never encountered this problem before.

My question: is there a way to get a list of ALL the duplicated read names so I can filter them all out?
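One possible way to collect every offending name in a single pass is sketched below. It assumes the problem shows up as more than one primary record for the same read name and the same mate (first or second in pair), and that the alignments are available as SAM text, for example from `samtools view`. The function name `duplicated_read_names` and the demo records are made up for illustration.

```python
from collections import Counter

SECONDARY = 0x100      # SAM flag bit: not a primary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment
READ1 = 0x40           # SAM flag bit: first read in the pair
READ2 = 0x80           # SAM flag bit: second read in the pair


def duplicated_read_names(sam_lines):
    """Return sorted read names that have more than one primary record
    for the same mate. This is an assumed definition of 'duplicate'."""
    counts = Counter()
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        qname, flag = fields[0], int(fields[1])
        if flag & (SECONDARY | SUPPLEMENTARY):
            continue  # only primary alignments are counted
        mate = flag & (READ1 | READ2)
        counts[(qname, mate)] += 1
    return sorted({name for (name, _), n in counts.items() if n > 1})


# Hypothetical records, truncated to the first few SAM columns.
demo = [
    "@HD\tVN:1.6",
    "readA\t99\tchr1\t100",
    "readA\t147\tchr1\t200",
    "readB\t99\tchr1\t300",
    "readB\t99\tchr1\t300",   # second primary record for readB, first mate
    "readB\t147\tchr1\t400",
    "readC\t355\tchr1\t500",  # secondary alignment, ignored
]
print(duplicated_read_names(demo))
```

To run this on real data, the same function could be called on `sys.stdin` with the output of `samtools view` piped in, and the names written one per line to a file. That file could then be given to a read-name filter such as Picard FilterSamReads (`READ_LIST_FILE` with `FILTER=excludeReadList`).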

Bw, Ian.

atac-seq duplicates BAM duplicate read names • 3.2k views

Hi fin swimmer, thank you for your reply. I have read both links before, but I will go through them again more thoroughly. I ran ATAC-seq in triplicate across a number of cell lines; some samples give no problems and others do, which is what is so frustrating.


Filter your BAM files to contain only primary alignments; perhaps having the read reported with multiple alignments is the source of the problem.
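As a rough illustration of what "primary alignments only" means at the flag level, here is a minimal sketch that works on SAM text lines. It drops records with the secondary (0x100) or supplementary (0x800) flag bit set, which is the same selection that `samtools view -F 0x900` makes. The function name `keep_primary` and the example records are hypothetical.

```python
SECONDARY = 0x100      # SAM flag bit: not a primary alignment
SUPPLEMENTARY = 0x800  # SAM flag bit: supplementary alignment


def keep_primary(sam_lines):
    """Yield header lines and primary alignment records only."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        flag = int(line.split("\t")[1])
        if not flag & (SECONDARY | SUPPLEMENTARY):
            yield line


# Hypothetical records, truncated to the first few SAM columns.
records = [
    "@HD\tVN:1.6",
    "readA\t99\tchr1\t100",     # primary, first in pair
    "readA\t355\tchr2\t900",    # secondary alignment of the same read
    "readA\t2195\tchr3\t50",    # supplementary alignment
    "readA\t147\tchr1\t200",    # primary, second in pair
]
for rec in keep_primary(records):
    print(rec)
```

If a read name still appears more than once per mate after this step, the duplication is not explained by secondary or supplementary records.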


I have done this and still get the same error.
