Best way to deal with overlapping read names in merged BAM files
7 months ago
shpak.max ▴ 50

I encountered the error "Value was put into PairInfoMap more than once" while running Picard's MarkDuplicates on a BAM file that was created by merging files from three different runs.

I found several threads on Biostars addressing this issue, e.g.

Markduplicates: Value Was Put Into Pairinfomap More Than Once

however, I'm not sure how and where best to implement the suggested fix of correcting the SM tag (e.g. one reply suggests prefixing each read ID with a tag that indicates the lane, or presumably any other identifier).

Is this something that can be done on the three BAM files that I generated before merging (perhaps with an appropriate samtools function), or do I need to go back to my FASTQ files and write a script to rename each read? Since this seems to be a common problem, I assume that there must be some simple fix?

picard MarkDuplicates • 721 views
7 months ago

I'm not sure how and where to best implement the suggested corrections of fixing the SM tag

Use samtools addreplacerg to change the read group of one or more BAM files.


Could you please direct me to an example that uses addreplacerg across an entire BAM file? I would need to add a read group tag to each read based on the run/BAM file it came from, and it's not clear to me from the addreplacerg documentation how to do this.

Additionally, I found another discussion thread on this topic that suggested the use of the samtools flag function - is this a sound approach?

samtools view -f 0x2 -b in.bam > out.bam
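For context on what that command does: -f 0x2 keeps only records whose FLAG has the 0x2 bit ("read mapped in proper pair") set. It filters reads and does not change read names or read groups. A minimal sketch of the bit test, using two illustrative FLAG values that are assumptions and not taken from any file in this thread:

```shell
# Illustrative FLAG values (assumptions, not from any real file).
flag_proper=99    # 0x63: paired, proper pair, mate reverse, first in pair
flag_improper=65  # 0x41: paired, first in pair, but not a proper pair

# samtools view -f 0x2 keeps a record only when the 0x2 bit is set in FLAG.
kept_proper=$(( flag_proper & 0x2 ))
kept_improper=$(( flag_improper & 0x2 ))

echo "FLAG $flag_proper -> $kept_proper"
echo "FLAG $flag_improper -> $kept_improper"
```

A non-zero result means the record would pass the filter; zero means it would be dropped.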

not tested:

samtools addreplacerg -r "@RG\tID:ID1\tSM:SAMPLE1" -O BAM -o new1.bam old1.bam
samtools addreplacerg -r "@RG\tID:ID2\tSM:SAMPLE2" -O BAM -o new2.bam old2.bam
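The same idea can be extended to all three runs and followed by the merge. This is an untested sketch: the file names run1.bam, run2.bam, run3.bam, the sample name, and the run_cmd helper are all hypothetical, and it assumes the three runs come from one sample, so each file gets a distinct ID but the same SM. By default it only prints the commands; set DRY_RUN=0 to execute them.

```shell
# Hypothetical inputs: one BAM per sequencing run. Replace with real names.
runs="run1 run2 run3"
sample="SAMPLE1"   # assumption: the three runs come from the same sample

# Hypothetical helper: print commands by default, run them when DRY_RUN=0.
run_cmd() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

tagged=""
for run in $runs; do
    # Assign every read in this file to its own read group:
    # distinct ID per run, shared SM across runs.
    run_cmd samtools addreplacerg -r "@RG\tID:${run}\tSM:${sample}" \
        -O BAM -o "${run}.rg.bam" "${run}.bam"
    tagged="${tagged} ${run}.rg.bam"
done

# Merge the tagged files. $tagged is left unquoted on purpose so that it
# splits into separate file names.
run_cmd samtools merge merged.bam $tagged
```

After a real run, samtools view -H merged.bam should list one @RG line per input file.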

Thanks. If I understand correctly, this adds ID1 to the @RG header in the first BAM, etc. Could you please explain the syntax of the \tSM:SAMPLE part?
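On the syntax: the string passed to -r is a SAM header line whose fields are separated by tabs, and \t is the escape for a tab. So the line has three fields: the record type @RG, the read group identifier ID, and the sample name SM. A small sketch that expands the escapes and prints one field per line (the variable names are illustrative):

```shell
# Read-group line written the way it is passed to -r.
rg_line='@RG\tID:ID1\tSM:SAMPLE1'

# %b makes printf interpret backslash escapes, so \t becomes a real tab.
expanded=$(printf '%b' "$rg_line")

# Show each tab-separated field on its own line.
printf '%s\n' "$expanded" | tr '\t' '\n'
```

This prints @RG, ID:ID1 and SM:SAMPLE1 on separate lines.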
