How to only change/substitute every QNAME(Read ID) from a Bam file?
8.4 years ago
Joe ▴ 30

Hi, I need to replace the QNAMEs in one BAM file with the QNAMEs from another BAM file, without changing the header, the sequences, or anything else. Here is an example (sequences and the other fields are not shown). I have A.bam:

A.bamHeader

M04402:7:000000000-ANGUV:1:1106:13178:9134 147  chr1    17027   0 ...............

M04402:7:000000000-ANGUV:1:1106:13178:9134 99   chr1    17205   0....................

M04402:7:000000000-ANGUV:1:2115:20665:6740 147  chr1    17344   0.................
and so on......

I have another file, B.bam, whose QNAMEs are:

M00601:223:AM07W:1:1102:11074:13214 163 chr1    19754   24.............

M00601:223:AM07W:1:2101:17585:17440 99  chr1    17205   0....................

and so on......

I want to change all the QNAMEs in A.bam (e.g. M04402:7:000000000-ANGUV:1:1106:13178:9134) to the QNAMEs from B.bam (e.g. M00601:223:AM07W:1:1102:11074:13214), changing only the QNAME and nothing else.

The only approach I can think of is to use "sed" on the command line.

Thanks!

bam sam QNAME • 3.0k views

While it can probably be done using the code posted by @i.sudbery, I can't figure out the use case. Why do you want to do this?


The reason I want to do this relates to my other question, "How to merge two identical BAM files?". I want to change the QNAMEs and then redo what I posted in that question.


I see. You are creating a fake duplicate with the same sequence data but different read names (QNAMEs). You could just change ANGUV to BNGUV, which would make it look like a new flowcell :-)
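For anyone wanting to try that flowcell rename, here is a minimal sketch in Python with pysam. The function names `rename_flowcell` and `rewrite_bam` and the output filename are made up for illustration; the sketch assumes the tag ANGUV occurs exactly once in each read name, as in the example reads above.

```python
def rename_flowcell(qname, old="ANGUV", new="BNGUV"):
    """Return qname with the first occurrence of the flowcell tag replaced."""
    return qname.replace(old, new, 1)


def rewrite_bam(in_path, out_path, old="ANGUV", new="BNGUV"):
    """Copy a BAM file, changing only the QNAME of every record."""
    # pysam is imported here so rename_flowcell can be used without it.
    from pysam import AlignmentFile

    with AlignmentFile(in_path, "rb") as inbam, \
            AlignmentFile(out_path, "wb", template=inbam) as outbam:
        # until_eof=True walks the file in order and needs no index.
        for read in inbam.fetch(until_eof=True):
            read.query_name = rename_flowcell(read.query_name, old, new)
            outbam.write(read)


# Hypothetical usage (file names are placeholders):
# rewrite_bam("A.bam", "A.renamed.bam")
```

Because both mates of a pair get the same replacement, paired reads should still share a QNAME afterwards.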


It works, thanks a lot

8.4 years ago

This is easy with pysam in Python, assuming the record order of both BAM files is kept as-is, so that the n-th read of A.bam takes the name of the n-th read of B.bam:

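from pysam import AlignmentFile

# Python 3: the built-in zip is lazy, so itertools.izip (Python 2 only) is not needed.
bam1 = AlignmentFile("A.bam")
bam2 = AlignmentFile("B.bam")
outbam = AlignmentFile("C.bam", "wb", template=bam1)  # reuse A.bam's header

# until_eof=True reads records in file order and does not require an index.
# zip stops at the shorter file, so extra reads in A.bam would be dropped.
for read1, read2 in zip(bam1.fetch(until_eof=True), bam2.fetch(until_eof=True)):
    read1.query_name = read2.query_name  # replace only the QNAME
    outbam.write(read1)

outbam.close()
bam1.close()
bam2.close()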
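
Since the goal is to change the QNAME and nothing else, it may be worth checking the result. The following is a rough sketch, not part of the original answer: it compares SAM text lines (for example the output of `samtools view A.bam` and `samtools view C.bam`, one record per line) and reports the first record where anything other than the first column differs. The function names are made up for illustration.

```python
from itertools import zip_longest


def only_qname_differs(line_a, line_c):
    """True if two SAM records match in every field except column 1 (QNAME)."""
    fields_a = line_a.rstrip("\n").split("\t")
    fields_c = line_c.rstrip("\n").split("\t")
    return fields_a[1:] == fields_c[1:]


def first_mismatch(lines_a, lines_c):
    """Return the 1-based number of the first differing record, or None."""
    pairs = zip_longest(lines_a, lines_c, fillvalue=None)
    for n, (a, c) in enumerate(pairs, start=1):
        # A missing record on either side means the files differ in length.
        if a is None or c is None or not only_qname_differs(a, c):
            return n
    return None
```

A return value of None means every record matched apart from its name; a number points at the first record to inspect.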