Hi,
I have an RNA-seq
bam
file with ambiguously mapped reads, which I'm converting to a bed
file using bedTools
's bamToBed
with the -splitD
option in order to show each mapping segment separately.
There's no argument in bamToBed
that allows adding a field that would map the record in the bed
file to its record in the bam
file so that in the bed
file I can distinguish between split
segments of the same bam
record and different bam
records.
Here's an example read with 3 records in the bam file:
chr start end id strand cigar
chr1 9010230 9011300 read_1 + 379S21M1D270M1I59M1D22M1I308M1D388M401S
chr1 9010230 9010522 read_1 + 1557H293M
chr1 7823080 7823376 read_1 + 87H31M3D3M2I2M1I41M2D8M3D3M1D200M1472H
So bamToBed
without the -splitD
argument will give the 3 records above (without the cigar
field), whereas bamToBed
with the -splitD
argument will give:
chr start end id strand
chr1 9010230 9010250 read_1 +
chr1 9010252 9010580 read_1 +
chr1 9010582 9010911 read_1 +
chr1 9010913 9011300 read_1 +
chr1 9010230 9010522 read_1 +
chr1 7823080 7823110 read_1 +
chr1 7823114 7823159 read_1 +
chr1 7823162 7823169 read_1 +
chr1 7823173 7823175 read_1 +
chr1 7823177 7823376 read_1 +
It's not straight forward to map the split
bed
records to their corresponding unspilt
bed
records without a bam record field.
Any idea if there's a way to get such a field out of bamToBed or any other way to distinguish between the ambiguously mapped read records in the split
bamToBed
file?
I initially though that adding the -cigar argument to bamToBed will help because that can serve as a unique read mapping ID but bamToBed
with the -splitD
argument ignores the -cigar
argument.
Perhaps
bam2bed --split
? Ref. https://bedops.readthedocs.io/en/latest/content/reference/file-management/conversion/bam2bed.htmlThanks! I ran
bam2bed
with the--split
option and it's not splitting mybam
records. Perhaps the format of mybam
is not whatbam2bed
expects?I ran:
Here's what I'm getting for the example
read_1
above: