Extracting matching reads by read ID
3.0 years ago

What tool would you recommend to compare two BAM files and extract matching reads by read ID?

BAM • 2.0k views

Do you mean without extracting the read names and doing the comparison outside the BAM files? filterbyname.sh from BBMap would be an option. You could come up with a clever way of using pipes or process substitution. I may post an example later.
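A rough sketch of the pipe idea, in case it helps. This is not from BBMap; `filter_sam_by_name` is a made-up helper name, and it assumes the SAM stream arrives on stdin and the name list is non-empty (with an empty list the `NR == FNR` trick would treat the SAM stream as the name list).

```shell
# Hypothetical helper: keep SAM header lines plus every record whose
# read name (column 1) appears in the name list given as "$1".
# The SAM text itself is read from stdin.
filter_sam_by_name() {
    awk -F '\t' '
        NR == FNR { keep[$1]; next }   # first input: load the read names
        /^@/ || ($1 in keep)           # second input: headers and matching reads
    ' "$1" -
}

# Possible usage with bash process substitution (assumes samtools is on PATH):
#   samtools view -h file2.bam \
#     | filter_sam_by_name <(samtools view file1.bam | cut -f1) \
#     | samtools view -b -o common.bam -
```

The names are held in memory as an awk array, so this avoids an intermediate file but may not be the fastest option for very large BAM files.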


Mostly looking for performance-savvy solutions (and general inspiration if there's not a specific tool that would do it)


Well, to be fair, I was mostly looking for a clever way to compare two BAM files directly, but it seems I'll have to extract the read names first and then use those for subsetting (which is well covered in those posts).
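For the "extract names, then subset" route, something like the following might work as a starting point. Both function names are hypothetical, and the sketch assumes SAM text on stdin (for example from `samtools view`) and name lists that fit comfortably on disk.

```shell
# Hypothetical helper: print the unique read names (column 1) of the
# SAM records read from stdin, skipping any header lines.
unique_read_names() {
    grep -v '^@' | cut -f1 | LC_ALL=C sort -u
}

# Hypothetical helper: print the names present in both sorted name lists.
# Both files must be sorted in the C locale, as unique_read_names produces.
shared_read_names() {
    LC_ALL=C comm -12 "$1" "$2"
}

# Possible usage (assumes samtools is on PATH):
#   samtools view file1.bam | unique_read_names > names1.txt
#   samtools view file2.bam | unique_read_names > names2.txt
#   shared_read_names names1.txt names2.txt > shared.txt
```

The resulting list of shared names can then be used to subset either BAM file, so matching reads can be pulled from both inputs rather than only one.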

3.0 years ago
GenoMax 148k
samtools view file1.bam | awk -F "\t" '{print $1}' | sort | uniq  > names_in_file1

filterbyname.sh -Xmx4g in=file2.bam names=names_in_file1 out=file.fq.gz include=t 

file.fq.gz will contain the reads from file2.bam whose names also occur in file1.bam, i.e. the reads common to both files.


Nice, except that I'd prefer a BAM file in the end, but I think that's an option for filterbyname.sh.


Correct. You can simply use out=filtered.bam.

3.0 years ago
GenoMax 148k

There is this: https://genome.sph.umich.edu/wiki/BamUtil:_diff

@Pierre also seems to have a tool for this: Comparison between .bam files

BAM file comparison
