11.5 years ago
14134125465346445
Are there any tools to reduce two BAM files to the differences between them, dropping all reads that are aligned identically in both? Something like GATK ReduceBAM, but for pairs of files compared against each other.
The same reads in two BAM files can differ in the locations they are aligned to. They can also differ in the number of mismatches they have when aligned to slightly different versions of the same genome. I don't know of any tool, but I do have some scripts that may work for you, depending on your needs.
Where can I download those scripts?
Hi. You will have to tell me exactly what you need. Right now, my script takes two BAM files that are sorted by query name. Both BAM files should contain the same number of reads, whether or not the reads are aligned. The script goes through the corresponding reads in both BAM files one pair at a time and, depending on certain criteria, writes the reads to different output files.
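As a rough illustration of that kind of lock-step comparison (this is not the script mentioned above, which was not posted), here is a minimal Python sketch. It assumes both inputs are plain SAM text streams with the same reads in the same order, and it treats a read as "identically aligned" when its flag, reference name, position and CIGAR string all match. The function names `alignment_key` and `differing_reads` and that choice of criteria are my own assumptions, not something from the thread.

```python
from itertools import zip_longest


def alignment_key(sam_line):
    """Split one SAM text record into (read name, alignment summary)."""
    fields = sam_line.rstrip("\n").split("\t")
    qname, flag, rname, pos, _mapq, cigar = fields[:6]
    # Assumption: flag, reference, position and CIGAR define "same alignment".
    return qname, (int(flag), rname, int(pos), cigar)


def differing_reads(sam_a, sam_b):
    """Yield (line_a, line_b) for reads whose alignments differ.

    Both inputs are iterables of SAM text lines, sorted by query name,
    holding the same reads in the same order. Header lines are skipped.
    """
    records_a = (line for line in sam_a if not line.startswith("@"))
    records_b = (line for line in sam_b if not line.startswith("@"))
    for line_a, line_b in zip_longest(records_a, records_b):
        if line_a is None or line_b is None:
            raise ValueError("inputs contain different numbers of reads")
        name_a, aln_a = alignment_key(line_a)
        name_b, aln_b = alignment_key(line_b)
        if name_a != name_b:
            raise ValueError(f"read order mismatch: {name_a} vs {name_b}")
        if aln_a != aln_b:
            # Reads aligned identically in both files are dropped.
            yield line_a, line_b
```

The two streams could come from `samtools view -h` run on BAM files that were name-sorted with `samtools sort -n`. Comparing the number of mismatches as well would mean also parsing the optional `NM` tag, which this sketch ignores.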
Well, from your description, it is nothing like ReducedBAM, which collapses multiple reads into one pseudo-read. It is not dropping reads at all.
Would you mind providing some context? What are you trying to achieve?
bamUtil diff does what I want (see other answer). It's rather slow though, so I posted another question about it.