I have a paired-end BAM file generated by MAC2 that contains read pairs (i.e. mates) as well as some reads that exist individually, without a mate.
What I want to do now is to merge each pair of mates whose alignments overlap into a single long read spanning both mates (e.g. reads 2a and 2b shown below); keep mates that do not overlap each other as separate reads (reads 1a and 1b shown below); and keep the 'oligo' reads that exist alone.
Finally, all directional features should be removed from all reads to generate a new single-end BAM file.
Can someone tell me how to do this? It should be possible to achieve this in code, but I don't have any good ideas, in particular about how to make the newly generated BAM file work properly in IGV and in downstream analysis.
I know there are tools, like PEAR, that can merge overlapping mates from paired FASTQ files (R1 and R2), but this is not the functionality I need. I would prefer not to align the 'new' single-end reads to the genome again, since all the reads are already aligned.
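To make the rule I have in mind concrete, here is a rough sketch that works on plain alignment coordinates only. It is an illustration under stated assumptions, not a solution: `Interval` and `merge_overlapping_mates` are hypothetical names, coordinates are assumed to be 0-based and half-open, mates are assumed to share a read name, and nothing here reads or writes BAM or deals with sequence, base qualities, or CIGAR strings, which is the part I don't know how to handle.

```python
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    """Hypothetical stand-in for one aligned read (no strand information)."""
    name: str
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # 0-based, exclusive


def merge_overlapping_mates(reads):
    """Merge mates that overlap on the same chromosome; keep everything else."""
    # Group records by read name; mates are assumed to share a name.
    by_name = defaultdict(list)
    for read in reads:
        by_name[read.name].append(read)

    merged = []
    for name, group in by_name.items():
        if len(group) == 2:
            a, b = group
            # Two half-open intervals overlap if each starts before the other ends.
            if a.chrom == b.chrom and a.start < b.end and b.start < a.end:
                merged.append(
                    Interval(name, a.chrom, min(a.start, b.start), max(a.end, b.end))
                )
                continue
        # Non-overlapping mates and reads that exist alone are kept unchanged.
        merged.extend(group)

    # Coordinate-sort the output, as a sorted BAM would be.
    return sorted(merged, key=lambda r: (r.chrom, r.start, r.end))


reads = [
    Interval("read1", "chr1", 100, 150),  # mate 1a
    Interval("read1", "chr1", 300, 350),  # mate 1b, no overlap with 1a
    Interval("read2", "chr1", 500, 560),  # mate 2a
    Interval("read2", "chr1", 540, 600),  # mate 2b, overlaps 2a
    Interval("oligo", "chr1", 700, 750),  # read that exists alone
]
merged = merge_overlapping_mates(reads)
for r in merged:
    print(r.name, r.chrom, r.start, r.end)
```

In this toy input, reads 2a and 2b collapse into one interval, while 1a, 1b and the lone read pass through untouched. What I am missing is how to do the equivalent on real BAM records so that the merged read has a consistent sequence, quality string, CIGAR and flags.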
Many thanks!