Hi,
I have haplotype sequences in FASTA format for several genes from a population. The sequences differ in length, and I am mapping them against each gene's reference sequence. From the resulting BAM files I would like to extract the regions that are covered by all of the haplotype sequences, without the overlapping parts that extend beyond those regions. The regions should also be longer than 1 kb. In other words, I want to iterate over the file and extract every region > 1 kb whose coverage is x, where x is the number of haplotype sequences I have. I need to compare the sequences afterwards, so they all have to be the same length. Is there a method to crop the overlaps when extracting a region from a BAM file?
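To make the goal concrete, here is a minimal sketch of the logic I have in mind, written in plain Python. It is not a finished solution. It assumes the per-position depth is already available as a list (for example parsed from the output of `samtools depth -a`), that each haplotype aligns at most once per gene so that a depth of x means "all haplotypes present", and that each read's alignment is available as (query position, reference position) pairs in the style of pysam's `get_aligned_pairs()`. The function names `fully_covered_regions` and `crop_to_region` are made up for illustration.

```python
from typing import List, Optional, Sequence, Tuple


def fully_covered_regions(
    depths: Sequence[int], n_haplotypes: int, min_len: int = 1000
) -> List[Tuple[int, int]]:
    """Return 0-based, half-open (start, end) runs where depth == n_haplotypes
    and the run is strictly longer than min_len.

    Assumption: depths[i] is the coverage at reference position i.
    """
    regions = []
    start = None  # start of the current fully covered run, if any
    for pos, depth in enumerate(depths):
        if depth == n_haplotypes:
            if start is None:
                start = pos
        else:
            if start is not None and pos - start > min_len:
                regions.append((start, pos))
            start = None
    # Close a run that reaches the end of the reference.
    if start is not None and len(depths) - start > min_len:
        regions.append((start, len(depths)))
    return regions


def crop_to_region(
    sequence: str,
    aligned_pairs: Sequence[Tuple[Optional[int], Optional[int]]],
    start: int,
    end: int,
    gap: str = "-",
) -> str:
    """Crop one aligned sequence to the reference interval [start, end).

    Every reference position yields exactly one character, so all cropped
    sequences have length end - start. Deletions become `gap`; inserted
    bases (no reference position) are dropped. That is a simplification.
    """
    ref_to_query = {
        r: q for q, r in aligned_pairs if q is not None and r is not None
    }
    return "".join(
        sequence[ref_to_query[r]] if r in ref_to_query else gap
        for r in range(start, end)
    )
```

Anchoring everything to reference coordinates is what makes the cropped sequences equal in length. The cost is that insertions relative to the reference are lost, so this would only be suitable if that is acceptable for the comparison.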
Many thanks!