I ran GATK's MarkDuplicates on a BAM file I obtained after alignment, which produced another file, marked_duplicates.bam. Should I proceed with this marked_duplicates.bam file for analysis (converting to VCF), or is it just a file containing the duplicates? In the latter case, is it possible to obtain a BAM file with all the duplicates removed?
So if I did not use --REMOVE_DUPLICATES, the duplicate reads will still be present in marked_duplicates.bam, but they will have been flagged as duplicates, right?
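Yes. Without --REMOVE_DUPLICATES, MarkDuplicates keeps every read in the output and only sets the duplicate flag (0x400) on the reads it identifies as duplicates. marked_duplicates.bam is therefore the complete alignment file, not a file of duplicates only, and it is the one to carry forward. GATK's variant callers filter out flagged duplicates by default, so removing them physically is usually unnecessary.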
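To make the flagging concrete: the duplicate mark is bit 0x400 (decimal 1024) of the FLAG column in each SAM/BAM record. Below is a minimal Python sketch of how one might check that bit and count flagged records in SAM-formatted text. The function names and the two example records are made up for illustration; they are not part of GATK or any library.

```python
DUPLICATE_FLAG = 0x400  # SAM FLAG bit meaning "PCR or optical duplicate" (decimal 1024)


def is_duplicate(flag: int) -> bool:
    """Return True if a SAM FLAG value has the duplicate bit set."""
    return bool(flag & DUPLICATE_FLAG)


def count_duplicates(sam_lines):
    """Return (total_records, duplicate_records) for an iterable of SAM text lines."""
    total = 0
    duplicates = 0
    for line in sam_lines:
        # Skip blank lines and header lines (headers start with '@').
        if not line.strip() or line.startswith("@"):
            continue
        flag = int(line.split("\t")[1])  # FLAG is the second tab-separated column
        total += 1
        if is_duplicate(flag):
            duplicates += 1
    return total, duplicates


# Hypothetical records, for illustration only.
example = [
    "@HD\tVN:1.6\tSO:coordinate",
    "read1\t99\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",    # 99: duplicate bit not set
    "read2\t1123\tchr1\t100\t60\t50M\t=\t300\t250\t*\t*",  # 1123 = 99 + 1024: duplicate
]

total, duplicates = count_duplicates(example)
print(f"{duplicates} of {total} records are flagged as duplicates")
```

On a real file the same count is available without any script: samtools view -c -f 1024 marked_duplicates.bam prints the number of flagged records, and a result above zero confirms that the duplicates are still present and marked.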