3.5 years ago
irfanwustl
Let's assume we have a BAM file and make a copy of it. I sorted both files and generated a BAI file from each, so the two BAM files have the same content but different names. Now I want to check whether the BAI files are identical. I ran md5 on them and the results differ. Is that because of the different file names? Is there a way to view a BAI file the way you can view a SAM file?
Why don't you simply diff them?
What does that mean?
Anyway, file names are not taken into account when creating md5 checksums afaik.
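To illustrate the point about file names, here is a minimal Python sketch (the helper names `md5_of_file` and `same_bytes` are made up for this example). It hashes only the bytes read from the file, so two copies with different names give the same digest, and it also does the byte-for-byte comparison that "diff them" amounts to.

```python
import filecmp
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file's contents (the name is never hashed)."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # Read in chunks so large BAM/BAI files do not have to fit in memory.
        for block in iter(lambda: fh.read(chunk_size), b""):
            digest.update(block)
    return digest.hexdigest()


def same_bytes(path_a, path_b):
    """Byte-for-byte comparison of two files, ignoring names and timestamps."""
    return filecmp.cmp(path_a, path_b, shallow=False)
```

If `md5_of_file("A.bam.bai")` and `md5_of_file("foo.bam.bai")` differ, the contents differ somewhere; the names alone cannot cause that.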
Maybe something like CompareSAMs from Picard tools:
The out.txt file will then tell you, in the last column, yes/no whether the files are identical, plus some extra stats on what is identical and what differs at the alignment level.
Output in this case with two identical dummy BAM files named A.bam and foo.bam:
Mind the Y (meaning yes, the files are identical) in the rightmost column.
Actually, I am trying to compare the BAI files, not the BAM files. Can I do this with CompareSAMs?
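If the headers of the BAM files differ, the md5 checksums will differ. In this case the headers will differ, because the header records the command used for sorting, and that command contains the different file names. Since a BAI file stores byte offsets into its BAM file, a header of a different length can shift those offsets and change the BAI as well.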
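One way to check the header explanation without extra tools is to pull the header text out of both BAM files and compare it. The sketch below uses only the Python standard library and relies on BGZF being gzip-compatible; the function names are invented for this example, and it assumes the sort tool wrote its command line into an @PG header line (as samtools typically does).

```python
import gzip
import struct


def read_bam_header_text(path):
    """Return the plain-text SAM header stored at the start of a BAM file."""
    with gzip.open(path, "rb") as fh:
        if fh.read(4) != b"BAM\x01":
            raise ValueError(f"{path} does not start with the BAM magic bytes")
        # l_text: little-endian 32-bit length of the header text that follows.
        (l_text,) = struct.unpack("<i", fh.read(4))
        text = fh.read(l_text)
    return text.rstrip(b"\x00").decode("utf-8", errors="replace")


def header_differences(path_a, path_b):
    """Return (lines only in A's header, lines only in B's header)."""
    lines_a = read_bam_header_text(path_a).splitlines()
    lines_b = read_bam_header_text(path_b).splitlines()
    only_a = [line for line in lines_a if line not in lines_b]
    only_b = [line for line in lines_b if line not in lines_a]
    return only_a, only_b
```

If the only differing lines are @PG records that mention the file names, the alignments are probably the same and the checksum difference comes from the header.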
See if the Q&A "Make bam index human readable" helps, but I would guess the problem is upstream of the BAI files. Maybe the two BAM files are not the same...
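On viewing a BAI file like a SAM file: a BAI is a small uncompressed binary file, so it can be decoded by hand. Below is a rough sketch following the BAI layout in the SAM/BAM specification (magic bytes, then per reference a list of bins with chunks and a linear index). `parse_bai` and `split_virtual_offset` are names chosen for this example, not part of any library, and the sketch ignores the optional trailing count of unplaced reads.

```python
import struct


def split_virtual_offset(voffset):
    """Split a BGZF virtual offset into (compressed block offset, offset within block)."""
    return voffset >> 16, voffset & 0xFFFF


def parse_bai(path):
    """Return one dict per reference with its bins (bin id -> chunks) and linear index."""
    with open(path, "rb") as fh:
        data = fh.read()
    if data[:4] != b"BAI\x01":
        raise ValueError(f"{path} does not start with the BAI magic bytes")
    pos = 4
    (n_ref,) = struct.unpack_from("<i", data, pos)
    pos += 4
    refs = []
    for _ in range(n_ref):
        (n_bin,) = struct.unpack_from("<i", data, pos)
        pos += 4
        bins = {}
        for _ in range(n_bin):
            bin_id, n_chunk = struct.unpack_from("<Ii", data, pos)
            pos += 8
            chunks = []
            for _ in range(n_chunk):
                # Each chunk is a (start, end) pair of virtual file offsets.
                chunk_beg, chunk_end = struct.unpack_from("<QQ", data, pos)
                pos += 16
                chunks.append((chunk_beg, chunk_end))
            bins[bin_id] = chunks
        (n_intv,) = struct.unpack_from("<i", data, pos)
        pos += 4
        intervals = list(struct.unpack_from(f"<{n_intv}Q", data, pos))
        pos += 8 * n_intv
        refs.append({"bins": bins, "intervals": intervals})
    return refs
```

Comparing `parse_bai("A.bam.bai")` with `parse_bai("foo.bam.bai")` would show whether the two indexes have the same bins and only the offsets are shifted, which is what you would expect if the BAM headers have different lengths.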