I'm trying to call variants for a large number of BAM files, but I keep getting the following error:
##### ERROR contig reads is named chr9 with length 138394717 and MD5 6c198acf68b5af7b9d676dfdd531b5de
##### ERROR contig reference is named chr9 with length 138394717 and MD5 addd2795560986b7491c40b1faa3978a.
I haven't seen this case in any other posts: the contig length is the same, but the MD5 is different. These BAM files came from TCGA, so in theory they were aligned to hg38, and I used an hg38 reference for variant calling.
Any help?
Thanks, I downloaded the reference file from TCGA. Should I be concerned that the MD5 from this file doesn't match either of the MD5s that were in my error?
From your link: GRCh38.d1.vd1.fa.tar.gz md5: 3ffbcfe2d05d43206f57f81ebb251dc9
From my error: contig reads: 6c198acf68b5af7b9d676dfdd531b5de contig reference: addd2795560986b7491c40b1faa3978a.
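A note on why these would not be expected to match: the md5 listed next to the download is a checksum of the whole tar.gz archive, while the MD5s in the error are per-contig sequence digests. Per the SAM specification, the M5 value is the MD5 of the contig sequence in uppercase with whitespace removed. Here is a minimal Python sketch for computing that per-contig value from an uncompressed FASTA so it can be compared against the two MD5s in the error. The function name contig_md5s is just illustrative, and the file name in the usage comment assumes the archive was extracted to GRCh38.d1.vd1.fa.

```python
import hashlib


def contig_md5s(fasta_lines):
    """Yield (contig_name, md5_hex) for each record in an iterable of FASTA lines.

    The digest follows the SAM-spec M5 convention: sequence uppercased,
    all whitespace (including line breaks) removed.
    """
    name = None
    digest = None
    for raw in fasta_lines:
        line = raw.strip()
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if name is not None:
                yield name, digest.hexdigest()
            parts = line[1:].split()
            name = parts[0] if parts else ""
            digest = hashlib.md5()
        elif name is not None and line:
            # Drop any internal whitespace, then uppercase before hashing.
            digest.update("".join(line.split()).upper().encode("ascii"))
    if name is not None:
        yield name, digest.hexdigest()


# Hypothetical usage (file name is an assumption):
# with open("GRCh38.d1.vd1.fa") as fh:
#     for contig, md5 in contig_md5s(fh):
#         if contig == "chr9":
#             print(contig, md5)
```

If the chr9 value printed this way matches the "contig reference" MD5 but not the "contig reads" MD5, the BAM files were most likely aligned to a different build or variant of the reference.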
You can also try checking the BAM file header (samtools view -H file.bam). It may have the file name of the FASTA file. Maybe it is not GRCh38.d1.vd1.fa.
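With a large number of BAM files, reading each header by eye gets tedious, so here is a rough Python sketch that pulls the @SQ records out of the header text. It assumes the header has already been captured as a string (for example from samtools view -H), and that the aligner wrote the optional M5 and UR tags, which is not guaranteed. The function name parse_sq_lines is made up for illustration.

```python
def parse_sq_lines(header_text):
    """Return {contig_name: {tag: value}} for every @SQ line in a SAM header."""
    contigs = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Fields look like TAG:value; split on the first colon only, since
        # values such as UR:file:/path/ref.fa contain colons themselves.
        tags = dict(
            field.split(":", 1)
            for field in line.split("\t")[1:]
            if ":" in field
        )
        if "SN" in tags:
            contigs[tags["SN"]] = tags
    return contigs


# Hypothetical usage, assuming samtools is on PATH:
# import subprocess
# header = subprocess.run(
#     ["samtools", "view", "-H", "file.bam"],
#     capture_output=True, text=True, check=True,
# ).stdout
# chr9 = parse_sq_lines(header).get("chr9", {})
# print(chr9.get("LN"), chr9.get("M5"), chr9.get("UR"))
```

If M5 is present for chr9, it should equal the "contig reads" MD5 from the error, and UR (when present) points at the FASTA that was used for alignment.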