Hi!
I hope you and your loved ones are doing well while we are all working remotely during the COVID era.
I am trying to call somatic mutations in mm10 WES data using Strelka2 with the following command (as suggested on their GitHub page):
$configureStrelkaSomaticWorkflow.py --normalBam /Dir_to_normal_bam/My_final_normal.bam --tumorBam /Dir_to_tumor/My_final_tumor.bam --referenceFasta /Dir_to_mm10/mm10.fa --runDir demo_somatic --exome
It keeps failing with the following error:
CONFIGURATION ERROR:
Reference genome mismatch: Reference fasta file is missing a chromosome found in the normal BAM/CRAM file: 'chr1_GL456210_random'
Needless to say, I have already sorted, re-indexed, extracted only known chromosomes into new BAM files, and indexed the new BAM files using samtools for both the normal and tumor BAM files, using the following snippet:
samtools sort my_normal_bam.bam -o sorted_my_normal_bam.bam && samtools index sorted_my_normal_bam.bam && samtools view -o My_final_normal.bam sorted_my_normal_bam.bam `seq 1 19 | sed 's/^/chr/' && echo X | sed 's/^/chr/' && echo Y | sed 's/^/chr/'`
Any suggestions or comments would be highly appreciated.
You got it! Extremely helpful hint, thank you! I checked the BAM header, and all of those unknown chromosomes are there.
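For anyone landing here with the same error, the header check can be sketched roughly as below. This is only a sketch: `header_contigs` is a helper name made up for illustration, and the `mm10.fa.fai` index is assumed to sit next to the reference (as created by `samtools faidx`). Selecting regions with `samtools view` keeps the full header, so the `@SQ` lines for the random/unplaced contigs remain even when no reads map to them.

```shell
# Hypothetical helper: print the SN (contig name) of every @SQ line
# in a SAM header read from stdin.
header_contigs() {
  awk -F'\t' '$1 == "@SQ" {
    for (i = 2; i <= NF; i++)
      if ($i ~ /^SN:/) print substr($i, 4)
  }'
}

# Compare the contigs declared in the BAM header with those in the reference.
# Paths are the placeholders from the post; the .fai file is an assumption.
bam=/Dir_to_normal_bam/My_final_normal.bam
fai=/Dir_to_mm10/mm10.fa.fai
if command -v samtools >/dev/null 2>&1 && [ -f "$bam" ] && [ -f "$fai" ]; then
  samtools view -H "$bam" | header_contigs | sort > bam_contigs.txt
  cut -f1 "$fai" | sort > fasta_contigs.txt
  # Names listed in the BAM header but absent from the reference fasta
  comm -23 bam_contigs.txt fasta_contigs.txt
fi
```

Any name printed by the last command is declared in the BAM header but missing from the fasta, which is the kind of mismatch the configuration error above reports.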
The only way I can think of to get rid of the unknown/random chromosomes in the header is:
sed -i '/^@SQ.*_/d' my_sam_file.sam
This is tedious, though. Do you have any suggestions?
Thank you so much!
Or try the method described in the threads 'How to change a BAM file so the chromosome identifier is "chr 1" not just "1"' or 'How to change SM tag in bam file'.
I did not know about samtools reheader. This was very helpful! Thank you so much!
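In case it helps others, here is a minimal sketch of how `samtools reheader` could replace the SAM round trip. The helper name `drop_unplaced_sq` and the output file names are made up for illustration, and the sketch assumes, like the `sed` pattern earlier in the thread, that every contig to drop has an underscore in its name. One caveat as far as I understand it: for BAM, `reheader` swaps the header without renumbering the reads, so this should only be safe when the removed `@SQ` lines come after all the retained ones and no reads map to them.

```shell
# Hypothetical helper: drop @SQ lines whose contig name (SN) contains an
# underscore, e.g. chr1_GL456210_random; all other header lines pass through.
drop_unplaced_sq() {
  awk -F'\t' '
    $1 == "@SQ" {
      for (i = 2; i <= NF; i++)
        if ($i ~ /^SN:/ && $i ~ /_/) next
    }
    { print }
  '
}

# File names below are placeholders based on the thread.
bam=My_final_normal.bam
if command -v samtools >/dev/null 2>&1 && [ -f "$bam" ]; then
  samtools view -H "$bam" | drop_unplaced_sq > clean_header.sam
  samtools reheader clean_header.sam "$bam" > My_reheadered_normal.bam
  samtools index My_reheadered_normal.bam
fi
```

The same steps would need to be repeated for the tumor BAM. If the unwanted contigs are interleaved with the canonical chromosomes in the header, filtering the full SAM text and converting back to BAM is probably the safer route.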