Hi everyone.
I am trying to split a BAM file to obtain a BAM file containing only chromosome 19. I tried both samtools and bamtools.
With samtools:
samtools view -b /dbfs/mnt/vg/wgs_bam%2FNA12878_20k_hg38%2FNA12878.bam chr19 > /dbfs/mnt/vg/wgs_bam%2FNA12878_20k_hg38%2FNA12878.chr19.bam
And bamtools:
bamtools split -in /dbfs/mnt/vg/wgs_bam%2FNA12878_20k_hg38%2FNA12878.bam -reference
In both cases, the header of the split file still lists all chromosomes. I tried the same thing with other BAM files and always get the same result: the header of the output contains all chromosomes instead of only chromosome 19. I would be very grateful for any advice.
Thank you a lot.
What do you mean by:
Can we have a visual representation of this?
Thank you a lot! This is exactly what I overlooked. For a specific variant caller, I need to modify the header of the BAM file so that it contains only the information for chromosome 19. Any advice on how to do it? Thank you a lot.
AFAIK this is not necessary. Keep in mind that most variant callers offer options to use only reads from a certain region or chromosome, so there is often no need to filter the BAM file. Which one do you use?
It is NeuSomatic, a recently developed variant caller based on a neural network. It seems to me that the headers of the reference and BAM files must match completely.
I think that should never be a requirement, since it forces people to mess with the headers of their BAM files. Maybe it is worth asking the authors of the package to change this. Anyway, if you really want to do this, here's a way (at your own risk):
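One possible sketch (an assumption, not necessarily the exact command meant above): stream the chr19 reads as SAM text, drop every @SQ header line except the one for chr19, and re-encode to BAM. Going through SAM text rather than only swapping the header matters because BAM records store the reference as a numeric index into the @SQ list, so the records have to be re-encoded against the shortened list. The function names keep_one_contig and subset_bam are made up for illustration, and the input is assumed to be coordinate-sorted and indexed.

```shell
#!/bin/sh
# Hypothetical helper: pass SAM text through unchanged, except that
# @SQ header lines are kept only when their SN: field equals contig $1.
keep_one_contig() {
    awk -v sn="SN:$1" -F '\t' '
        !/^@SQ/ { print; next }                        # non-@SQ lines pass through
        { for (i = 2; i <= NF; i++) if ($i == sn) { print; next } }
    '
}

# Hypothetical wrapper: $1 = input BAM (sorted + indexed), $2 = contig, $3 = output BAM.
subset_bam() {
    samtools view -h "$1" "$2" \
        | keep_one_contig "$2" \
        | samtools view -b -o "$3" -
}
```

Usage would look like `subset_bam NA12878.bam chr19 NA12878.chr19.bam`, followed by `samtools view -H NA12878.chr19.bam` to check that only one @SQ line remains. Reads whose mate (or supplementary alignment) lies on another chromosome still name that contig in their records. samtools may warn about these or treat the mate as unmapped, and the behaviour may differ between versions, so check the output before feeding it to the variant caller.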