Hello,
I have subset my BAM file to certain chromosomes. Now I would like to change the header before doing the SNP calling, so that it contains only the chromosomes I selected.
One way I know is to take all the header lines that start with @SQ, intersect them with the chromosome set I want, put them back together with the other parts of the header, and then use samtools reheader. I was wondering if there is an easier way to do this, that is, to change the BAM header based on the set of chromosomes you've selected.
Thanks in advance
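For reference, the manual approach described in the question can be sketched in a few lines of Python. This is only a minimal sketch: it assumes the header has already been dumped to text, and the function name `filter_sq_lines` and the example header are made up for illustration.

```python
def filter_sq_lines(header_text, keep):
    """Keep every non-@SQ header line, plus the @SQ lines whose SN is in `keep`."""
    keep = set(keep)
    kept_lines = []
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            # Header fields are tab-separated TAG:VALUE pairs after the record type.
            tags = dict(
                field.split(":", 1) for field in line.split("\t")[1:] if ":" in field
            )
            if tags.get("SN") not in keep:
                continue  # drop @SQ lines for chromosomes that were not selected
        kept_lines.append(line)
    return "".join(line + "\n" for line in kept_lines)


# Hypothetical header, for illustration only.
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "@SQ\tSN:chr2\tLN:2000\n"
    "@SQ\tSN:chr3\tLN:3000\n"
    "@PG\tID:bwa\tPN:bwa\n"
)
new_header = filter_sq_lines(example_header, ["chr1", "chr3"])
```

One caveat: BAM records refer to their reference sequence by its position in the @SQ list, and samtools reheader replaces only the header, so after dropping @SQ lines it is worth viewing a few records to confirm they still point at the right chromosome.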
More simply, since the BAM file was already subset, doing nothing would suffice: the SNP caller isn't going to call variants where there are no alignments. But yes, giving the variant caller a BED file is the simplest solution overall, and it allows you to skip subsetting the BAM file as well.
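If it helps, a BED file covering whole chromosomes can be built straight from the @SQ lines of the header. The sketch below assumes the header is available as text; `whole_chromosome_bed` and the example header are hypothetical names for illustration. How the BED file is then passed to the variant caller depends on the tool, so check its documentation.

```python
def whole_chromosome_bed(header_text, keep):
    """Return BED text with one interval per selected chromosome."""
    keep = set(keep)
    rows = []
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        if tags.get("SN") in keep:
            # BED is 0-based and half-open, so a whole chromosome is 0..LN.
            rows.append(f"{tags['SN']}\t0\t{tags['LN']}")
    return "".join(row + "\n" for row in rows)


# Hypothetical header, for illustration only.
bed_example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:chr1\tLN:1000\n"
    "@SQ\tSN:chr2\tLN:2000\n"
    "@SQ\tSN:chr3\tLN:3000\n"
)
bed_text = whole_chromosome_bed(bed_example_header, ["chr1", "chr3"])
```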
Thank you so much for this. What about my FASTA reference? It contains all the sequences listed in the BAM file header. Would that work too, or should I provide a subset of my reference containing only the chromosomes I chose?
The FASTA file should match the header (in fact, it should really be the exact file that you mapped against). Not abiding by this rule can, in principle, lead to completely meaningless results.
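A rough way to check this is to compare the @SQ names and lengths in the header against the FASTA index (the .fai file, whose first two tab-separated columns are sequence name and length). This is only a sketch with hypothetical names (`header_vs_fai` and the example inputs), and matching names and lengths does not prove the sequences themselves are identical.

```python
def header_vs_fai(header_text, fai_text):
    """List header sequences that are missing from, or differ in length in, the .fai."""
    header = {}
    for line in header_text.splitlines():
        if line.startswith("@SQ"):
            tags = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
            header[tags["SN"]] = int(tags["LN"])

    fasta = {}
    for line in fai_text.splitlines():
        if line.strip():
            name, length = line.split("\t")[:2]
            fasta[name] = int(length)

    problems = []
    for name, length in header.items():
        if name not in fasta:
            problems.append(f"{name}: in BAM header but not in FASTA index")
        elif fasta[name] != length:
            problems.append(
                f"{name}: length {length} in header, {fasta[name]} in FASTA index"
            )
    return problems


# Hypothetical inputs, for illustration only.
check_header = "@SQ\tSN:chr1\tLN:1000\n@SQ\tSN:chr2\tLN:2000\n"
check_fai = "chr1\t1000\t6\t60\t61\nchr2\t2500\t1030\t60\t61\n"
problems = header_vs_fai(check_header, check_fai)
```

An empty list means every sequence in the header has a same-named, same-length entry in the index.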
Perfect, thanks a lot, I will be extra careful about these issues now.