5.2 years ago
evelyn
Dear all,
I want to split a sorted BAM file by chromosome and keep the header in each split file. I want to run variant calling for multiple samples chromosome by chromosome, because I have many samples and processing them all together would take a long time. I did this with SAM files earlier, but I am not sure whether sorted BAM files can be used for this.
Thank you very much!
Yes, you can also split a BAM file by chromosome, if that is the question.
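A minimal sketch of such a split, assuming samtools is installed and the BAM is coordinate-sorted and indexed (for example with samtools index). The file names and the helper split_commands are hypothetical. BAM output written by samtools view -b includes the header, so each per-chromosome file carries it.

```python
import shlex


def split_commands(bam, chromosomes):
    """Build one `samtools view` command per chromosome (hypothetical helper)."""
    prefix = bam[:-4] if bam.endswith(".bam") else bam
    commands = []
    for chrom in chromosomes:
        out = f"{prefix}.{chrom}.bam"
        # -b writes BAM (header included); the trailing region argument
        # restricts the output to one chromosome and requires a BAM index.
        commands.append(["samtools", "view", "-b", "-o", out, bam, chrom])
    return commands


# Print the commands instead of running them, so they can be inspected first.
for cmd in split_commands("sample.sorted.bam", ["chr1", "chr2"]):
    print(shlex.join(cmd))
```

The chromosome names must match the names in the BAM header (chr1 versus 1, for instance).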
You don't need to split the bam file to do the variant calling per chromosome.
Because there are multiple large samples, variant calling will take a long time. That's why I want to split the files by chromosome.
You can do the variant calling separately per chromosome without splitting the BAM. An indexed BAM allows random access. Which variant caller are you using?
I am using bcftools. Can it work on multiple BAM files together for each chromosome? My whole point is to reduce computational time.
Then look at bcftools call -r/-R
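As a rough sketch of per-region calling on several BAM files at once, the snippet below builds one bcftools mpileup | bcftools call pipeline per chromosome. It is an illustration under assumptions, not the exact command from this thread: the region is passed to mpileup with -r, the BAM files are indexed, and the file names and the helper calling_pipeline are hypothetical.

```python
import shlex


def calling_pipeline(reference, bams, region, out_vcf):
    """Return a shell pipeline calling variants in one region (hypothetical helper)."""
    # -Ou: uncompressed BCF between the two steps; -f: reference FASTA;
    # -r: restrict the pileup to this region (needs indexed BAM files).
    mpileup = ["bcftools", "mpileup", "-Ou", "-f", reference, "-r", region, *bams]
    # -m: multiallelic caller; -v: variant sites only; -Oz: bgzipped VCF output.
    call = ["bcftools", "call", "-m", "-v", "-Oz", "-o", out_vcf]
    return f"{shlex.join(mpileup)} | {shlex.join(call)}"


bams = ["sample1.bam", "sample2.bam"]  # assumed file names
for chrom in ["chr1", "chr2"]:
    print(calling_pipeline("ref.fa", bams, chrom, f"calls.{chrom}.vcf.gz"))
```

Each printed pipeline is independent of the others, so they can be run in parallel (one job per chromosome), and the per-chromosome outputs can then be joined with bcftools concat.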
Thank you! I tried using:
But I got an error:
Then I tried:
Again I got an error:
I am not sure which file format to use now. Thank you for your help!
If you're worried about the variant calling taking a very long time, most variant callers (GATK/freebayes) can run on many threads, which makes the process faster. There are other ways to speed up variant calling; calling chromosome by chromosome is not standard.
I am not sure about the standard ways to make variant calling faster for multiple samples with bcftools. Can you please share if you are aware of any such way? Thank you so much!