I would like to double-check whether to use VCF concatenate or VCF merge on my per-chromosome files. I called SNPs with FreeBayes, but split the run by chromosome so that the calling could be done in parallel. I also split some particularly large chromosomes by position into non-overlapping ranges, e.g. 1:500,000; 500,001:1,000,000, with the end result being:
Chromosome1-position-1:500,000
Chromosome1-position-500,001:1,000,000
Chromosome2
Chromosome3
First, I want to combine the VCF files for each particularly large chromosome that was split into multiple position ranges for processing, i.e.
Chromosome1-position-1:500,000
+ Chromosome1-position-500,001:1,000,000
--> Chromosome1
Then I want to combine all of the separate VCF files into 1, i.e.
Chromosome 1 {merged from step 1}
+ Chromosome 2
+ Chromosome 3
--> SNPs
The VCFtools manual leads me to believe that vcf-concat is the appropriate command for both steps, as these are all separate files covering separate chromosomes (or separate ranges of one chromosome) that I just need to stitch back together, but I'm unsure if this is the case. Any advice would be appreciated.
Not what you are asking for, but the currently recommended tool for things like this is bcftools, and no longer VCFtools.
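As for the question itself: concatenation is the operation for files that hold the same samples but different regions, which is the case in both steps described, while merging is for files that hold different samples. In bcftools these are the concat and merge subcommands. To make the distinction concrete, here is a minimal Python sketch of what concatenation amounts to. The function name concat_vcf_lines and the toy records are made up for illustration, and the sketch assumes the inputs are passed in the desired output order and that the first file's header is valid for all of them. It is not how bcftools is implemented.

```python
from typing import Iterable, List


def concat_vcf_lines(vcfs: Iterable[List[str]]) -> List[str]:
    """Stack the records of several VCFs that share the same samples.

    Each input is a list of VCF lines. Inputs must be given in the
    desired output order (e.g. chr1 part 1, chr1 part 2, chr2, ...).
    The header is taken from the first input only (assumption: it
    describes every contig that appears in the later inputs).
    """
    out: List[str] = []
    column_line = None  # the "#CHROM ..." line of the first input
    for index, lines in enumerate(vcfs):
        for raw in lines:
            line = raw.rstrip("\n")
            if not line:
                continue
            if line.startswith("#CHROM"):
                if column_line is None:
                    column_line = line
                elif line != column_line:
                    # Different sample columns means the files hold
                    # different samples: that is a merge, not a concat.
                    raise ValueError("sample columns differ between inputs")
            if line.startswith("#"):
                if index == 0:
                    out.append(line)  # keep only the first header
            else:
                out.append(line)  # records are appended unchanged
    return out


# Toy inputs: one sample, chromosome 1 split in two ranges, plus chromosome 2.
header = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsampleA",
]
chr1_part1 = header + ["1\t1200\t.\tA\tG\t50\t.\t.\tGT\t0/1"]
chr1_part2 = header + ["1\t500100\t.\tC\tT\t60\t.\t.\tGT\t1/1"]
chr2 = header + ["2\t300\t.\tG\tA\t40\t.\t.\tGT\t0/1"]

combined = concat_vcf_lines([chr1_part1, chr1_part2, chr2])
print("\n".join(combined))
```

Because the ranges do not overlap and every file has the same sample columns, both steps (range files into one chromosome, then chromosomes into one file) reduce to this kind of stacking, and they can be done in a single pass by listing all the files in order.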