Hello!
I have 1000 Genomes data, with one file per chromosome. It is easier to explain with an example. This is one of the files I have: ALL.chr1.phase3_shapeit2_mvncall_integrated_v5.20130502.genotypes.vcf.gz
There is another file that contains a smaller set of individuals who are related to some of the individuals in the file I just mentioned. I want to merge the two files. The file I want to merge into the first one is: ALL.chr1.phase3_shapeit2_mvncall_integrated_v5_related_samples.20130502.genotypes.vcf.gz
I tried to merge them using the following command:
vcf-merge \
ALL.chr1.phase3_shapeit2_mvncall_integrated_v5.20130502.genotypes.vcf.gz \
ALL.chr1.phase3_shapeit2_mvncall_integrated_v5_related_samples.20130502.genotypes.vcf.gz | \
bgzip -c > ALLchr1.vcf.gz
It starts running, but it takes a very long time and never finishes. I thought I could maybe convert the files to PED format and then merge them with PLINK, but I don't know whether that is feasible or any faster. Can anybody recommend a faster way to do this, perhaps with different software or a different approach?
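For example, would something along these lines be a reasonable replacement? This is only a sketch of what I would try with bcftools merge, assuming bcftools and tabix are installed. The merge_chr helper and the DRY_RUN variable are names I made up for illustration, the file name pattern is taken from the chr1 example above and may not hold for every chromosome, and I have not confirmed that this is actually faster on these files. By default it only prints the commands instead of running them.

```shell
#!/bin/sh
# Sketch only: merge the main and related-samples VCFs for one chromosome.
# merge_chr and DRY_RUN are hypothetical names, not part of any tool.
# With DRY_RUN=1 (the default here) commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

merge_chr() {
    chr="$1"
    # File names follow the chr1 example; this pattern is an assumption.
    main="ALL.${chr}.phase3_shapeit2_mvncall_integrated_v5.20130502.genotypes.vcf.gz"
    related="ALL.${chr}.phase3_shapeit2_mvncall_integrated_v5_related_samples.20130502.genotypes.vcf.gz"
    out="ALL${chr}.vcf.gz"

    # bcftools merge needs bgzip-compressed, indexed inputs;
    # build a tabix index for any input that has no .tbi file next to it.
    for f in "$main" "$related"; do
        [ -e "${f}.tbi" ] || run tabix -p vcf "$f"
    done

    # -O z writes bgzip-compressed VCF, -o names the output file.
    run bcftools merge -O z -o "$out" "$main" "$related"
}

merge_chr chr1
```

Setting DRY_RUN=0 before running the script would execute the commands for real. The same function could be called in a loop over the other chromosomes if the file names match.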
Thank you!