I have 421 VCF files. I want to convert all of them to PED files (PLINK format),
and then merge these 421 PED files into a single PED file. What is the best way to do this?
Thanks a lot!
This isn't what worked for me, but I compiled the snippets here for posterity.
This thread was very helpful, thank you all. Medhat's link has moved over the years; it is now here: https://zzz.bwh.harvard.edu/plink/dataman.shtml#mergelist
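For the original question, the merge-list approach described at that link could look roughly like the sketch below. It assumes PLINK 1.9 is on the PATH as `plink`. The directory names (`vcf`, `ped`), the `.vcf.gz` suffix and the `DRY_RUN` switch are placeholders of mine, not something from this thread. Each line of the merge list names one `.ped` and one `.map` file, and the first fileset is passed with `--file`, so it is left out of the list.

```shell
#!/usr/bin/env bash
# Sketch: convert each VCF to PED/MAP with PLINK 1.9, then merge via --merge-list.
# Assumptions (not from the thread): VCFs live in $VCF_DIR and end in .vcf.gz,
# and `plink` is PLINK 1.9. DRY_RUN=1 (default) only prints the plink commands.

VCF_DIR="${VCF_DIR:-vcf}"
OUT_DIR="${OUT_DIR:-ped}"
DRY_RUN="${DRY_RUN:-1}"

# Run a command, or just print it when DRY_RUN=1.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Print one output prefix per input VCF.
list_prefixes() {
  local f
  for f in "$VCF_DIR"/*.vcf.gz; do
    [ -e "$f" ] || continue   # no matches: the glob stays literal, skip it
    echo "$OUT_DIR/$(basename "$f" .vcf.gz)"
  done
}

# Print the merge list: every fileset except the first, as ".ped .map" pairs.
write_merge_list() {
  local p
  list_prefixes | tail -n +2 | while read -r p; do
    echo "$p.ped $p.map"
  done
}

main() {
  local first f
  first="$(list_prefixes | head -n 1)"
  if [ -z "$first" ]; then
    echo "no .vcf.gz files found in $VCF_DIR" >&2
    return 0
  fi
  mkdir -p "$OUT_DIR"
  # Step 1: one PED/MAP fileset per VCF.
  for f in "$VCF_DIR"/*.vcf.gz; do
    run plink --vcf "$f" --recode --out "$OUT_DIR/$(basename "$f" .vcf.gz)"
  done
  # Step 2: merge all other filesets into the first one.
  # The list file is written even in dry-run mode so it can be inspected.
  write_merge_list > "$OUT_DIR/merge_list.txt"
  run plink --file "$first" --merge-list "$OUT_DIR/merge_list.txt" \
    --recode --out "$OUT_DIR/merged"
}

main
```

Running it as-is only prints the commands; setting `DRY_RUN=0` would actually call `plink`.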
This is what worked for me:
gatk CombineGVCFs \
-R Homo_sapiens_assembly38.fasta \
--variant myfile1.g.vcf.gz \
--variant myfile2.g.vcf.gz \
-O cohort.g.vcf.gz
gatk GenotypeGVCFs \
-R Homo_sapiens_assembly38.fasta \
-V cohort.g.vcf.gz \
--annotations-to-exclude InbreedingCoeff \
-O cohort.vcf.gz
Then following Kevin Blighe's post here: Produce PCA bi-plot for 1000 Genomes Phase III - Version 2
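With hundreds of files, typing one `--variant` line per gVCF gets tedious, so the arguments can be built in a loop. This is only a sketch: the `gvcf` directory, the `.g.vcf.gz` suffix and the `DRY_RUN` switch are assumptions of mine, and it only applies if the inputs really are gVCFs as in the commands above.

```shell
#!/usr/bin/env bash
# Sketch: build the --variant arguments for many gVCFs instead of typing them.
# Assumptions (not from the thread): gVCFs are in $GVCF_DIR and end in .g.vcf.gz.
# DRY_RUN=1 (default) prints the gatk command instead of running it.

GVCF_DIR="${GVCF_DIR:-gvcf}"
REF="${REF:-Homo_sapiens_assembly38.fasta}"
DRY_RUN="${DRY_RUN:-1}"

combine_gvcfs() {
  local args=() f
  for f in "$GVCF_DIR"/*.g.vcf.gz; do
    [ -e "$f" ] || continue   # no matches: the glob stays literal, skip it
    args+=(--variant "$f")
  done
  if [ "${#args[@]}" -eq 0 ]; then
    echo "no .g.vcf.gz files found in $GVCF_DIR" >&2
    return 0
  fi
  local cmd=(gatk CombineGVCFs -R "$REF" "${args[@]}" -O cohort.g.vcf.gz)
  if [ "$DRY_RUN" = "1" ]; then
    echo "${cmd[*]}"
  else
    "${cmd[@]}"
  fi
}

combine_gvcfs
```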
Thanks for the syntax. Could you please specify which version of PLINK was used for the conversion and merging? I am working on exome VCF files from an Ion Proton. I converted the VCFs to PED/MAP files using PLINK 1.9, but when I try to merge the PED/MAP filesets with PLINK 1.9, it gives an error about multiallelic SNPs. I had already removed multiallelic records from the VCF files using bcftools.
It still gives the same error. Any further help on this? Many thanks.
If you use the latest plink 1.9 build, the multiallelic-SNP error message will include the URL for https://www.cog-genomics.org/plink/1.9/data#merge3 .
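One likely reason the error survives the bcftools filtering: a site can be biallelic in every single file and still end up with three or more alleles once the files are combined. When the merge fails this way, PLINK 1.9 writes the offending variant IDs to `<out>-merge.missnp`. A rough sketch of excluding those variants from every fileset before retrying the merge is below. The file names `merged-merge.missnp` and `prefixes.txt` (one PED/MAP prefix per line), the `_clean` suffix and the `DRY_RUN` switch are placeholders of mine, not from this thread.

```shell
#!/usr/bin/env bash
# Sketch: exclude the variants listed in <out>-merge.missnp from every fileset,
# so the merge can be retried on the cleaned filesets.
# Assumptions (not from the thread): the failed merge used --out merged, and
# prefixes.txt (hypothetical) lists one PED/MAP prefix per line.
# DRY_RUN=1 (default) prints the plink commands instead of running them.

MISSNP="${MISSNP:-merged-merge.missnp}"
PREFIXES="${PREFIXES:-prefixes.txt}"
DRY_RUN="${DRY_RUN:-1}"

exclude_missnp() {
  local p cmd
  if [ ! -f "$PREFIXES" ]; then
    echo "prefix list $PREFIXES not found" >&2
    return 0
  fi
  # Read prefixes on fd 3 so plink cannot consume the list from stdin.
  while read -r p <&3 || [ -n "$p" ]; do
    [ -n "$p" ] || continue
    cmd=(plink --file "$p" --exclude "$MISSNP" --recode --out "${p}_clean")
    if [ "$DRY_RUN" = "1" ]; then
      echo "${cmd[*]}"
    else
      "${cmd[@]}"
    fi
  done 3< "$PREFIXES"
}

exclude_missnp
```

After that, the merge would be rerun against the `_clean` filesets. If the clash comes from strand flips rather than true multiallelic sites, the page linked above describes using `--flip` instead.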