How to convert multiple vcf files to a ped file?
8.2 years ago
line1438 ▴ 40

I have 421 VCF files and want to convert all of them to PED files (PLINK format),

then merge the resulting 421 PED files into a single PED file. What is the best way to do this?

Thanks a lot!

plink vcftools • 7.4k views
8.2 years ago
Medhat 9.8k

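Using PLINK:
In the directory that contains your VCF files, run this command (assuming plink is on your PATH):

for i in *.vcf; do f="${i%.*}"; plink --vcf "$i" --maf 0.05 --recode --out "$f"; done

This writes a .ped/.map pair for each VCF. Note that --out takes a prefix, so the .ped extension should not be appended to it. Then merge the filesets with --merge-list, for example:

plink --file fA --merge-list allfiles.txt --make-bed --out mynewdata

Here fA is the first fileset and allfiles.txt lists the remaining filesets, one per line.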
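
For 421 filesets you will not want to type the merge list by hand. Below is a minimal sketch of one way to generate it, assuming every conversion produced a matching .ped/.map pair in one directory and that the paths contain no spaces. The function name make_merge_list is hypothetical, not part of PLINK.

```shell
# Build a PLINK merge list from the .ped/.map pairs found in a directory.
# Prints the prefix of the first fileset (for --file) on stdout and writes
# every other fileset to the list file, one "x.ped x.map" pair per line.
make_merge_list() {
    dir=$1
    list=$2
    first=""
    : > "$list"                          # start with an empty list file
    for ped in "$dir"/*.ped; do
        [ -e "$ped" ] || continue        # glob matched nothing
        prefix=${ped%.ped}
        [ -e "$prefix.map" ] || continue # skip .ped files without a .map
        if [ -z "$first" ]; then
            first=$prefix                # first fileset goes to --file
        else
            printf '%s %s\n' "$prefix.ped" "$prefix.map" >> "$list"
        fi
    done
    printf '%s\n' "$first"
}

# Example use (not run here, since it needs plink and real data):
#   first=$(make_merge_list . allfiles.txt)
#   plink --file "$first" --merge-list allfiles.txt --make-bed --out mynewdata
```

The first fileset is kept out of the list because it is already passed with --file, so listing it again would merge it with itself.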

Thanks for the syntax, but please specify the version of PLINK used for conversion and merging. I am working on exome VCF files from an Ion Proton. I converted the VCFs to PED/MAP files using PLINK 1.9, but when I try to merge the PED/MAP filesets with PLINK 1.9 it gives an error about multiallelic SNPs. I had already removed multiallelic records from the VCF files using bcftools:

bcftools view -m2 -M2 -v snps input.vcf

It still gives the same error. Any further help on this would be appreciated. Many thanks.


If you use the latest plink 1.9 build, the multiallelic-SNP error message will include the URL for https://www.cog-genomics.org/plink/1.9/data#merge3 .
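
One of the options described on that page is to drop the offending variants and retry the merge. A rough sketch follows, assuming the failed merge left behind a list of variant IDs (in PLINK 1.9 this is usually named <out>-merge.missnp) and that you are working with text filesets. The function name exclude_missnp and the _tmp suffix are made up for illustration, and the PLINK variable only exists so the binary can be swapped out.

```shell
# Binary to run; override PLINK to point at a specific build.
PLINK=${PLINK:-plink}

# Read fileset prefixes from stdin (one per line) and rewrite each text
# fileset without the variants listed in the given .missnp file.
# Filtered copies are written next to the originals with a "_tmp" suffix.
exclude_missnp() {
    missnp=$1
    while IFS= read -r prefix; do
        [ -n "$prefix" ] || continue     # ignore blank lines
        "$PLINK" --file "$prefix" --exclude "$missnp" \
                 --recode --out "${prefix}_tmp" || return 1
    done
}

# Example use (not run here, since it needs plink and real data):
#   printf '%s\n' fA fB | exclude_missnp mynewdata-merge.missnp
```

After this, the merge would be retried on the _tmp filesets. A variant can be biallelic within every single VCF and still end up with three alleles once the files are combined, which is why filtering each VCF with bcftools does not necessarily prevent the error.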

3.6 years ago
DNAvinci • 0

This isn't what worked for me, but I compiled the snippets here for posterity.

This thread was very helpful, thank you all. Medhat's link has moved over the years; it is here now: https://zzz.bwh.harvard.edu/plink/dataman.shtml#mergelist


This is what worked for me:

gatk CombineGVCFs \
   -R Homo_sapiens_assembly38.fasta \
   --variant myfile1.g.vcf.gz \
   --variant myfile2.g.vcf.gz \
   -O cohort.g.vcf.gz

gatk GenotypeGVCFs \
   -R Homo_sapiens_assembly38.fasta \
   -V cohort.g.vcf.gz \
   --annotations-to-exclude InbreedingCoeff \
   -O cohort.vcf.gz

Then I followed Kevin Blighe's post: Produce PCA bi-plot for 1000 Genomes Phase III - Version 2
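
For anyone who only needs the PED file from the original question: once all samples are in one joint-called VCF, a single conversion is enough and no merge step is required. A minimal sketch, assuming PLINK 1.9 and a .vcf or .vcf.gz input. The function name vcf_to_ped is hypothetical, and the PLINK variable is only there so the binary can be swapped out.

```shell
# Binary to run; override PLINK to point at a specific build.
PLINK=${PLINK:-plink}

# Convert one multi-sample VCF into a PLINK text fileset (.ped/.map).
# The output prefix is the input name with .gz and .vcf stripped.
vcf_to_ped() {
    vcf=$1
    prefix=${vcf%.gz}
    prefix=${prefix%.vcf}
    "$PLINK" --vcf "$vcf" --recode --out "$prefix"
}

# Example use (not run here, since it needs plink and real data):
#   vcf_to_ped path/to/joint_called.vcf.gz
```

With a reference such as GRCh38 that carries non-standard contig names, PLINK 1.9 may also need --allow-extra-chr. I have not tested that on this data.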
