Hello,
I have a .vcf file produced with GATK, containing SNP variants for 32 samples of tetraploid Triticum species. I would like to convert it with plink to get the .ped file for some downstream analyses (PCA, Admixture).
Here is the command I used:
plink --vcf all_triticum_SNPs_250kb.vcf --allow-extra-chr --recode --make-bed --out all_triticum_SNPs_250kb_plink
I got no error and the files are created, but the .ped file contains only "0" genotypes. Might it be a problem with ploidy? Here is what one site looks like in my vcf, with the GT field having 4 values:
Chr1A 51532915 . T C 8409.13 PASS AC=84;AF=1;AN=116;BaseQRankSum=0.579;DP=1899;FS=0;MLEAC=4;MLEAF=1;MQ=60;MQRankSum=0;QD=26.62;ReadPosRankSum=0.012;SOR=1.112 GT:AD:DP:GQ 1/1/1/1:0,49:49:61 0/1/1/1:9,42:51:47 0/0/0/1:129,50:179:76 0/0/0/1:42,13:55:35 0/0/1/1:47,39:86:35 0/1/1/1:19,33:52:1 0/0/1/1:19,23:42:17 0/0/1/1:18,20:38:18 0/0/1/1:27,22:49:19 1/1/1/1:0,128:128:99 1/1/1/1:0,123:123:99 1/1/1/1:0,74:74:92 1/1/1/1:0,40:40:50 1/1/1/1:0,55:55:69 1/1/1/1:0,82:82:99 0/0/1/1:49,35:84:19 1/1/1/1:0,50:50:62 1/1/1/1:0,45:45:56 1/1/1/1:0,66:66:82 1/1/1/1:0,61:61:76 1/1/1/1:0,45:45:56 1/1/1/1:0,198:198:99 0/0/0/1:9,3:12:7 0/1/1/1:10,25:35:14 0/0/0/1:18,3:21:23 0/0/1/1:22,18:40:15 ./././.:.:.:. 0/0/0/1:26,2:28:40 ./././.:.:.:. 0/0/1/1:21,30:51:10 1/1/1/1:0,42:42:52 ./././.:.:.:.
I tried to look for any option to set ploidy but I could not find any answer.
Many thanks for any suggestion you might give me.
Alice
Your suspicion is correct: plink does not directly support tetraploid data, so the four-allele GT calls end up as missing genotypes (the "0" entries in your .ped). You will probably need to use other software packages to analyze it.
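If it helps, one way to get the data into a form that ploidy-agnostic tools (e.g. a generic PCA) can read is to convert each tetraploid GT into an ALT-allele dosage between 0 and 4. The following is only a minimal sketch, not a plink feature: the function names are made up for illustration, and it assumes biallelic sites and that GT is the first FORMAT key, as in the record above.

```python
from typing import List, Optional


def gt_to_dosage(sample_field: str) -> Optional[int]:
    """Count ALT alleles in a sample field such as '0/1/1/1:9,42:51:47'.

    Returns None when any allele is missing ('.').
    Assumes GT is the first colon-separated key and the site is biallelic.
    """
    gt = sample_field.split(":", 1)[0]
    alleles = gt.replace("|", "/").split("/")  # accept phased or unphased calls
    if any(a == "." for a in alleles):
        return None
    return sum(1 for a in alleles if a != "0")


def record_to_dosages(vcf_line: str) -> List[Optional[int]]:
    """Return one dosage per sample for a single VCF data line."""
    # VCF is tab-delimited; split() also copes with the space-separated paste above.
    fields = vcf_line.split()
    return [gt_to_dosage(f) for f in fields[9:]]  # samples start at column 10


if __name__ == "__main__":
    # Shortened, hypothetical record built from the first few samples above.
    line = (
        "Chr1A 51532915 . T C 8409.13 PASS AC=84 GT:AD:DP:GQ "
        "1/1/1/1:0,49:49:61 0/1/1/1:9,42:51:47 0/0/0/1:129,50:179:76 ./././.:.:.:."
    )
    print(record_to_dosages(line))  # [4, 3, 1, None]
```

Looping over all non-header lines of the vcf would give a sites-by-samples dosage matrix. Missing calls (None) still need to be filtered or imputed before PCA.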