14 months ago
Ben
I used the following command to generate VCF files from my 23andMe zip file for the Michigan Imputation Server, but the upload keeps failing with the validation error "At least 20 samples must be uploaded."
java \
-jar vcf-tools-0.1.jar \
vcf-generator \
--in 23_n_me/23andme-tools-output/genome_name_v4_Full_20230822212500.zip \
--ref human_g1k_v37.fasta \
--out 23_n_me/23andme-tools-output/For_Imputation/ \
--exclude Y,MT
The command produces one .vcf.gz file per chromosome, each of which looks something like this:
##fileformat=VCFv4.2
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=1,length=249250621>
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT genome_name_v4_Full_20230822212500
1 734462 rs12564807 G A . . . GT 1/1
1 752721 rs3131972 A G . . . GT 1/1
1 838555 rs4970383 C A . . . GT 1/1
1 861808 rs13302982 A G . . . GT 0/1
1 873558 rs1110052 G T . . . GT 0/1
1 889159 rs13302945 A C . . . GT 1/1
1 891945 rs13303106 A G . . . GT 1/1
1 894573 rs13303010 G A . . . GT 1/1
1 909238 i6060381 G C . . . GT 1/1
1 918384 rs13303118 G T . . . GT 1/1
1 924898 rs6665000 C A . . . GT 1/1
1 927309 rs2341362 T C . . . GT 1/1
1 928836 rs9777703 C T . . . GT 1/1
1 948692 rs2341365 G A . . . GT 1/1
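For reference, the number of samples in a VCF can be read from the #CHROM header line: the first eight columns are fixed, the ninth is FORMAT, and every column after that is a sample ID. Below is a minimal Python sketch of that check, applied to the header line shown above. The function name count_samples is my own and is not part of vcf-tools or the imputation server, and I am assuming sample IDs contain no whitespace.

```python
def count_samples(header_line: str) -> int:
    """Return the number of sample columns in a VCF '#CHROM' header line.

    Hypothetical helper for illustration; assumes sample IDs contain no
    whitespace, so splitting on any whitespace is safe.
    """
    # Real VCF files are tab-delimited; split() handles tabs and spaces alike.
    fields = header_line.strip().split()
    if not fields or fields[0] != "#CHROM":
        raise ValueError("not a VCF column header line")
    # Columns 1-8 are fixed (CHROM..INFO). Without genotype data there is
    # no FORMAT column and therefore no samples.
    if len(fields) <= 8:
        return 0
    # Column 9 is FORMAT; everything after it is one sample per column.
    return len(fields) - 9


header = (
    "#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT "
    "genome_name_v4_Full_20230822212500"
)
print(count_samples(header))  # prints 1
```

By this count, each of my files contains a single sample column (genome_name_v4_Full_20230822212500).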
Does anyone know what might be wrong, or what I am missing? Is the output VCF format incorrect?