3.8 years ago
hemr3
Sorry if this is a repeat question, but none of the answers I could find helped.
This is the error:
ERROR: No nonmissing markers for individuals Ind1 Ind1 - Ind2 Ind2
Here are the steps I have done:
Convert .vcf to plink format (ped, map) and then to binary format (bed, bim, fam):
vcftools --vcf data.vcf --plink --out data
plink --file data.vcf --noweb --make-bed --recode --missing-genotype 0 --out data
This is where the error occurs:
plink1 --bfile data.vcf --cluster --noweb
I have tried to make another .clust file using:
bcftools query -l data.vcf.gz | awk '{split($1,pop,"."); print $1"\t"$1"\t"pop[2]}' > data.clust
Which looks like this:
Ind1 Ind1
Ind2 Ind2
Ind3 Ind3
But using this file results in the error:
Reading clusters from [ data.clust ]
Ind1 Ind1
ERROR: Problem reading from [ data.vcf.clust ]
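One possible cause of this read error, offered as a guess rather than a confirmed diagnosis: PLINK expects three columns in a cluster file (family ID, individual ID, cluster name), and the awk command above leaves the third column empty whenever a sample ID contains no "." to split on. The sketch below uses a made-up sample list in place of the bcftools output, and the fallback label "unknown" is a placeholder of my own, not a PLINK convention.

```shell
# Work in a scratch directory so nothing existing is overwritten.
tmpdir=$(mktemp -d)

# Hypothetical sample list standing in for `bcftools query -l data.vcf.gz`.
printf 'Ind1.popA\nInd2.popB\nInd3\n' > "$tmpdir/samples.txt"

# Write FID, IID and cluster name. When the ID has no "." suffix,
# fall back to the placeholder "unknown" so column 3 is never empty.
awk '{
    n = split($1, pop, ".")
    cluster = (n >= 2 && pop[2] != "") ? pop[2] : "unknown"
    print $1 "\t" $1 "\t" cluster
}' "$tmpdir/samples.txt" > "$tmpdir/data.clust"

cat "$tmpdir/data.clust"
```

With IDs like Ind1, Ind2, Ind3 that carry no population suffix, every individual would land in the placeholder cluster, so for a real analysis the population labels would have to come from somewhere else.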
I am preparing vcf files for TreeMix analysis, and these genome files do have a high degree of missingness, if that makes a difference.
The original .vcf file looks like this:
##FORMAT=<ID=GT,Number=1,Type=String,Description="Unphased Genotype">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT Deni4 Mez1
22 17049382 . C T . PASS . GT ./. ./.
I'd be very grateful for any help!
plink --file data.vcf ... won't work after vcftools --vcf data.vcf --plink --out data. You ran something slightly different, and for any troubleshooting question it is often critical for you to post EXACTLY what you ran.
vcftools --plink is less reliable, much slower, and throws away more information than plink 1.9 --vcf.
Thank you for your help! But I did run those commands, and they did work - but maybe they worked in a sort of janky way that made the files difficult/impossible to deal with. I'll update plink and try the --mind flag!
If it worked, it would be because the second command read DIFFERENT (preexisting) files than what your first command generated.
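A minimal sketch of the route suggested in the reply above, assuming plink 1.9 is installed as plink: import the VCF directly with --vcf and reuse the same --out prefix as the --bfile prefix in the next step, so the second command reads exactly the files the first one wrote. The run helper and the vcf_to_clusters function are hypothetical names for illustration, and --double-id is my assumption about the intended ID convention (it sets both family and individual ID to the sample name, as in "Ind1 Ind1").

```shell
# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Hypothetical wrapper: convert a VCF to bed/bim/fam, then cluster.
vcf_to_clusters() {
    vcf="$1"
    prefix="${vcf%.vcf}"   # data.vcf -> data, so every step shares one prefix

    # Import the VCF directly instead of going through vcftools --plink.
    run plink --vcf "$vcf" --double-id --make-bed --out "$prefix"

    # --bfile takes the prefix written by --out above, not the VCF name.
    run plink --bfile "$prefix" --cluster --out "$prefix"
}

# Dry run: shows the two commands without needing plink installed.
DRY_RUN=1
vcf_to_clusters data.vcf
```

Setting DRY_RUN=0 runs the commands for real. This only addresses the file-naming mismatch; whether clustering succeeds on samples with very few called genotypes is a separate question.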