I have a big VCF file that I need to convert, preferably to bed/bim/fam files that are readable by plink. I am currently using plink 1.9. I am aware that this version of plink can convert a VCF into a binary fileset using
plink --vcf file.vcf.gz --make-bed --out out
Or something similar. I also know that in this case plink will automatically fill the sex and phenotype columns of the .fam file with missing values (0 for sex, -9 for phenotype). I know there is a --pheno flag that allows specifying a text file with the phenotypes, and a --make-pheno flag that can be used to similar effect.
How exactly does this work? How can I encode the phenotype and sex information correctly? Can I just use --make-bed and then manually make a phenotype/sex file to use down the line with the --pheno flag? In that case, will it be a problem that the .bed generated in the first step does not contain this information? And what about sex? Another option I can think of is to edit the generated .fam so that it includes the correct information and use it downstream, but that information will still not be in the .bed file.
I have no family relationships, so the FID fields would all be 0 (I guess I can add the --const-fid flag for that purpose when converting, right?).
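To make the .fam-editing option concrete, here is a minimal sketch of what I have in mind. It does not call plink: a toy out.fam stands in for the one --make-bed would write, and sample_info.txt (IID, sex, phenotype) is a hypothetical file I would put together by hand. I am assuming plink's usual coding (sex: 1 = male, 2 = female, 0 = unknown; phenotype: 1 = control, 2 = case, -9 = missing).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Toy stand-in for the .fam written by --make-bed: columns are
# FID IID father mother sex phenotype, with sex 0 and phenotype -9 (both missing).
printf '0 S1 0 0 0 -9\n0 S2 0 0 0 -9\n0 S3 0 0 0 -9\n' > out.fam

# Hypothetical sample sheet (my own file, not something plink produces):
# IID, sex (1=male, 2=female), phenotype (1=control, 2=case).
printf 'S1 1 2\nS2 2 1\n' > sample_info.txt

# First pass loads the sample sheet; second pass rewrites columns 5 and 6
# for matching IIDs and leaves unmatched samples (S3 here) as missing.
awk 'NR==FNR { sex[$1]=$2; pheno[$1]=$3; next }
     ($2 in sex) { $5=sex[$2]; $6=pheno[$2] }
     { print }' sample_info.txt out.fam > out.updated.fam

cat out.updated.fam
```

If this is valid, I would replace out.fam with out.updated.fam and leave out.bed and out.bim untouched. As far as I can tell, the alternative would be to leave the .fam alone and supply the same information in later commands with --pheno (and --update-sex for sex).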
I am a bit confused and haven't been able to find threads that match my questions.
Hello mfshiller!
It appears that your post has been cross-posted to another site: https://bioinformatics.stackexchange.com/questions/15385/
This is typically not recommended as it runs the risk of annoying people in both communities.