I created a .ped/.map pair for each individual from 23andMe raw text files using plink. As far as I know, the command
plink --bfile binary_fileset --recode 12 --out new_text_fileset
creates a 0/1/2 genotype matrix from the .bed, .bim and .fam files: with --recode 12, if the reference allele is A, genotype AA gives 2, AT gives 1 and TT gives 0. However, we do not know which allele is the reference for each SNP. Does plink choose the reference allele itself, or do we have to supply external reference allele data? If so, how do we integrate reference allele data into the plink .ped and .map files? The 23andMe txt file contains only the genotype pair for each SNP.
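To make the coding I have in mind concrete, here is a minimal Python sketch of the 0/1/2 count described above. It only illustrates my understanding of the encoding; it is not what plink does internally, and the function name and the idea of passing the reference allele by hand are my own assumptions.

```python
from typing import Optional


def count_reference_alleles(genotype: str, reference: str) -> Optional[int]:
    """Count copies of `reference` in a two-letter genotype such as 'AT'.

    Hypothetical helper for illustration only. Returns None when the
    genotype is not two called bases, so missing data is never scored as 0.
    """
    bases = {"A", "C", "G", "T"}
    if len(genotype) != 2 or not set(genotype) <= bases:
        return None
    return genotype.count(reference)


# With A assumed to be the reference allele:
for gt in ("AA", "AT", "TT"):
    print(gt, count_reference_alleles(gt, "A"))
```

The point of the sketch is that the result depends entirely on the `reference` argument, which is exactly the information the 23andMe file does not give me.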
Example 23andMe raw data format:
# rsid chromosome position genotype
rs4477212 1 82154 AA
rs3094315 1 752566 AA
rs3131972 1 752721 GG
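For reference, this is roughly how I read that format in Python. It is a sketch that assumes whitespace-separated columns in the order given by the header line above; the `Call` type and `parse_23andme` function are names I made up for illustration.

```python
from typing import Iterable, Iterator, NamedTuple


class Call(NamedTuple):
    rsid: str
    chromosome: str
    position: int
    genotype: str


def parse_23andme(lines: Iterable[str]) -> Iterator[Call]:
    """Yield one Call per data line, skipping blank lines and '#' comments."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        rsid, chromosome, position, genotype = line.split()
        yield Call(rsid, chromosome, int(position), genotype)


sample = """# rsid chromosome position genotype
rs4477212 1 82154 AA
rs3094315 1 752566 AA
rs3131972 1 752721 GG"""

calls = list(parse_23andme(sample.splitlines()))
for call in calls:
    print(call.rsid, call.genotype)
```

Each record carries only the observed genotype pair, so nothing in the parsed data says which allele is the reference.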
What is the usage of the --a2-allele flag? What are its input and output file formats? Could you post an example usage? I now have a .bed/.bim/.fam fileset for my 23andMe individuals; how do I pass it to plink with --a2-allele?