Dear all,
I am working on SNP genotype data from the Bovine 50K BeadChip. As you all know, the PED file format looks like this:
FAM001 1 0 0 1 2 A A G G A C
I want to have this file:
FAM001 1 0 0 1 2 A G A
FAM001 1 0 0 1 2 A G C
I mean I want each SNP to take up one column, with the two alleles of each genotype split across two rows. Is there a PLINK command that produces this file? I would appreciate it if someone could help me.
It would help if you could tell us why you need it in this format.
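In case no built-in option fits, the reshaping described above can be sketched in a few lines of Python. This is a minimal sketch, assuming a whitespace-delimited PED line with the usual six leading columns; the function name `split_ped_line` is made up for illustration. It only separates the first and second allele of each pair, which is not the same as phasing.

```python
def split_ped_line(line):
    """Split one PED line into two rows, each holding one allele per SNP."""
    fields = line.split()
    meta, alleles = fields[:6], fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("expected two alleles per SNP")
    first = alleles[0::2]   # first allele of every genotype
    second = alleles[1::2]  # second allele of every genotype
    return [" ".join(meta + first), " ".join(meta + second)]


# Example with the PED line from the post.
for row in split_ped_line("FAM001 1 0 0 1 2 A A G G A C"):
    print(row)
```

The allele order within a PED genotype carries no haplotype information, so the two rows should not be read as real chromosomes.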
Hi
I want to make an input file for the Sweep v1.1 software. Sweep accepts a standard format of genotype data, fully phased with missing data filled in. It needs two files: (1) a genotype data file and (2) a SNP data file. The genotype data file contains:
- Column 1: the individual identifier.
- Column 2: the chromosome identifier. For autosomes you should have two chromosomes per individual. We can label the two chromosomes T for transmitted and U for untransmitted (but it can be anything, e.g. A and B).
- Columns 3 to N: each column gives the allele for one SNP, in the order of its position on the chromosome. The alleles are coded as A=1, C=2, G=3, T=4.
For example:
1331-1331FF12 T 1 3 3 2
1331-1331FF12 U 1 1 1 2
1331-1331FM13 T 1 3 3 2
1331-1331FM13 U 1 3 3 4
The SNP data file has 3 tab-delimited columns, which give information about the markers you genotyped. The file contains:
- Column 1: the SNP identifier. This can be an rs number or any other name you choose to give.
- Column 2: the chromosome.
- Column 3: the SNP position based on the build identified. UMD3.1 is currently recognized.
For example:
snpid chr HG16
rs267265 3 45548733
rs267262 3 45567119
rs267241 3 45578901
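The two layouts described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the genotype file is assumed to be tab-delimited (only the SNP file is stated to be), the marker information is assumed to come from a standard four-column PLINK .map file (chromosome, SNP ID, genetic distance, base-pair position), and the build label in the header is a parameter that defaults to the `HG16` shown in the example.

```python
# Allele coding given in the post: A=1, C=2, G=3, T=4.
ALLELE_CODES = {"A": "1", "C": "2", "G": "3", "T": "4"}


def sweep_genotype_row(sample_id, chrom_label, alleles, sep="\t"):
    """Build one genotype row: ID, chromosome label, numeric alleles.

    A missing allele (e.g. "0") raises KeyError, since the format
    expects missing data to be filled in already.
    """
    codes = [ALLELE_CODES[a.upper()] for a in alleles]
    return sep.join([sample_id, chrom_label] + codes)


def map_to_sweep_snp_lines(map_lines, build_label="HG16"):
    """Turn PLINK .map lines into SNP data file lines (ID, chr, position)."""
    out = ["\t".join(["snpid", "chr", build_label])]
    for line in map_lines:
        if not line.strip():
            continue  # skip blank lines
        chrom, snp_id, _cm, bp = line.split()[:4]
        out.append("\t".join([snp_id, chrom, bp]))
    return out


print(sweep_genotype_row("1331-1331FF12", "T", ["A", "G", "G", "C"]))
for line in map_to_sweep_snp_lines(["3 rs267265 0 45548733",
                                    "3 rs267262 0 45567119"]):
    print(line)
```

The alleles passed to `sweep_genotype_row` must already be one phased haplotype; the function only recodes letters to numbers.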
Thanks for your attention.
Thanks for explaining what you need the data for, but I would caution you that your data are most likely not fully or even partially phased. Array data are not phased by haplotype, and if you require phasing you need to first apply a method that attempts to phase your genotypes by haplotype.
Hi Matt, thanks for your guidance. I am actually new to haplotype phasing. I know I can use fastPHASE or PHASE for haplotype phasing (Stephens, Smith et al. 2001). I tried it on Linux, but I don't have enough memory to reconstruct haplotypes for a whole chromosome. Also, when I reconstructed a partial segment, the output was not in Sweep input format, so I thought maybe I could use PLINK. I would appreciate it if you could help me with haplotype reconstruction.
I don't think I can help you much with the actual work, but just wanted to make sure you weren't expecting that your genotypes were already phased.
Have a look at SHAPEIT; it is "multi-threaded to tailor computational times to your resources."
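If you go the SHAPEIT route, its phased output still has to be reshaped into the two-rows-per-individual layout that Sweep expects. Below is a rough Python sketch of that step, not a tested pipeline. It assumes the .haps layout as I understand it: one line per SNP with five leading columns (chromosome, SNP ID, position, first allele, second allele), followed by two 0/1 columns per individual. The function name and the T/U labels are placeholders; SHAPEIT does not tell you which haplotype was transmitted.

```python
# Allele coding from the Sweep format description: A=1, C=2, G=3, T=4.
ALLELE_CODES = {"A": "1", "C": "2", "G": "3", "T": "4"}


def haps_to_sweep_rows(haps_lines, sample_ids, labels=("T", "U")):
    """Transpose SNP-per-line haplotype calls into one row per haplotype."""
    n = len(sample_ids)
    haplotypes = [[] for _ in range(2 * n)]
    for line in haps_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        allele0, allele1 = fields[3], fields[4]
        calls = fields[5:]
        if len(calls) != 2 * n:
            raise ValueError("expected two haplotype columns per individual")
        for h, call in enumerate(calls):
            if call not in ("0", "1"):
                raise ValueError("haplotype calls must be 0 or 1")
            base = allele1 if call == "1" else allele0
            haplotypes[h].append(ALLELE_CODES[base.upper()])
    rows = []
    for i, sample_id in enumerate(sample_ids):
        for k in (0, 1):
            rows.append("\t".join([sample_id, labels[k]] + haplotypes[2 * i + k]))
    return rows


# Toy input: two SNPs, two individuals (IDs are made up).
haps = ["3 rs267265 45548733 A G 0 1 0 0",
        "3 rs267262 45567119 G C 1 1 1 0"]
for row in haps_to_sweep_rows(haps, ["ind1", "ind2"]):
    print(row)
```

The sample IDs would normally be read from the accompanying .sample file, in the same order as the haplotype columns.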