9.9 years ago
analyticalavailable
I'm trying to extract the alleles for each of the SNPs in a dataset I have, using the following plink2 command.
./plink2 --bfile chr1 --recode --extract snp.dat --out snp --noweb
I find that instead of two single-base alleles for each subject, some alleles contain more than one base, e.g. "TC".
75 VS9 0 0 1 -9 T T
76 VS7 0 0 2 -9 TC T
77 VSS 0 0 1 -9 T T
78 VSJ 0 0 1 -9 T T
79 VSD 0 0 2 -9 T T
80 VSG 0 0 2 -9 T T
81 VRY 0 0 2 -9 T T
83 VTI 0 0 2 -9 T T
84 VS4 0 0 2 -9 T T
85 VUR 0 0 2 -9 T T
86 VUT 0 0 2 -9 T T
87 VUQ 0 0 2 -9 TC T
88 VUP 0 0 2 -9 T T
Can anyone tell me what is going on here?
Could it be an insertion?
Look at dbSNP to be sure.
Can you explain what insertion means in this case? I come from computer science and am new to this. Cheers.
In the reference population, you normally have a T at this position. But in some individuals, a C is inserted between the T and the next base, so those individuals carry one extra base at this site.
http://en.wikipedia.org/wiki/Insertion_%28genetics%29
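To see which subjects carry a multi-base allele such as "TC", the recoded .ped lines can be scanned for allele codes longer than one character. This is only a rough sketch, assuming the usual PED layout of six leading columns (family ID, individual ID, father, mother, sex, phenotype) followed by pairs of alleles; the function name find_multibase_alleles is made up for illustration and is not part of plink.

```python
def find_multibase_alleles(ped_lines):
    """Return (family_id, individual_id, variant_index, allele) for every
    allele code longer than one base. variant_index is 0-based."""
    hits = []
    for line in ped_lines:
        fields = line.split()
        if len(fields) < 8:
            continue  # skip blank or truncated lines
        fam, ind = fields[0], fields[1]
        alleles = fields[6:]  # assumed: genotypes start at column 7
        for i, allele in enumerate(alleles):
            if len(allele) > 1:
                # two allele columns per variant, so i // 2 is the variant
                hits.append((fam, ind, i // 2, allele))
    return hits


# A few lines copied from the output in the question
sample = [
    "75 VS9 0 0 1 -9 T T",
    "76 VS7 0 0 2 -9 TC T",
    "87 VUQ 0 0 2 -9 TC T",
]
for hit in find_multibase_alleles(sample):
    print(hit)
```

On these sample lines it should flag subjects VS7 and VUQ, the two carrying "TC".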
Thanks.
When performing a GWAS, how are such insertions usually treated?
I'm not sure, but I guess you could use it as a normal SNP.
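One possible reading of "use it as a normal SNP" is additive coding, where each subject is scored by the number of copies of the insertion allele (0, 1 or 2), exactly as would be done for a single-base allele. The sketch below is an assumption about what that could look like, not a description of what plink does internally; additive_dosage is a hypothetical helper, and it assumes the PED layout with six leading columns and "0" as the missing-allele code.

```python
def additive_dosage(ped_line, counted_allele, variant_index=0):
    """Count copies of counted_allele at one variant in a PED line.
    Returns 0, 1 or 2, or None if the genotype is missing."""
    fields = ped_line.split()
    start = 6 + 2 * variant_index  # assumed: six leading columns
    pair = fields[start:start + 2]
    if len(pair) < 2 or "0" in pair:
        return None  # assumed: "0" marks a missing allele
    return sum(1 for allele in pair if allele == counted_allele)


# Lines copied from the output in the question
print(additive_dosage("75 VS9 0 0 1 -9 T T", "TC"))
print(additive_dosage("76 VS7 0 0 2 -9 TC T", "TC"))
```

Here the multi-base code "TC" is treated as just another allele label, so the first subject scores 0 copies and the second scores 1.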