6.0 years ago
LZhang
I have an SNV file that I need to convert to PLINK format. Unlike a typical SNP file, it does not contain only single-base substitutions.
SNV SNP_ID Ref_Allele Alt_Allele Chrs Position S1 S2 S3 S4 ...
SNV00500 rs3805454 G A 5 170241168 G/G G/A A/A G/G
SNV00501 rs3833483 GTGTGCGTGC - 6 36076345-36076354 GTGTGCGTGC/- GTGTGCGTGC/- GTGTGCGTGC/GTGTGCGTGC GTGTGCGTGC/GTGTGCGTGC
SNV00502 rs61236346 - ATT 4 185549857-185549857 -/ATT -/- -/- -/ATT
SNV00503 rs145823343 - T 3 132420504-132420504 -/T -/- -/- -/T
SNV00504 rs8175 A G 4 114682566 A/A A/A A/A A/G
SNV00505 rs11354576 C - 7 30792057-30792057 C/- C/C C/- C/C
...
As you can see, some Position values are ranges rather than single coordinates, some Ref_Allele values are longer than one base, and some alleles are "-". How can I convert this to PLINK? Should I recode the multi-base alleles as single letters, e.g. for SNV00501 recode "GTGTGCGTGC" as "I" and "-" as "D", and then update the sample genotypes in S1-S4 ... to match? And how should I handle the Position ranges in the PLINK map file?
Thanks for help. Any suggestion would be appreciated.
Best
That doesn't look like a standard format at all. I reckon you'll have to write a custom Python script or similar. Happy hacking and good luck.
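For what it's worth, here is a minimal sketch of what such a script might look like. It is not a tested converter, and it makes several assumptions that are not stated in the question: the file is whitespace-delimited with the six leading columns shown, indels are recoded as I/D as the question proposes (the longer allele becomes "I", the shorter one "D"), the start of a Position range is used as the map coordinate, anything that is neither the Ref nor the Alt allele is written as missing ("0"), and family ID, parents, sex and phenotype are filled with placeholders. The function names `encode_allele` and `convert` are made up for illustration.

```python
def allele_length(allele):
    """Length of an allele, treating '-' as an empty (deleted) allele."""
    return 0 if allele == "-" else len(allele)


def encode_allele(allele, ref, alt):
    """Recode one allele: I/D for indels, unchanged otherwise, '0' if unknown."""
    if allele not in (ref, alt):
        return "0"  # assumption: treat unexpected alleles as missing
    if allele_length(ref) == allele_length(alt):
        return allele  # plain substitution, keep the base(s) as they are
    other = alt if allele == ref else ref
    # Assumption: the longer allele is coded 'I', the shorter one 'D'.
    return "I" if allele_length(allele) > allele_length(other) else "D"


def convert(lines):
    """Turn the SNV table (header + rows) into MAP rows and PED rows."""
    samples = lines[0].split()[6:]
    genotypes = {sample: [] for sample in samples}
    map_rows = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) < 6 + len(samples):
            continue  # skip blank or truncated lines
        _, snp_id, ref, alt, chrom, position = fields[:6]
        start = position.split("-")[0]  # assumption: use the range start
        # MAP columns: chromosome, variant ID, genetic distance, bp position
        map_rows.append(f"{chrom}\t{snp_id}\t0\t{start}")
        for sample, call in zip(samples, fields[6:]):
            first, second = call.split("/")
            genotypes[sample].append(encode_allele(first, ref, alt))
            genotypes[sample].append(encode_allele(second, ref, alt))
    # PED columns: FID, IID, father, mother, sex, phenotype, then allele pairs.
    # The sample name is reused as FID and IID; the rest are placeholders.
    ped_rows = [
        " ".join([sample, sample, "0", "0", "0", "-9"] + genotypes[sample])
        for sample in samples
    ]
    return map_rows, ped_rows


if __name__ == "__main__":
    table = [
        "SNV SNP_ID Ref_Allele Alt_Allele Chrs Position S1 S2 S3 S4",
        "SNV00500 rs3805454 G A 5 170241168 G/G G/A A/A G/G",
        "SNV00501 rs3833483 GTGTGCGTGC - 6 36076345-36076354 "
        "GTGTGCGTGC/- GTGTGCGTGC/- GTGTGCGTGC/GTGTGCGTGC GTGTGCGTGC/GTGTGCGTGC",
    ]
    map_rows, ped_rows = convert(table)
    print("\n".join(map_rows))
    print("\n".join(ped_rows))
```

If you write the two lists to files sharing a prefix (say `mydata.map` and `mydata.ped`, names chosen here only as an example), `plink --file mydata` should be able to read them. Check the result on a few indels by hand before trusting it, since the I/D recoding throws away the actual inserted or deleted sequence.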