How to convert CSV format to VCF using Python?
7 weeks ago
Mohadese • 0

I have a CSV file like the following and would like to convert it to VCF. How can I do this using Python?

Sample Name,rsID,Chr,Position,Allele1 - Plus,Allele2 - Plus,genotype
2,1:103380393,1,103380393,G,G,GG
2,1:106737318,1,106737318,T,T,TT
2,1:109439680,1,109439680,A,A,AA
2,1:110228436_CNV_GSTM1,1,110228436,-,-,--
2,1:110228505_CNV_GSTM1,1,110228505,C,C,CC
2,1:110228615_CNV_GSTM1,1,110228615,T,T,TT
2,1:110228695_CNV_GSTM1,1,110228695,G,G,GG
2,1:110229315_CNV_GSTM1,1,110229315,G,G,GG
VCF CSV Python • 355 views

You do not have enough columns to transform your CSV into VCF version 4.2. At best you have CHROM, POS, ID, REF and ALT, and even that is being generous. Your example does not specify the species (human) or the genome assembly version. Allele1 being the same as Allele2 makes no sense if one treats Allele1 as REF. So whatever you do downstream, you need to obtain REF from the proper genome assembly.
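To illustrate the point about REF, here is a minimal Python sketch, not a definitive converter. It assumes a single sample per file, that the `Allele1 - Plus` and `Allele2 - Plus` columns are reported on the plus (forward) strand, and that only single-base alleles occur. `ref_lookup(chrom, pos)` is a hypothetical callable that you would have to supply yourself (for example, backed by a FASTA of the correct assembly); it returns the reference base, or `None` if unknown. Rows with a no-call (`-`) or without a known REF are skipped.

```python
import csv
import io


def csv_to_vcf_lines(csv_text, ref_lookup):
    """Convert genotype CSV text into a list of minimal VCF 4.2 lines.

    ref_lookup(chrom, pos) is a hypothetical, user-supplied callable that
    returns the reference base at that position, or None if it is unknown.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Assumption: every row belongs to the same sample.
    sample = rows[0]["Sample Name"] if rows else "SAMPLE"
    lines = [
        "##fileformat=VCFv4.2",
        '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
        "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\t" + sample,
    ]
    for row in rows:
        chrom, pos = row["Chr"], int(row["Position"])
        calls = [row["Allele1 - Plus"], row["Allele2 - Plus"]]
        ref = ref_lookup(chrom, pos)
        # Skip no-calls and positions whose reference base is unknown.
        if ref is None or "-" in calls:
            continue
        # Every called allele that differs from REF becomes an ALT allele.
        alts = []
        for allele in calls:
            if allele != ref and allele not in alts:
                alts.append(allele)
        alleles = [ref] + alts
        # GT indices: 0 is REF, 1.. are the ALT alleles in order.
        gt = "/".join(str(alleles.index(allele)) for allele in calls)
        lines.append("\t".join([
            chrom, str(pos), row["rsID"], ref,
            ",".join(alts) or ".",  # "." when the sample is homozygous REF
            ".", ".", ".", "GT", gt,
        ]))
    return lines
```

Writing the result is then just `"\n".join(lines)` to a file. The header is deliberately minimal; a real VCF should also carry `##contig` and `##reference` lines for the assembly used, and records should be sorted by chromosome and position.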


I also have a file containing the following information:

Index,Name,chromosome,position,GenTrain Score,SNP,ILMN Strand,Customer Strand,NormID,rsID
1,1:103380393,1,103380393,0.8921,[T/C],BOT,TOP,31,
2,1:106737318,1,106737318,0.8081,[A/C],TOP,BOT,31,
3,1:109439680,1,109439680,0.8745,[A/G],TOP,TOP,2,
4,1:110228436_CNV_GSTM1,1,110228436,0.5802,[A/G],TOP,BOT,8,
5,1:110228505_CNV_GSTM1,1,110228505,0.5154,[A/C],TOP,TOP,8,
6,1:110228615_CNV_GSTM1,1,110228615,0.8658,[T/C],BOT,BOT,8,
7,1:110228695_CNV_GSTM1,1,110228695,0.6796,[T/G],BOT,BOT,8,
8,1:110229315_CNV_GSTM1,1,110229315,0.7814,[T/G],BOT,BOT,8,
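For reference, the `SNP` column of a file like this can be parsed per marker as sketched below. This is only an illustration: `parse_manifest` is a hypothetical helper name, and the alleles in such manifests are, to my understanding, given relative to the ILMN strand rather than the plus strand, so they should not be taken as REF/ALT without checking against the genome assembly.

```python
import csv
import io
import re

# Matches values such as "[T/C]" and captures the two alleles.
SNP_PATTERN = re.compile(r"^\[([^/\]]+)/([^/\]]+)\]$")


def parse_manifest(manifest_text):
    """Map each marker Name to the two alleles listed in its SNP column."""
    alleles_by_name = {}
    for row in csv.DictReader(io.StringIO(manifest_text)):
        match = SNP_PATTERN.match(row["SNP"].strip())
        if match:  # rows with an unexpected SNP format are ignored
            alleles_by_name[row["Name"]] = match.groups()
    return alleles_by_name
```

The `Name` values match the `rsID` column of the genotype CSV above, so the resulting dictionary can be joined to the genotype rows on that key.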

This looks like output from the Infinium Global Diversity Array. Your best option is to contact Illumina customer support. Also check whether GenomeStudio can output a meaningful, well-formatted VCF.
