22 months ago
Eliza
Hi, I have a CSV file that contains the columns CHROM, POS, REF, and ALT.
I want to convert this CSV file to a VCF file to upload to CADD (https://cadd.gs.washington.edu/score) and Ensembl VEP (https://www.ensembl.org/Tools/VEP) to get SNP annotations. This is the Python code (run in Spyder) that I use for the conversion:
import csv
import gzip

# Open the CSV file
with open('C:/Users/agns1/Downloads/genetics/data/df_vcf.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    # Skip the header row
    next(reader)
    # Open the VCF file for writing
    with open('data.vcf', 'w') as vcffile:
        # Write the VCF file header
        vcffile.write('##fileformat=VCFv4.2\n')
        vcffile.write('#CHROM\tPOS\tREF\tALT\n')
        # Iterate through the CSV rows
        for row in reader:
            # Write the VCF data
            vcffile.write(row[0] + '\t' + row[1] + '\t' + row[2] + '\t' + row[3] + '\n')
The file that I get is named data.vcf and looks like this:
But Ensembl and CADD both say that the format of the file is not correct, and I don't understand why, since it is a VCF file, or how to fix the problem. Thank you :)
Can I add this column and just leave it empty? The CADD website says: "It is sufficient to provide the first 5 columns of a VCF file without header, as all other information than CHROM, POS, REF, ALT will be ignored anyway."
So the ID column is missing...
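A minimal sketch of one way to add the missing column, assuming the CSV columns are in the order CHROM, POS, REF, ALT. It puts ID in third position and fills it with ".", the VCF placeholder for a missing value. Writing the remaining fixed columns (QUAL, FILTER, INFO) as "." is an extra assumption so that each line has all eight fixed VCF fields; per the CADD note quoted above, only the first five should matter there. The function name csv_to_vcf and the demo data are hypothetical.

```python
import csv
import io

# The eight fixed VCF columns, in the order the format expects.
VCF_COLUMNS = ["#CHROM", "POS", "ID", "REF", "ALT", "QUAL", "FILTER", "INFO"]


def csv_to_vcf(csv_handle, vcf_handle):
    """Write CHROM, POS, REF, ALT rows from a CSV as VCF lines.

    Fields that the CSV does not provide (ID, QUAL, FILTER, INFO)
    are written as "." (missing value).
    """
    reader = csv.reader(csv_handle)
    next(reader)  # skip the CSV header row
    vcf_handle.write("##fileformat=VCFv4.2\n")
    vcf_handle.write("\t".join(VCF_COLUMNS) + "\n")
    for row in reader:
        if not row:
            continue  # ignore blank lines
        chrom, pos, ref, alt = (field.strip() for field in row[:4])
        fields = [chrom, pos, ".", ref, alt, ".", ".", "."]
        vcf_handle.write("\t".join(fields) + "\n")


# Small in-memory demo with made-up variants.
demo_csv = io.StringIO("CHROM,POS,REF,ALT\n1,12345,A,G\n2,500,C,T\n")
demo_vcf = io.StringIO()
csv_to_vcf(demo_csv, demo_vcf)
print(demo_vcf.getvalue())
```

With real files, the two handles would come from open(..., newline='') for the CSV and open('data.vcf', 'w') for the output.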
I got this error from Ensembl when I loaded the corrected file: "exiting the program. The input file appears to be unsorted. Please sort by chromosome and by location and re-submit." Should I sort the data by ascending CHROM and, within each chromosome, by ascending POS?
Hi Eliza - yes, that is correct.
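A short sketch of that sort, assuming each record is a (CHROM, POS, REF, ALT) tuple of strings. The chromosome ordering used here (numeric chromosomes first in numeric order, then named ones such as X, Y, MT alphabetically, with an optional "chr" prefix ignored) is an assumption rather than something Ensembl documents in the error message. The helper names chrom_sort_key and sort_variants are hypothetical.

```python
def chrom_sort_key(chrom):
    """Return a key that orders 1, 2, ..., 22 numerically, then named chromosomes."""
    name = chrom[3:] if chrom.lower().startswith("chr") else chrom
    if name.isdigit():
        return (0, int(name), "")
    return (1, 0, name)


def sort_variants(rows):
    """Sort records by chromosome, then by ascending integer position."""
    return sorted(rows, key=lambda r: (chrom_sort_key(r[0]), int(r[1])))


# Made-up records to show the resulting order.
rows = [
    ("X", "5", "A", "T"),
    ("2", "300", "C", "G"),
    ("10", "50", "G", "A"),
    ("2", "20", "T", "C"),
]
for record in sort_variants(rows):
    print("\t".join(record))
```

Converting POS to int matters: sorting the positions as strings would place "300" before "50".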