Entering edit mode
5.9 years ago
tinyskinn
•
0
Hi everyone,
If I have a vcf file with header like this:
##fileformat=VCFv4.0
##fileDate=20090805
##source=myImputationProgramV3.1
##reference=1000GenomesPilot-NCBI36
##phasing=partial
##INFO=<ID=NS,Number=1,Type=Integer,Description="Number of Samples With Data">
##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
##INFO=<ID=AF,Number=.,Type=Float,Description="Allele Frequency">
##INFO=<ID=AA,Number=1,Type=String,Description="Ancestral Allele">
##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP membership, build 129">
##INFO=<ID=H2,Number=0,Type=Flag,Description="HapMap2 membership">
##FILTER=<ID=q10,Description="Quality below 10">
##FILTER=<ID=s50,Description="Less than 50% of samples have data">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Read Depth">
##FORMAT=<ID=HQ,Number=2,Type=Integer,Description="Haplotype Quality">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NA0000
How can I extract (with bcftools) and save/print these key-value pairs to a csv csv file with python. My expected output would look something like:
fileformat,VCFv4.0
fileDate,20090805
etc...
Thanks for the anticipated response
Hello tinyskinn ,
what is your final goal? When working with python on vcf files, you should take one of the existing modules like pysam or cyvcf2.
fin swimmer
Hello Fin, my final goal is to be able to have an annotated metadata as in enter link description here. And this would require me to automate the extraction of the header Key-Value info. I am trying to implement this in enter link description here. And thank you for the suggestions.
I don't understand how your expected output is comma separated for the first line, but separated by a
=
for the second line.Also, what have you tried to solve this issue? Why do you want to use bcftools and python, exactly? Mentioning these suggest that you at least have an idea which tools/language can be used.
regarding your first question, I have corrected the error. Thanks! to your second question, I actually do not have to use bcftools or python. From my search, they were the ones that came up. And my goal is explained my above answer to @finswimmer