Hi,
I have done variant calling on my file using samtools mpileup and converted the output from BCF to VCF.
It looks like this:
##fileformat=VCFv4.1
##samtoolsVersion=0.1.18 (r982:295)
##INFO=<ID=DP,Number=1,Type=Integer,Description="Raw read depth">
##INFO=<ID=DP4,Number=4,Type=Integer,Description="# high-quality ref-forward bases, ref-reverse, alt-forward and alt-reverse bases">
##INFO=<ID=MQ,Number=1,Type=Integer,Description="Root-mean-square mapping quality of covering reads">
##INFO=<ID=FQ,Number=1,Type=Float,Description="Phred probability of all samples being the same">
##INFO=<ID=AF1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele frequency (assuming HWE)">
##INFO=<ID=AC1,Number=1,Type=Float,Description="Max-likelihood estimate of the first ALT allele count (no HWE assumption)">
##INFO=<ID=G3,Number=3,Type=Float,Description="ML estimate of genotype frequencies">
##INFO=<ID=HWE,Number=1,Type=Float,Description="Chi^2 based HWE test P-value based on G3">
##INFO=<ID=CLR,Number=1,Type=Integer,Description="Log ratio of genotype likelihoods with and without the constraint">
##INFO=<ID=UGT,Number=1,Type=String,Description="The most probable unconstrained genotype configuration in the trio">
##INFO=<ID=CGT,Number=1,Type=String,Description="The most probable constrained genotype configuration in the trio">
##INFO=<ID=PV4,Number=4,Type=Float,Description="P-values for strand bias, baseQ bias, mapQ bias and tail distance bias">
##INFO=<ID=INDEL,Number=0,Type=Flag,Description="Indicates that the variant is an INDEL.">
##INFO=<ID=PC2,Number=2,Type=Integer,Description="Phred probability of the nonRef allele frequency in group1 samples being larger (,smaller) than in group2.">
##INFO=<ID=PCHI2,Number=1,Type=Float,Description="Posterior weighted chi^2 P-value for testing the association between group1 and group2 samples.">
##INFO=<ID=QCHI2,Number=1,Type=Integer,Description="Phred scaled PCHI2.">
##INFO=<ID=PR,Number=1,Type=Integer,Description="# permutations yielding a smaller PCHI2.">
##INFO=<ID=VDB,Number=1,Type=Float,Description="Variant Distance Bias">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##FORMAT=<ID=GQ,Number=1,Type=Integer,Description="Genotype Quality">
##FORMAT=<ID=GL,Number=3,Type=Float,Description="Likelihoods for RR,RA,AA genotypes (R=ref,A=alt)">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="# high-quality bases">
##FORMAT=<ID=SP,Number=1,Type=Integer,Description="Phred-scaled strand bias P-value">
##FORMAT=<ID=PL,Number=G,Type=Integer,Description="List of Phred-scaled genotype likelihoods">
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT /home/CleanData/Filtered_28_CTRL.sorted.bam
chr1 870903 . T C 7.8 . DP=1;AF1=1;AC1=2;DP4=0,0,1,0;MQ=37;FQ=-30 GT:PL:DP:GQ 1/1:37,3,0:1:4
chr1 886006 . T C 7.8 . DP=1;AF1=1;AC1=2;DP4=0,0,1,0;MQ=37;FQ=-30 GT:PL:DP:GQ 1/1:37,3,0:1:4
chr1 893280 . G A 7.8 . DP=1;AF1=1;AC1=2;DP4=0,0,1,0;MQ=37;FQ=-30 GT:PL:DP:GQ 1/1:37,3,0:1:4
chr1 981087 . A G 7.8 . DP=1;AF1=1;AC1=2;DP4=0,0,0,1;MQ=37;FQ=-30 GT:PL:DP:GQ 1/1:37,3,0:1:4
chr1 982462 . T C 7.8 . DP=1;AF1=1;AC1=2;DP4=0,0,1,0;MQ=37;FQ=-30 GT:PL:DP:GQ 1/1:37,3,0:1:4
chr1 982513 . T C 7.8 . DP=1;AF1=1;AC1=2;DP4=0,0,1,0;MQ=37;FQ=-30 GT:PL:DP:GQ 1/1:37,3,0:1:4
chr1 1162326 . A G 13.9 . DP=2;VDB=0.0340;AF1=1;AC1=2;DP4=0,0,1,1;MQ=37;FQ=-33 GT:PL:DP:GQ 1/1:45,6,0:2:10
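For anyone reading the records above, here is a minimal Python sketch of how one data line breaks down into its columns, INFO key/value pairs, and per-sample fields. It is only an illustration: the function name `parse_vcf_line` is hypothetical (not part of samtools), and it assumes a single-sample file with whitespace-separated columns as pasted here.

```python
def parse_vcf_line(line):
    """Split one single-sample VCF data line into a dictionary (illustrative only)."""
    # Assumption: columns are separated by whitespace, as in the pasted output.
    chrom, pos, vid, ref, alt, qual, flt, info, fmt, sample = line.split()[:10]

    # INFO is a ';'-separated list of KEY=VALUE pairs; flags such as INDEL have no value.
    info_dict = {}
    for item in info.split(";"):
        key, _, value = item.partition("=")
        info_dict[key] = value if value else True

    # FORMAT names the ':'-separated fields found in the sample column.
    sample_dict = dict(zip(fmt.split(":"), sample.split(":")))

    return {
        "chrom": chrom,
        "pos": int(pos),
        "ref": ref,
        "alt": alt,
        "qual": float(qual),
        "info": info_dict,
        "sample": sample_dict,
    }


record = parse_vcf_line(
    "chr1 1162326 . A G 13.9 . "
    "DP=2;VDB=0.0340;AF1=1;AC1=2;DP4=0,0,1,1;MQ=37;FQ=-33 "
    "GT:PL:DP:GQ 1/1:45,6,0:2:10"
)
print(record["chrom"], record["pos"], record["info"]["DP"], record["sample"]["GT"])
```

The INFO values are kept as strings here, so a field like DP would need converting with `int()` before any numeric comparison.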
My question is: how can I annotate this file to learn more about my SNPs, and then carry out further analysis, for example with SIFT or PolyPhen?
Any guidance is welcome. Thank you for your time.