Hi everyone,
I have received separate VCF files for multiple samples. I merged the samples into a single VCF file per chromosome and then sorted each merged file using:
bcftools merge -l ID_list.txt -r "$chrom" -O z -o "${chrom}_merged.vcf.gz"
bcftools sort "${chrom}_merged.vcf.gz" -O z -o "${output}/${chrom}_sorted.vcf.gz"
I've pasted a subset of the merged file (before filtering) for two samples below:
chr30 3962 . AC A . PASS . GT:RC:AC:GP:DS 0/0:0:0:1,1e-10,1e-10:3e-10 0/0:0:0:1,1e-10,1e-10:3e-10
chr30 4015 . TG T . LOWCONF . GT:RC:AC:GP:DS 0/0:0:0:1,1e-10,1e-10:3e-10 0/0:0:0:1,1e-10,1e-10:3e-10
chr30 4026 . T G . PASS . GT:RC:AC:GP:DS 0/0:10:0:1,1e-10,1e-10:3e-10 0/0:2:0:1,1e-10,1e-10:3e-10
chr30 4034 . T G . PASS . GT:RC:AC:GP:DS 0/0:9:0:1,1e-10,1e-10:3e-10 0/0:3:0:1,1e-10,1e-10:3e-10
chr30 4070 . G A . PASS . GT:RC:AC:GP:DS 0/0:11:0:1,1e-10,1e-10:3e-10 0/0:3:0:1,1e-10,1e-10:3e-10
chr30 4081 . A C . PASS . GT:RC:AC:GP:DS 0/0:8:0:1,1e-10,1e-10:3e-10 0/0:3:0:1,1e-10,1e-10:3e-10
I understand that RC and AC are the allele counts for the REF and ALT alleles, respectively.
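To be explicit about which values I mean, this is how I am reading the sample columns: each colon-separated value is paired with the FORMAT key in the same position. A minimal Python sketch (parse_sample is just a helper name I made up for illustration, not part of any library):

```python
def parse_sample(format_field, sample_field):
    """Pair each FORMAT key with the value at the same position in one sample column."""
    keys = format_field.split(":")
    values = sample_field.split(":")
    return dict(zip(keys, values))


# FORMAT column and first sample column of the chr30:4070 record above
fields = parse_sample("GT:RC:AC:GP:DS", "0/0:11:0:1,1e-10,1e-10:3e-10")
print(fields["GT"], fields["RC"], fields["AC"])  # prints: 0/0 11 0
```

So for that record I read the first sample as GT = 0/0, RC = 11 and AC = 0.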
My questions are:
1) How do I interpret the RC/AC fields for a single individual? For example, what does it mean for one sample to have an RC of 11?
2) Is this an appropriate way to merge these files?
3) Is it possible to calculate the minor allele frequency (MAF) for each SNP from this file format?
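For question 3, what I have in mind is counting alleles in the GT field across samples, roughly as in the Python sketch below. It assumes biallelic sites and hard-called genotypes, and maf_from_genotypes is a hypothetical helper name of mine. I'm not sure whether GT is the right field to use here, given that the file also carries GP and DS.

```python
def maf_from_genotypes(genotypes):
    """Return the minor allele frequency for one biallelic site.

    genotypes: list of GT strings such as "0/0", "0/1", "1|1" or "./.".
    Missing alleles (".") are skipped. Returns None if no alleles were called.
    """
    ref = alt = 0
    for gt in genotypes:
        # Treat phased ("|") and unphased ("/") genotypes the same way
        for allele in gt.replace("|", "/").split("/"):
            if allele == "0":
                ref += 1
            elif allele == "1":
                alt += 1
            # "." (missing) and any other allele index are ignored in this sketch
    total = ref + alt
    if total == 0:
        return None
    alt_freq = alt / total
    # The minor allele is whichever of REF/ALT is less frequent
    return min(alt_freq, 1 - alt_freq)


# Both samples are 0/0 at every site in my subset, so the MAF would be 0 there
print(maf_from_genotypes(["0/0", "0/0"]))
# Made-up genotypes with one heterozygote among four samples
print(maf_from_genotypes(["0/1", "0/0", "0/0", "0/0"]))
```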
Any help would be much appreciated!