Hello everyone, I use VCFtools to calculate allele frequency (AF) values with this command:
vcftools --gzvcf $SUBSET_VCF --freq2 --out $OUT --max-alleles 2
The output I got from this is:
    chr      pos nalleles  nchr    a1     a2
  <dbl>    <dbl>    <dbl> <dbl> <dbl>  <dbl>
1    22 16050408        2   846 0.944 0.0556
2    22 16050612        2   846 0.937 0.0626
3    22 16050678        2   846 0.948 0.0520
4    22 16050984        2   846 1     0
5    22 16051107        2   846 0.943 0.0567
6    22 16051249        2   846 0.927 0.0733
I know that to find the minor allele frequency (MAF) I can use this command:
# find minor allele frequency
var_freq$maf <- var_freq %>% select(a1, a2) %>% apply(1, function(z) min(z))
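The same MAF column can also be computed without apply(). This is a minimal base R sketch, assuming var_freq has the a1 and a2 columns shown above; the data frame here is rebuilt from those six rows purely for illustration.

```r
# Illustrative data: the a1/a2 values from the six rows shown above
var_freq <- data.frame(
  a1 = c(0.944, 0.937, 0.948, 1, 0.943, 0.927),
  a2 = c(0.0556, 0.0626, 0.0520, 0, 0.0567, 0.0733)
)

# Row-wise minimum of the two allele frequencies, vectorised with pmin()
var_freq$maf <- pmin(var_freq$a1, var_freq$a2)

# Cross-check against the apply()-based approach from the command above
maf_apply <- unname(apply(var_freq[, c("a1", "a2")], 1, min))
stopifnot(isTRUE(all.equal(var_freq$maf, maf_apply)))
```

pmin() works on whole columns at once, so it tends to be faster than apply() on a file with many sites.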
But I need to plot the total allele frequency, not the MAF. Should I consider both the a1 and a2 values for plotting?
"Total allele frequency" will always be 1. Do you want to plot the frequency of just the ALT allele, just the minor allele, or of all alleles?
I have plotted one for the minor allele. My supervisor told me to plot the allele frequency distribution for the VCF files. He didn't mention anything more specific.
Your question should be addressed to your supervisor, then - it does not belong on a public forum.
I will contact my supervisor regarding this. Suppose I want to plot ALT allele frequency, how should I do that?
Understand which column contains the data you need and how you need to plot it, then build the code that generates that plot. You will need to do a whole bunch of Google searches.
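As a starting point, here is a rough base R sketch of those steps. It rests on two assumptions that should be checked against your own file: that vcftools --freq2 writes the frequencies in VCF allele order, so a1 is the REF frequency and a2 is the ALT frequency, and that the output has a single header line. The temporary file below is a hypothetical stand-in for your $OUT.frq file, filled with the six rows from the question.

```r
# Hypothetical stand-in for the vcftools output ($OUT.frq): the six rows from
# the question, tab-separated, under an assumed header line.
frq_file <- tempfile(fileext = ".frq")
writeLines(c(
  "CHROM\tPOS\tN_ALLELES\tN_CHR\t{FREQ}",
  "22\t16050408\t2\t846\t0.944\t0.0556",
  "22\t16050612\t2\t846\t0.937\t0.0626",
  "22\t16050678\t2\t846\t0.948\t0.0520",
  "22\t16050984\t2\t846\t1\t0",
  "22\t16051107\t2\t846\t0.943\t0.0567",
  "22\t16051249\t2\t846\t0.927\t0.0733"
), frq_file)

# Skip the header (it names one frequency column, the rows carry two) and
# supply the column names used in the question.
var_freq <- read.table(
  frq_file, skip = 1, sep = "\t",
  col.names = c("chr", "pos", "nalleles", "nchr", "a1", "a2")
)

# Assumption: a2 holds the ALT allele frequency.
alt_af <- var_freq$a2

# Distribution of ALT allele frequencies across sites, in bins of width 0.05
hist(alt_af,
     breaks = seq(0, 1, by = 0.05),
     main = "ALT allele frequency distribution",
     xlab = "ALT allele frequency",
     ylab = "Number of variants")
```

To plot the REF or minor allele instead, swap var_freq$a2 for var_freq$a1 or for the maf column; the hist() call stays the same.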