Allele frequency calculation
3.4 years ago
rheab1230 ▴ 140

Hello everyone, I use VCFtools to compute allele frequency (AF) values with this command:

vcftools --gzvcf $SUBSET_VCF --freq2 --out $OUT --max-alleles 2

The output I got from this is:

    chr      pos nalleles  nchr    a1     a2
  <dbl>    <dbl>    <dbl> <dbl> <dbl>  <dbl>
1    22 16050408        2   846 0.944 0.0556
2    22 16050612        2   846 0.937 0.0626
3    22 16050678        2   846 0.948 0.0520
4    22 16050984        2   846 1     0
5    22 16051107        2   846 0.943 0.0567
6    22 16051249        2   846 0.927 0.0733

To find the minor allele frequency (MAF), I use this command:

var_freq$maf <- var_freq %>% select(a1, a2) %>% apply(1, function(z) min(z))

But I need to plot the total allele frequency, not the MAF. Should I consider both the a1 and a2 values for plotting?


"Total allele frequency" will always be 1. Do you want to plot the frequency of just the ALT allele, just the minor allele, or of all alleles?
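
To make the distinction concrete, here is a minimal Python sketch using the six rows from the question. It assumes that a1 is the REF allele frequency and a2 is the ALT allele frequency (that is the order I would expect from vcftools, but check it against your own file). The variable names are made up for illustration.

```python
# Rows copied from the table in the question: (chr, pos, a1, a2).
# Assumption: a1 = REF allele frequency, a2 = ALT allele frequency.
rows = [
    (22, 16050408, 0.944, 0.0556),
    (22, 16050612, 0.937, 0.0626),
    (22, 16050678, 0.948, 0.0520),
    (22, 16050984, 1.0, 0.0),
    (22, 16051107, 0.943, 0.0567),
    (22, 16051249, 0.927, 0.0733),
]

# "Total" frequency per site: a1 + a2, which is ~1 at every biallelic site
# (small deviations here come from the rounded values shown in the table).
totals = [a1 + a2 for _, _, a1, a2 in rows]

# ALT allele frequency per site, under the assumption above.
alt_freqs = [a2 for _, _, _, a2 in rows]

# Minor allele frequency per site: the smaller of the two frequencies.
mafs = [min(a1, a2) for _, _, a1, a2 in rows]

for (chrom, pos, _, _), total, alt, maf in zip(rows, totals, alt_freqs, mafs):
    print(chrom, pos, round(total, 3), alt, maf)
```

In these six rows the ALT allele is also the minor allele at every site, so the two lists happen to be identical. They would differ at any site where the ALT frequency exceeds 0.5.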


I have already plotted the minor allele frequency. My supervisor told me to plot the allele frequency distribution for the VCF files. He didn't mention anything specific.


Your question should be addressed to your supervisor, then - it does not belong on a public forum.


I will contact my supervisor regarding this. Suppose I want to plot ALT allele frequency, how should I do that?


Understand which column contains the data you need, how you need to plot it and then build the code that generates that plot. You will need to do a whole bunch of google searches.
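
As one possible starting point, the sketch below reads a `--freq2` style table, takes the last column as the ALT allele frequency, and bins the values for a histogram. Several things here are assumptions rather than facts from this thread: the header line and tab-separated layout of the sample text, the ALT frequency being the sixth field, and the helper names `read_alt_freqs`, `bin_frequencies` and `plot_alt_frequency`, which are hypothetical.

```python
import io


def read_alt_freqs(lines):
    """Return the sixth field (assumed ALT frequency) of each data row."""
    alt = []
    for i, line in enumerate(lines):
        fields = line.split()
        if i == 0 or len(fields) < 6:
            continue  # skip the header line and blank or short lines
        alt.append(float(fields[5]))
    return alt


def bin_frequencies(freqs, n_bins=20):
    """Count frequencies in n_bins equal-width bins covering [0, 1]."""
    counts = [0] * n_bins
    for f in freqs:
        # A frequency of exactly 1.0 is placed in the last bin.
        idx = min(int(f * n_bins), n_bins - 1)
        counts[idx] += 1
    return counts


def plot_alt_frequency(freqs, n_bins=20, out_png="alt_af_hist.png"):
    """Draw a histogram of ALT allele frequencies (requires matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.hist(freqs, bins=n_bins, range=(0, 1))
    ax.set_xlabel("ALT allele frequency")
    ax.set_ylabel("Number of sites")
    fig.tight_layout()
    fig.savefig(out_png)


# Illustrative input built from the rounded values shown in the question.
sample = io.StringIO(
    "CHROM\tPOS\tN_ALLELES\tN_CHR\t{FREQ}\n"
    "22\t16050408\t2\t846\t0.944\t0.0556\n"
    "22\t16050612\t2\t846\t0.937\t0.0626\n"
    "22\t16050678\t2\t846\t0.948\t0.0520\n"
    "22\t16050984\t2\t846\t1\t0\n"
    "22\t16051107\t2\t846\t0.943\t0.0567\n"
    "22\t16051249\t2\t846\t0.927\t0.0733\n"
)

alt_freqs = read_alt_freqs(sample)
counts = bin_frequencies(alt_freqs, n_bins=20)
print(counts)

# plot_alt_frequency(alt_freqs)  # uncomment to write the histogram to a PNG
```

For a real run you would pass an open file handle for your own output file to `read_alt_freqs` instead of the in-memory sample.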

3.4 years ago
sbstevenlee ▴ 480

If you don't mind using a Python API, have a look at the pyvcf.VcfFrame.plot_region method I wrote.

Below is a simple example:

from fuc import pyvcf, common
import matplotlib.pyplot as plt
common.load_dataset('pyvcf')
vcf_file = '~/fuc-data/pyvcf/getrm-cyp2d6-vdr.vcf'
vf = pyvcf.VcfFrame.from_file(vcf_file)
vf.plot_region('NA18973')
plt.tight_layout()


We can display the allele fraction of REF and ALT instead of read depth (DP):

ax = vf.plot_region('NA18973', k='#AD_FRAC_REF', label='REF')
vf.plot_region('NA18973', k='#AD_FRAC_ALT', label='ALT', ax=ax)
plt.tight_layout()


Note that the method is only available in the 0.21.0-dev version of the fuc package. If you need help installing this development version, please let me know in the comments.
