Hello everyone, I want to plot the AF distribution for all my VCF files. Is there any software to do that? I tried using vcfstats, but it shows an error on my VCF files.
Here is a Python API solution using the pyvcf.VcfFrame.plot_hist method I wrote.
Below is a simple example:
from fuc import common, pyvcf
common.load_dataset('pyvcf')
vcf_file = '~/fuc-data/pyvcf/normal-tumor.vcf'
vf = pyvcf.VcfFrame.from_file(vcf_file)
vf.plot_hist('DP')
We can draw multiple histograms with hue mapping:
annot_file = '~/fuc-data/pyvcf/normal-tumor-annot.tsv'
af = common.AnnFrame.from_file(annot_file, sample_col='Sample')
vf.plot_hist('DP', af=af, group_col='Tissue')
We can show AF instead of DP:
vf.plot_hist('AF')
Hello, I am using the script below to plot the AF distribution for my VCF files, but I am getting an error. The script:
#!/bin/python
from fuc import common, pyvcf
common.load_dataset('pyvcf')
vcf_file = 'GEUVADIS.chr22.genotype.vcf'
vf = pyvcf.VcfFrame.from_file(vcf_file)
vf.plot_hist('AF')
the error:
file.python:5: DtypeWarning: Columns (5) have mixed types.Specify dtype option on import or set low_memory=False.
vf = pyvcf.VcfFrame.from_file(vcf_file)
Traceback (most recent call last):
File "file.python", line 6, in <module>
vf.plot_hist('AF')
File "/home/anaconda3/envs/var/lib/python3.7/site-packages/fuc/api/pyvcf.py", line 1991, in plot_hist
df = self.extract(k, as_nan=True, func=d[k])
File "/home/anaconda3/envs/var/lib/python3.7/site-packages/fuc/api/pyvcf.py", line 3912, in extract
df = self.df.apply(one_row, axis=1)
File "/home/anaconda3/envs/var/lib/python3.7/site-packages/pandas/core/frame.py", line 8736, in apply
return op.apply()
File "/home/anaconda3/envs/var/lib/python3.7/site-packages/pandas/core/apply.py", line 688, in apply
return self.apply_standard()
File "/home/anaconda3/envs/var/lib/python3.7/site-packages/pandas/core/apply.py", line 805, in apply_standard
results, res_index = self.apply_series_generator()
File "/home/anaconda3/envs/var/lib/python3.7/site-packages/pandas/core/apply.py", line 821, in apply_series_generator
results[i] = self.f(v)
File "/home//anaconda3/envs/var/lib/python3.7/site-packages/fuc/api/pyvcf.py", line 3901, in one_row
i = r.FORMAT.split(':').index(k)
ValueError: 'AF' is not in list
I responded to your GitHub issue here: https://github.com/sbslee/fuc/issues/39
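For anyone else hitting the same ValueError: the traceback above shows plot_hist looking up the key in each record's FORMAT column (`r.FORMAT.split(':').index(k)`), so the call fails when the file has no per-sample AF field. Here is a minimal sketch, independent of fuc, for checking which FORMAT keys a VCF actually carries before plotting. The function name `format_keys` is made up for illustration, and the sketch assumes a plain tab-delimited VCF whose ninth column is FORMAT.

```python
def format_keys(lines):
    """Collect every FORMAT key used by the records in an iterable of VCF lines."""
    keys = set()
    for line in lines:
        # Skip meta-information lines (##...) and the column header (#CHROM...).
        if line.startswith('#'):
            continue
        fields = line.rstrip('\n').split('\t')
        # Column 9 (index 8) is FORMAT; sites-only VCFs do not have it.
        if len(fields) > 8:
            keys.update(fields[8].split(':'))
    return keys


# Small in-memory example with two records and one sample.
example = [
    '##fileformat=VCFv4.2\n',
    '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n',
    'chr22\t100\t.\tA\tG\t.\tPASS\tAF=0.25\tGT\t0|1\n',
    'chr22\t200\t.\tC\tT\t.\tPASS\tAF=0.10\tGT:DP\t0|0:12\n',
]
print(sorted(format_keys(example)))
```

To check a real file, pass an open handle, e.g. `with open('GEUVADIS.chr22.genotype.vcf') as f: print(format_keys(f))`. If 'AF' is missing from the result, the frequency is not stored per sample (it may be in the INFO column instead), and `vf.plot_hist('AF')` will raise the error shown above.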
I want to plot AF distribution for all my vcf files. Is there any software to do that.
bcftools stats --af-bins 0.1,0.2,0.3,0.5,1.0 input.vcf
followed by
plot-vcfstats
http://samtools.github.io/bcftools/bcftools.html#plot-vcfstats
Take a look at this: plot-VCF
It allows you to graphically visualize any flag (and much more) from your VCF file
I got AF from vcf files using vcftools software. Now I am trying to plot these values using the command:
The thing is that in my case there are two frequency values at a single position.
Hi, what is the output of str(var_qual)? I would use bcftools query to output the AF values, and then import these into R, where I would plot them via plot(), density(), and/or hist().
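If you would rather stay in Python than R, the same idea can be sketched as below. It assumes the values were written with something like `bcftools query -f '%INFO/AF\n' input.vcf > af.txt`, which in turn assumes AF is stored in the INFO column. A position with two frequency values (a multiallelic site) comes out as a comma-separated pair such as `0.1,0.3`, so the parser splits on commas and keeps every value. The function names and the file name `af.txt` are made up for illustration, and the plotting part needs matplotlib installed.

```python
def parse_af_values(lines):
    """Turn lines of bcftools query output into a flat list of AF floats."""
    values = []
    for line in lines:
        # Multiallelic sites carry one AF per ALT allele, comma-separated.
        for token in line.strip().split(','):
            # '.' marks a missing value; blank lines give an empty token.
            if token and token != '.':
                values.append(float(token))
    return values


def plot_af_hist(values, out_png='af_hist.png', bins=20):
    """Save a histogram of AF values to a PNG file (requires matplotlib)."""
    import matplotlib
    matplotlib.use('Agg')  # render to file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.hist(values, bins=bins, range=(0, 1))
    ax.set_xlabel('Allele frequency (AF)')
    ax.set_ylabel('Number of alleles')
    fig.savefig(out_png)
    plt.close(fig)


# Small in-memory example: one biallelic site, one multiallelic, one missing.
print(parse_af_values(['0.25\n', '0.1,0.3\n', '.\n']))
```

For a real file: `with open('af.txt') as f: plot_af_hist(parse_af_values(f))`. Note that this counts each ALT allele separately; if you want one value per position instead, take only the first token or the maximum.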
This is the output of str(var_qual)