minor allele frequency (MAF) from vcf
8.7 years ago
panbar ▴ 20

How is minor allele frequency (MAF) calculated from the DP4 field of a VCF file? Can anyone help with a Unix shell script?

SNP sequencing next-gen genome • 7.7k views

It's a problematic situation.


Testing the leveling comment.


That's cool, man! It's working fine.

8.5 years ago
panbar ▴ 20

It is easy to do with awk. I found a simple way, as follows. For a single input SNV VCF file:

awk '/^##/ {print; next} $1=="#CHROM" {print $0 "\tMAF"; next} NF { info=$8; gsub(/.*DP4=|;.*/, "", info); split(info, a, /,/); n=a[1]+a[2]+a[3]+a[4]; print $0 "\t" (n ? (a[3]+a[4])/n : "NA") }' inputfile.vcf > outputfile.vcf

For multiple input SNV VCF files (each result is written to a file named after its input, with ".MAF.vcf" appended):

awk '/^##/ {print > (FILENAME ".MAF.vcf"); next} $1=="#CHROM" {print $0 "\tMAF" > (FILENAME ".MAF.vcf"); next} NF { info=$8; gsub(/.*DP4=|;.*/, "", info); split(info, a, /,/); n=a[1]+a[2]+a[3]+a[4]; print $0 "\t" (n ? (a[3]+a[4])/n : "NA") > (FILENAME ".MAF.vcf") }' inputfile1.vcf inputfile2.vcf
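
One thing to keep in mind: DP4 holds four read counts (ref-forward, ref-reverse, alt-forward, alt-reverse), so (a[3]+a[4]) divided by the total is really the fraction of reads supporting the alternate allele, and it can exceed 0.5. If you want a value that is always the minor allele, you can fold it. Below is a rough sketch of that idea, not a tested pipeline; the function name dp4_maf and the sample record are made up for illustration, and it assumes INFO is column 8 and that DP4 is present as written by samtools/bcftools.

```shell
# dp4_maf (hypothetical helper): read VCF text on stdin and print
# CHROM, POS, alt-read fraction, and folded MAF, tab-separated.
# Assumes INFO is column 8 and DP4 = ref-fwd,ref-rev,alt-fwd,alt-rev.
dp4_maf() {
  awk -v OFS='\t' '
    /^#/ { next }                        # skip all header lines
    NF >= 8 {
      info = $8
      gsub(/.*DP4=|;.*/, "", info)       # keep only the four DP4 counts
      split(info, a, ",")
      n = a[1] + a[2] + a[3] + a[4]
      if (n == 0) { print $1, $2, "NA", "NA"; next }   # no usable reads
      alt = (a[3] + a[4]) / n            # fraction of reads supporting ALT
      maf = (alt > 0.5) ? 1 - alt : alt  # fold so the value is the minor allele
      print $1, $2, alt, maf
    }
  '
}

# Made-up record: 2+2 reference reads, 10+6 alternate reads.
printf 'chr1\t100\t.\tA\tG\t50\t.\tDP=20;DP4=2,2,10,6;MQ=60\n' | dp4_maf
```

For the made-up record, the alternate fraction is 16/20 = 0.8, so the folded MAF comes out as 0.2. On a real file you would run something like dp4_maf < inputfile.vcf. Records without usable DP4 counts are reported as NA instead of causing a division by zero.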

8.7 years ago

The SNiPlay online pipeline includes VCFtools, which can calculate MAF from a VCF file: http://sniplay.southgreen.fr/cgi-bin/analysis_v3.cgi
