How to convert vcf file to frequency file for sweepfinder2
5.6 years ago
1771012116 • 0

Hi, I am running a genome-wide scan for selective sweeps with SweepFinder2 (SF2), but I am stuck on the input format. I would appreciate help converting a VCF file to the allele frequency file that SF2 expects. I have converted the VCF file to a frequency file with vcftools, but the format it generates is not the expected one. Below are the first few lines.

CHROM   POS     N_ALLELES       N_CHR   {ALLELE:FREQ}
chr1H   43870   2       46      C:0.891304      T:0.108696
chr1H   43895   2       44      A:0.909091      T:0.0909091
chr1H   43937   2       48      A:0.895833      G:0.104167
chr1H   43944   2       48      G:0.708333      T:0.291667
chr1H   43948   2       48      T:0.854167      A:0.145833
chr1H   44011   2       46      A:0.891304      G:0.108696

And below is the desired allele frequency file.

position x n folded 
460000 9 100 0 
460010 100 100 0 
460210 30 78 1 
463000 0 94 0

The first column is the position on the chromosome, the second column is the allele count (x), the third column is the sample size (n), and the fourth column is an indicator of whether the site is folded (0 if the site has been polarized, i.e. it is known which allele is derived and which is ancestral).

Thank you in advance.


Hi, have you been able to figure this out? I am dealing with a similar issue. Thanks!

3.1 years ago
Kristian • 0

This should be possible with vcftools, provided the VCF file is reduced to biallelic sites, is polarized so that the ancestral state is the REF allele, and carries the AA tag in the INFO field.

With --counts2 --derived, the vcftools output has 6 columns for biallelic sites:

CHR, POS, N_ALLELES, N_CHR, COUNT_ANCESTRAL, COUNT_DERIVED

SweepFinder2 expects the columns below instead, so the output needs to be rearranged, e.g. with awk:

POS, COUNT_DERIVED, N_CHR, FOLDED

FOLDED is always set to 0 here, since the ancestral allele is defined.

vcftools --counts2 --derived --gzvcf VCF.gz --stdout | awk 'BEGIN {print "position\tx\tn\tfolded"} NR>1 {print $2"\t"$6"\t"$4"\t0"}' > SF2.input
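
If no ancestral allele is available, the vcftools --freq output shown in the question could probably be converted to a folded file instead. The following is only a rough sketch, not a tested pipeline. The function name freq_to_sf2 is made up for illustration. It assumes biallelic sites, uses the count of the second listed allele as x (recovered by rounding frequency times N_CHR), sets folded to 1, and drops sites that are not variable in the sample, which is my assumption about what a folded analysis needs.

```shell
# Sketch: convert `vcftools --freq` output (biallelic sites only) into a
# folded SweepFinder2 frequency file. freq_to_sf2 is a hypothetical helper.
freq_to_sf2() {
  awk 'BEGIN { OFS = "\t"; print "position", "x", "n", "folded" }
       NR > 1 && $3 == 2 {
         split($6, alt, ":")          # e.g. "T:0.108696" gives alt[2] = 0.108696
         x = int(alt[2] * $4 + 0.5)   # frequency * N_CHR, rounded back to a count
         if (x > 0 && x < $4)         # assumption: skip sites with no variation
           print $2, x, $4, 1         # folded = 1: ancestral state unknown
       }' "$1"
}
```

Usage would be something like `freq_to_sf2 out.frq > SF2.input`, where out.frq is whatever file vcftools --freq wrote. For the first line in the question (chr1H 43870, T:0.108696 with N_CHR 46) this gives `43870 5 46 1`.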