Unexpected break in site frequency spectrum pattern
18 months ago
S ▴ 10

[Figure: site frequency spectra showing the expected changes in pattern, except when R = 0.3]

Hi there,

I've been generating lots of Stacks::populations outputs with varying -R and -min_maf values (see definitions below) to understand how these parameters change the site frequency spectrum. I'm hoping that knowing how many rare variants are removed in each scenario will help me decide which combination of -R and -min_maf values to use for a demographic analysis, which requires including rare (but real) variants.

-R: minimum percentage of individuals across populations required to process a locus

-min_maf: a minimum minor allele frequency required to process a nucleotide site at a locus (0 < min_maf < 0.5; applied to the metapopulation)

I've generated 60 plots for each of 5 populations from the VCFs using the vcf2sfs R package. The min_maf values range from ~0.01 to 0.05 and -R from 0.1 to 1. Generally, when min_maf is ~0.01-0.04, the plots look as expected: the number of variants decreases as -R increases. But when min_maf = 0.05, I see an unexpectedly sharp drop in the number of variants at R = 0.3, even though the adjacent plots look fine. This happens for each population.
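For readers unfamiliar with how a minor allele frequency cutoff reshapes the spectrum, here is a minimal Python sketch of a folded SFS built from VCF-style genotype strings. It is not what vcf2sfs or Stacks does internally. The function names and the toy genotypes are made up for illustration, sites are assumed to be biallelic, and no projection to a common sample size is done when genotypes are missing.

```python
from collections import Counter


def minor_allele_count(genotypes):
    """Return (minor allele count, number of called alleles) for one site.

    `genotypes` is a list of VCF GT strings such as "0/1"; "." marks a
    missing allele. The site is assumed to be biallelic.
    """
    alleles = []
    for gt in genotypes:
        for allele in gt.replace("|", "/").split("/"):
            if allele != ".":
                alleles.append(allele)
    n_called = len(alleles)
    n_alt = sum(1 for allele in alleles if allele != "0")
    return min(n_alt, n_called - n_alt), n_called


def folded_sfs(sites, min_maf=0.0):
    """Count sites per minor allele count, dropping sites below `min_maf`."""
    sfs = Counter()
    for genotypes in sites:
        mac, n_called = minor_allele_count(genotypes)
        if n_called == 0 or mac == 0:
            continue  # no calls at all, or monomorphic in this sample
        if mac / n_called < min_maf:
            continue  # removed by the minor allele frequency cutoff
        sfs[mac] += 1
    return dict(sorted(sfs.items()))


# Toy data: four sites genotyped in four diploid individuals (hypothetical).
sites = [
    ["0/0", "0/1", "0/0", "0/0"],  # minor allele count 1
    ["0/1", "0/1", "0/0", "1/1"],  # minor allele count 4
    ["0/0", "0/0", "0/0", "0/1"],  # minor allele count 1
    ["1/1", "1/1", "0/1", "1/1"],  # reference allele is the minor one: count 1
]

print(folded_sfs(sites))               # {1: 3, 4: 1}
print(folded_sfs(sites, min_maf=0.2))  # {4: 1}
```

In this toy case the cutoff of 0.2 removes every singleton (1/8 = 0.125), which is the kind of loss of rare variants I am trying to quantify. Note that Stacks applies the cutoff to the metapopulation, whereas this sketch filters within a single sample set.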

What might've happened here? Any ideas would be appreciated. Thanks!

frequency sfs Stacks site spectrum populations • 532 views
18 months ago
S ▴ 10

I regenerated the VCF with Stacks, and the new file was much larger than the original, so the first one must not have been written correctly. The plots now look as expected. Hooray for solving one's own problems!
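In hindsight, a quick sanity check on the files themselves would have caught this before any plotting. Below is a rough Python sketch, not part of Stacks or vcf2sfs: it counts the variant records in each VCF and flags any -R value whose file holds fewer records than the file for the next stricter -R. That relies on the assumption that, at a fixed min_maf, raising -R can only remove sites. The function names and the directory layout in the usage comment are hypothetical.

```python
import gzip
from pathlib import Path


def count_vcf_records(path):
    """Count non-header, non-empty lines (variant records) in a VCF file."""
    path = Path(path)
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt") as handle:
        return sum(
            1 for line in handle if line.strip() and not line.startswith("#")
        )


def find_monotonicity_breaks(counts_by_r):
    """Return the R values whose record count is lower than the count at the
    next larger (stricter) R, which should not happen if filtering worked."""
    r_values = sorted(counts_by_r)
    return [
        lower
        for lower, higher in zip(r_values, r_values[1:])
        if counts_by_r[lower] < counts_by_r[higher]
    ]


# Usage sketch, assuming one output directory per -R value (hypothetical layout):
#   counts = {r: count_vcf_records(f"maf0.05_R{r}/populations.snps.vcf")
#             for r in (0.1, 0.2, 0.3, 0.4)}
#   print(find_monotonicity_breaks(counts))
```

A truncated or incompletely written file shows up here as a break at its own R value, for example at 0.3 when the counts at 0.2 and 0.4 are both much higher.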
