Stacks' populations.sumstats.tsv : interpreting pi for populations with one individual
21 months ago
SB • 0

I used Stacks::populations to generate summary statistics for my study. I sampled ~250 plants from ~100 sampling sites, with 1 to 5 individuals per sampling site. DNA was sequenced with genotyping-by-sequencing (GBS), a reduced-representation method, giving ~1% of the genome and about 2000 SNPs at ~400x coverage. I'm interested in understanding how sampling only one individual at a sampling site impacts these summary statistics.

One question I have is how pi is calculated for those sites. Here is a snippet from the populations.sumstats.tsv file, which shows "Summary statistics for each population" (for me, a Stacks "population" = a sampling site). I'll refer to sampling sites as populations going forward.

snippet from Stacks::populations populations.sumstats.tsv file

Here is a link to the table key: https://catchenlab.life.illinois.edu/stacks/manual/#pfiles

My understanding is that each row represents a single SNP in a single population, so each SNP should appear in only one row per population. Rows for populations with only one individual should have pi = 0, because there is no other individual to compare against at that SNP. Populations with at least 2 individuals, however, can have pi > 0, because the SNP could be polymorphic within them.

Yet sometimes there are rows where populations with one individual have pi > 0. How can this be?

Thanks!

summary-statistics Stacks pi • 932 views
21 months ago
SB • 0

Oh wow, this could be embarrassing... is it because this species is diploid??
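Ploidy would explain it. Here is a minimal sketch, assuming pi at a site is computed over sampled allele copies rather than over individuals, as 1 - sum_i C(n_i, 2) / C(n, 2), where n_i is the count of allele i and n is the total number of allele copies. I believe this matches the formula described for Stacks, but treat that as an assumption and check the manual. The function name and the example genotypes are hypothetical.

```python
from collections import Counter
from math import comb


def site_pi(allele_copies):
    """Per-site pi from a list of sampled allele copies (hypothetical helper).

    Assumes pi = 1 - sum_i C(n_i, 2) / C(n, 2), i.e. the probability that
    two allele copies drawn without replacement are different.
    """
    n = len(allele_copies)
    if n < 2:
        return 0.0  # a single allele copy cannot be compared with anything
    identical_pairs = sum(comb(count, 2) for count in Counter(allele_copies).values())
    return 1.0 - identical_pairs / comb(n, 2)


# One diploid individual contributes two allele copies per SNP.
pi_one_het = site_pi(["A", "G"])            # single heterozygous individual
pi_one_hom = site_pi(["A", "A"])            # single homozygous individual
pi_two_ind = site_pi(["A", "A", "A", "G"])  # one homozygote plus one heterozygote

print(pi_one_het, pi_one_hom, pi_two_ind)
```

Under that assumption, a population with a single diploid individual still has n = 2 allele copies, so a heterozygous genotype gives pi > 0 and a homozygous genotype gives pi = 0.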
