Question

Stacks' populations.sumstats.tsv : interpreting pi for populations with one individual


21 months ago

SB • 0

I used Stacks::populations to generate summary statistics for my study. I sampled ~250 plants from ~100 sampling sites, with 1 to 5 individuals per site. DNA was sequenced with genotyping-by-sequencing (GBS), a reduced-representation method, covering ~1% of the genome and yielding about 2000 SNPs at ~400x coverage. I'm interested in understanding how sampling only one individual at a sampling site affects these summary statistics.

One question I have is how pi is calculated for those sites. Here is a snippet from the populations.sumstats.tsv file, which shows "Summary statistics for each population" (for me, a Stacks "population" = a sampling site). I'll refer to sampling sites as populations going forward.

snippet from Stacks::populations populations.sumstats.tsv file

Here is a link to the table key: https://catchenlab.life.illinois.edu/stacks/manual/#pfiles

My understanding is that each row represents a single SNP in a single population, so for each SNP there should be only one row per population. Rows for populations with only one individual should have pi = 0, because there is no second individual to compare that SNP against. Populations with at least 2 individuals, however, can have pi > 0 because the SNP could be polymorphic among them.

Yet sometimes there are rows where populations with one individual have pi > 0. How can this be?

Thanks!

summary-statistics Stacks pi • 932 views

updated 21 months ago by Ram • written 21 months ago by SB

score 0 · Answer 1 · 2023-08-11


21 months ago

SB • 0

Oh wow, this could be embarrassing... is it because this species is diploid??
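A quick way to sanity-check that idea: below is a minimal Python sketch, assuming pi at a single SNP is estimated as 1 minus the fraction of identical pairs among the sampled allele copies, i.e. 1 - sum(C(n_i, 2)) / C(n, 2), where n counts allele copies (two per diploid individual) rather than individuals. I believe this matches the estimator Stacks describes, but treat that as an assumption and check the manual. The function name `nucleotide_diversity` and the example genotypes are made up for illustration.

```python
from collections import Counter
from math import comb


def nucleotide_diversity(alleles):
    """Estimate pi at one SNP from a list of sampled allele copies.

    Assumption: each diploid individual contributes two entries to `alleles`.
    """
    n = len(alleles)
    if n < 2:
        # With a single allele copy there is no pair to compare.
        return 0.0
    # Number of pairs of allele copies that carry the same nucleotide.
    identical_pairs = sum(comb(count, 2) for count in Counter(alleles).values())
    # pi = probability that two randomly drawn copies differ.
    return 1 - identical_pairs / comb(n, 2)


# One diploid individual, heterozygous A/G: two copies that differ.
pi_single_het = nucleotide_diversity(["A", "G"])

# One diploid individual, homozygous A/A: two identical copies.
pi_single_hom = nucleotide_diversity(["A", "A"])

# Two diploid individuals: one A/G heterozygote and one A/A homozygote.
pi_two_inds = nucleotide_diversity(["A", "G", "A", "A"])

print(pi_single_het, pi_single_hom, pi_two_inds)
```

Under this assumption, a population with a single diploid individual still has two allele copies at each SNP, so a heterozygous genotype gives pi > 0 and a homozygous one gives pi = 0. That would explain rows with one individual and a non-zero pi.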
