5.5 years ago
mostafarafiepour
Hi Dear,
I want to calculate Fis (inbreeding coefficient) for each population (I have a VCF file for each population, and my data is whole-genome sequencing). After searching online, I came to the conclusion that I have to use the following command to calculate Fis:
vcftools --vcf my_vcf --het --out my_output
And my output is as follows:
INDV O(HOM) E(HOM) N_SITES F
EAZ01 5009213 5987534.2 9479847 -0.28014
EAZ02 5690439 6542872.4 10334833 -0.22480
EAZ03 5672175 6446478.9 10155665 -0.20875
EAZ04 5680677 6481885.8 10244851 -0.21292
EAZ05 5601825 6167853.4 9728104 -0.15899
EAZ06 8831054 7710942.2 12101973 0.25509
EAZ07 7927379 8014968.5 12592949 -0.01913
EAZ08 8274126 8116174.2 12752003 0.03407
EAZ09 7614217 7765762.1 12207482 -0.03412
EAZ10 7875422 7815149.9 12282681 0.01349
So, my questions are: first, which column is the Fis value? Second, how can I get Fis for the population (not individuals)? For example, Fis 0.25483 for population one, two and ...?
Best regards,
Mostafa
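On the first question: the F column is the per-individual inbreeding coefficient, and it can be reproduced from the other three columns as F = (O(HOM) - E(HOM)) / (N_SITES - E(HOM)). Below is a minimal Python sketch that checks this against the rows posted above. The function names are illustrative only and are not part of VCFtools.

```python
# Rows copied from the vcftools --het output above:
# INDV, O(HOM), E(HOM), N_SITES, F
HET_ROWS = """\
EAZ01 5009213 5987534.2 9479847 -0.28014
EAZ02 5690439 6542872.4 10334833 -0.22480
EAZ03 5672175 6446478.9 10155665 -0.20875
EAZ04 5680677 6481885.8 10244851 -0.21292
EAZ05 5601825 6167853.4 9728104 -0.15899
EAZ06 8831054 7710942.2 12101973 0.25509
EAZ07 7927379 8014968.5 12592949 -0.01913
EAZ08 8274126 8116174.2 12752003 0.03407
EAZ09 7614217 7765762.1 12207482 -0.03412
EAZ10 7875422 7815149.9 12282681 0.01349
"""


def recompute_f(o_hom, e_hom, n_sites):
    """Excess of observed over expected homozygous sites, scaled to the maximum possible excess."""
    return (o_hom - e_hom) / (n_sites - e_hom)


def check_rows(text):
    """Return {individual: (recomputed F, reported F)} for each row."""
    results = {}
    for line in text.strip().splitlines():
        indv, o_hom, e_hom, n_sites, f_reported = line.split()
        f_calc = recompute_f(float(o_hom), float(e_hom), float(n_sites))
        results[indv] = (f_calc, float(f_reported))
    return results


results = check_rows(HET_ROWS)
for indv, (f_calc, f_reported) in results.items():
    print(f"{indv}\trecomputed={f_calc:.5f}\treported={f_reported:.5f}")
```

A negative F means the individual has fewer homozygous sites than expected, and a positive F means it has more.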
Where did you hear the term Fis? I know of F values, but not Fis. plink also has a way to compute inbreeding coefficients, but again, they are done at the individual level. What do you mean by "Fis for population"?
Hi Ram, Many thanks for your reply,
please see here (Is the heterozygosity flag (--het) in vcftools calculate observed and expected heterozygosity?) for your first question.
As for what I mean by "Fis for population": you said that plink also has a way to compute inbreeding coefficients, but that they are done at the individual level, right? So my question is, can we consider the average inbreeding coefficient of the individuals as the population inbreeding coefficient?
My first question ("Where did you hear the term Fis") was not answered in that link. Nowhere can you find the string "Fis" in that page.
I guess you can, as long as you mention that to everyone that encounters the dataset and share the individual level inbreeding coefficients. This is a metric you'd be assuming validity for, so be open to hearing criticisms on it.
Try changing the way you named the data set when creating the VCF. I hope you ran denovo_map.pl or ref_map.pl and then the populations program. If so, you have to change the popmap input. Additional information is in the Stacks denovo_map.pl manual.
How was your input VCF formatted? Specifically, did you have multiple columns for each individual? And where did the column names come from? vcftools --het calculates expected heterozygosity, but O(HOM) sounds like homozygosity to me.
Hi, To get the group values you have to change the popmap if it's stacks.
OP has not mentioned that they're using stacks - they only mention VCFtools. Given that, I'm moving your answer to a comment.
I'm not sure whether this belongs as an answer or not since your requirements aren't clear, but I'll leave it as a comment for now.
Remember what you're doing with Fis - measuring the genetic variance in a 'S'ubpopulation that's present in the 'I'ndividual. In this sense, I'm not sure what you mean by an Fis for the population; perhaps you're looking for Fst ('S'ubpopulation relative to the 'T'otal)? Is an average of Fis estimates something that's commonly reported?
Since VCFtools isn't clear, I'd recommend using popStats in vcflib, which explicitly outputs Fis. vcflib also implements Weir and Cockerham's Fst (wcFst).