Weir and Cockerham Fst from PLINK (bed/bim/fam) input files
10.4 years ago
kautilya ▴ 430

I want to calculate marker-wise Weir and Cockerham Fst for certain differentiating markers. My data is in PLINK (bed/bim/fam) format. I have scoured the internet for packages that calculate Fst, but very few seem to accept native PLINK format directly as input.

There is snpStats (a Bioconductor R package), but its method of calculation seems to be a nonstandard one.

Can anyone suggest an R package or other software that can calculate Fst (preferably Weir and Cockerham) from PLINK-format data as input?

plink fst gwas • 10k views

Geneland seems to calculate Weir and Cockerham F-statistics, though you will have to convert the PLINK files into a matrix: "Diploid codominant genotype data. A matrix with one line per individual and 2 columns per locus"
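
A minimal sketch of that conversion, assuming the data has first been exported to a text .ped file (for example with PLINK's --recode), where the first six columns are FID, IID, PAT, MAT, SEX and PHENO and every marker then takes two allele columns. The function name `ped_to_genotype_matrix` is made up for illustration, and writing missing alleles as `NA` is an assumption; check the Geneland documentation for the allele coding it actually expects.

```python
def ped_to_genotype_matrix(ped_lines, missing="0", na="NA"):
    """Turn PLINK .ped lines into rows with two allele columns per locus."""
    matrix = []
    for line in ped_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        alleles = fields[6:]  # drop FID IID PAT MAT SEX PHENO
        if len(alleles) % 2 != 0:
            raise ValueError("expected two allele columns per locus")
        # Assumption: PLINK's missing-allele code "0" is written as NA.
        matrix.append([na if a == missing else a for a in alleles])
    return matrix


# Tiny made-up example: two individuals, two loci.
example_ped = [
    "F1 I1 0 0 1 1 A A G T",
    "F2 I2 0 0 2 2 A C 0 0",
]
matrix = ped_to_genotype_matrix(example_ped)
for row in matrix:
    print(" ".join(row))
```

Each output row is one individual, with locus 1 in columns 1 and 2, locus 2 in columns 3 and 4, and so on.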

Thanks for the reply. Geneland seems useful, and I will try converting the data into the required format. It would still be preferable to have a tool that reads PLINK files directly, since the format is a pretty prevalent standard. I am quite surprised by the scarcity of such tools.

10.4 years ago

How are your subpopulations defined? (I may just go ahead and add this to PLINK, but I first want to make sure that your subpopulations are defined in a PLINK-friendly manner.)

This is a comment, not an answer.

The subpopulations are defined by the values 0 (unaffected) and 1 (affected) in the phenotype column of the fam/ped file. Not sure if this is what you were asking for. It would be great if PLINK itself natively supported this calculation, as it is a frequently used statistic and would be a significant addition to the already great PLINK 1.9.

Okay, I'll try to add a --weir-fst flag next week (analogous to VCFtools --weir-fst-pop) which uses the set of cases as the subpopulation, and can be combined with --loop-assoc to operate on many subpopulations at once.

Great, thanks. Eagerly looking forward to this release.

Now implemented as --fst (or "--fst case-control" if your subpopulations are defined by case/control status).
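
For readers who want to see what a marker-wise Weir and Cockerham (1984) estimate involves, here is a minimal sketch for a single biallelic SNP, taking genotype counts per subpopulation. It follows the published variance-component formulas (a, b, c) as I understand them and is only an illustration, not PLINK's actual code; the function name `weir_cockerham_fst` and the input layout are made up for this example.

```python
def weir_cockerham_fst(genotype_counts):
    """Weir & Cockerham (1984) theta for one biallelic marker.

    genotype_counts: one (n_AA, n_Aa, n_aa) tuple per subpopulation.
    """
    r = len(genotype_counts)
    if r < 2:
        raise ValueError("need at least two subpopulations")
    n = [sum(counts) for counts in genotype_counts]
    if min(n) == 0:
        raise ValueError("every subpopulation needs at least one sample")

    # Allele-A frequency and observed heterozygosity per subpopulation.
    p = [(2 * c[0] + c[1]) / (2 * ni) for c, ni in zip(genotype_counts, n)]
    h = [c[1] / ni for c, ni in zip(genotype_counts, n)]

    n_tot = sum(n)
    n_bar = n_tot / r
    if n_bar <= 1:
        raise ValueError("average sample size must exceed 1")
    n_c = (n_tot - sum(ni * ni for ni in n) / n_tot) / (r - 1)
    p_bar = sum(ni * pi for ni, pi in zip(n, p)) / n_tot
    s2 = sum(ni * (pi - p_bar) ** 2 for ni, pi in zip(n, p)) / ((r - 1) * n_bar)
    h_bar = sum(ni * hi for ni, hi in zip(n, h)) / n_tot

    # Variance components: a (between subpopulations), b (between
    # individuals within subpopulations), c (within individuals).
    core = p_bar * (1 - p_bar) - (r - 1) / r * s2
    a = (n_bar / n_c) * (s2 - (core - h_bar / 4) / (n_bar - 1))
    b = (n_bar / (n_bar - 1)) * (core - (2 * n_bar - 1) / (4 * n_bar) * h_bar)
    c = h_bar / 2

    denom = a + b + c
    return a / denom if denom != 0 else float("nan")


# Made-up counts: the two subpopulations are fixed for different alleles.
fixed = weir_cockerham_fst([(10, 0, 0), (0, 0, 10)])
# Made-up counts: the two subpopulations have identical genotype counts.
same = weir_cockerham_fst([(5, 10, 5), (5, 10, 5)])
print(fixed, same)
```

With these toy counts, fixed differences give an estimate of 1, and identical subpopulations give a value slightly below zero, which is normal for this estimator.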

Thanks a lot. I tried --fst case-control and it works really well. Great ease of use.

Hi, I didn't mean to hijack this post, but I was wondering if there is a way to use --fst to calculate pairwise Fst values for multiple populations? I tried --loop-assoc but it didn't work. Thanks!

Use --within to load a file defining your subpopulations.

I tried that. It gives one Fst value per SNP, so it's presumably calculating Fst over all subpopulations at once rather than for each pairwise comparison (an example of the headings I am after is given below):

SNP | Pop1 | Pop2 | Fst

Thank you.

Oops, missed the "pairwise" part of your remark. You will need to write a brief script with a double-for-loop for this; PLINK does not have that built-in.
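
A minimal sketch of such a double for-loop, assuming a PLINK 1.9 binary called `plink` on the PATH and a --within file that already defines the subpopulations. It restricts each run to one pair of clusters with --keep-cluster-names and only builds the commands; the file names, population labels and the function name `pairwise_fst_commands` are placeholders.

```python
def pairwise_fst_commands(bfile, within_file, populations, out_prefix="fst"):
    """Build one PLINK --fst command per unordered pair of populations."""
    commands = []
    for i in range(len(populations)):
        for j in range(i + 1, len(populations)):
            pop_a, pop_b = populations[i], populations[j]
            commands.append([
                "plink",  # assumed binary name; adjust to your install
                "--bfile", bfile,
                "--within", within_file,
                # Keep only the two clusters being compared in this run.
                "--keep-cluster-names", pop_a, pop_b,
                "--fst",
                "--out", f"{out_prefix}_{pop_a}_{pop_b}",
            ])
    return commands


# Placeholder inputs: "mydata" and "pops.txt" are hypothetical names.
commands = pairwise_fst_commands("mydata", "pops.txt", ["POP1", "POP2", "POP3"])
for cmd in commands:
    print(" ".join(cmd))
    # To run for real: import subprocess; subprocess.run(cmd, check=True)
```

Each run writes its own output prefix, so the per-SNP results for every pair can be merged afterwards into a SNP | Pop1 | Pop2 | Fst table.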

How is the data formatted in the subpopulation file, please? Thanks.

Hi, it's a three-column file: the first two columns of the .fam file (family ID and individual ID) plus a third column with the cluster name (you can repeat the first column there or put whatever you need).
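
As a rough sketch, the file can be generated from the .fam file like this. The `cluster_of` mapping and the function name `within_lines` are hypothetical; samples without an entry fall back to their family ID, which is the "repeat the first column" option.

```python
def within_lines(fam_lines, cluster_of):
    """Build the lines of a --within file: FID, IID, cluster name."""
    out = []
    for line in fam_lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        fid, iid = fields[0], fields[1]
        # Fall back to the family ID when no cluster is given.
        cluster = cluster_of.get((fid, iid), fid)
        out.append(f"{fid} {iid} {cluster}")
    return out


# Made-up .fam content and cluster assignments for illustration.
fam = [
    "F1 I1 0 0 1 1",
    "F2 I2 0 0 2 2",
    "F3 I3 0 0 1 1",
]
clusters = {("F1", "I1"): "POP1", ("F2", "I2"): "POP2"}
lines = within_lines(fam, clusters)
print("\n".join(lines))
```

Writing `lines` to a text file gives something that can be passed to --within.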
