There are a couple of PLINK options that matter here:
--ld-window-kb 1000 # Only assess LD between SNPs that lie within the specified window, here 1000 kb. With whole-genome data there are so many SNPs that you don't want to compare every SNP to every other SNP; the output file would be too big. This option restricts the comparison to pairs within a certain distance. Set it based on what you know about your species: if LD decays very quickly it can be smaller, and if LD extends a long way in your taxon it should be larger. In the example you provide, this is what they mean by pairs of SNPs less than 70 kb apart, i.e. they set --ld-window-kb 70.
--ld-window-r2 0.2 # Only output r2 values greater than 0.2 (0.2 is the default). For an LD decay plot you probably want to change this to 0, otherwise the low-r2 pairs are dropped and your averages will be too high - see below. This will make the file enormous, though, so you might want to run each chromosome separately.
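As a rough sketch, the options above could be assembled into a PLINK call like this. The file prefix `mydata`, the output prefix and the helper name `build_plink_ld_command` are placeholders I made up. The `--ld-window 99999` value is also my own assumption: PLINK additionally limits comparisons by SNP count (10 by default), so for a decay plot that limit usually needs raising as well.

```python
def build_plink_ld_command(bfile, out, window_kb=1000, min_r2=0,
                           window_snps=99999, chrom=None):
    """Return the argument list for a pairwise r2 run (hypothetical helper)."""
    cmd = [
        "plink",
        "--bfile", bfile,                     # placeholder binary fileset prefix
        "--r2",                               # pairwise r2 report (.ld file)
        "--ld-window-kb", str(window_kb),     # max distance between SNPs, in kb
        "--ld-window", str(window_snps),      # assumption: lift the SNP-count limit
        "--ld-window-r2", str(min_r2),        # 0 keeps all pairs, including low r2
        "--out", out,
    ]
    if chrom is not None:
        cmd += ["--chr", str(chrom)]          # run one chromosome at a time
    return cmd


cmd = build_plink_ld_command("mydata", "mydata_chr1", window_kb=70, chrom=1)
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` if PLINK is on your PATH.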
To get the LD decay plot you need to do something like the following.
First of all, generate a text file with two columns. The first is the distance between the two SNPs (i.e. BP_B - BP_A, so if the first SNP is at position 1000 on the chromosome and the second is at position 2150, the distance is 1150) and the second is the corresponding R2 value (your last column).
So you get a file with R2 values for SNPs certain distances apart.
distance R2
5 0.2
5 0.3
67 0.2
67 0.4
67 0.5
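A minimal sketch of that first step, assuming the usual whitespace-delimited .ld report with a header line containing BP_A, BP_B and R2. The function name `distance_r2_pairs` and the example rows (SNP IDs and r2 values) are made up for illustration.

```python
def distance_r2_pairs(ld_lines):
    """Yield (distance_bp, r2) tuples from the lines of a PLINK .ld report."""
    lines = iter(ld_lines)
    header = next(lines).split()
    # Look the columns up by name rather than relying on their order.
    i_a = header.index("BP_A")
    i_b = header.index("BP_B")
    i_r2 = header.index("R2")
    for line in lines:
        fields = line.split()
        if not fields:          # skip blank lines
            continue
        distance = abs(int(fields[i_b]) - int(fields[i_a]))
        yield distance, float(fields[i_r2])


# Made-up example rows in the .ld layout.
example = [
    " CHR_A   BP_A  SNP_A  CHR_B   BP_B  SNP_B    R2",
    "     1   1000    rs1      1   2150    rs2  0.35",
    "     1   1000    rs1      1   1067    rs3   0.5",
]
pairs = list(distance_r2_pairs(example))
for distance, r2 in pairs:
    print(distance, r2)
```

For a real file you would pass an open file handle instead of the list, e.g. `distance_r2_pairs(open("plink.ld"))`.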
Then, for each distance, calculate the average R2 (you need to write a script to do this, e.g. in Python or Perl)
i.e.
distance averageR2
5 0.25
67 0.3667
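The averaging step could look something like this in Python. It is only a sketch: `mean_r2_by_distance` is a name I picked, and the input is the small example table from above typed in by hand rather than read from a file.

```python
from collections import defaultdict


def mean_r2_by_distance(pairs):
    """Return {distance: mean r2} for an iterable of (distance, r2) pairs."""
    sums = defaultdict(lambda: [0.0, 0])   # distance -> [sum of r2, count]
    for distance, r2 in pairs:
        sums[distance][0] += r2
        sums[distance][1] += 1
    return {d: total / n for d, (total, n) in sorted(sums.items())}


# The example distance/R2 table from above.
pairs = [(5, 0.2), (5, 0.3), (67, 0.2), (67, 0.4), (67, 0.5)]
averages = mean_r2_by_distance(pairs)
for distance, mean_r2 in averages.items():
    print(distance, round(mean_r2, 4))
```

Keeping a running sum and count per distance means the whole file never has to sit in memory, which matters once --ld-window-r2 is set to 0.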
Then plot that file to get the LD decay.
Depending on how many points you have, you may want to use a sliding-window average script for the plot.
In the example, I think this is what they did with the 1 kb intervals: a sliding window with a window size of 1 kb and a step size of 1 kb (so effectively non-overlapping 1 kb bins), with the average r2 calculated in each window.
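With the window size equal to the step size, the sliding window reduces to simple binning, which can be sketched as below. The function name `mean_r2_by_bin` and the example pairs are invented for illustration, and I am only guessing that this matches what the example paper did.

```python
from collections import defaultdict


def mean_r2_by_bin(pairs, bin_size=1000):
    """Average r2 in non-overlapping distance bins of bin_size bp.

    Returns a list of (bin_start_bp, mean_r2), sorted by bin start.
    """
    sums = defaultdict(lambda: [0.0, 0])   # bin start -> [sum of r2, count]
    for distance, r2 in pairs:
        bin_start = (distance // bin_size) * bin_size
        sums[bin_start][0] += r2
        sums[bin_start][1] += 1
    return [(start, total / n) for start, (total, n) in sorted(sums.items())]


# Made-up (distance, r2) pairs.
pairs = [(5, 0.2), (67, 0.4), (1150, 0.3), (1900, 0.1), (2500, 0.05)]
binned = mean_r2_by_bin(pairs, bin_size=1000)
for bin_start, mean_r2 in binned:
    print(bin_start, round(mean_r2, 4))
```

The resulting bin starts and means are what you would plot as x and y for the decay curve.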
FYI, it's not a big deal with small windows, but if you're doing any heavy-duty --r2 runs, PLINK 1.9 is much faster than PLINK 1.07.
Hello, I need help creating MAP and PED files to calculate linkage disequilibrium for data from dbSNP. I am confused because in dbSNP some SNPs have 2-3 allele changes at the same physical position, and a few SNPs have several physical positions, so I am not sure how to build those files. Please help me.