Hello,
I am trying to plot LD decay with a sliding window in R. What I want to have on my X axis is distance between 2 SNPs while my Y axis can have the mean r^2 value.
So far this is what I have:
library(zoo)
# Find distances between pairs of SNPs
dist18 <- (a18[,2] - a18[,1])
# Store r^2 and distances in a zoo matrix
D18 <- zoo(a18[,13], dist18)
plot(rollapply(D18, width = 10, by = 1, FUN = mean, align = "left"), main="r^2 along genome, window of 10", ylab="r^2 (MLE)", xlab="Genome Position (bp)")
My insert sizes for my reads are not longer than 300bp, so I can only plot up to 300bp. For something like that, a window size of 10 sliding by 1 should eliminate a lot of noise and show how the mean changes. What ends up happening however is this:
head(D18)
1 1 1 1 1 1
0 0 0 0 0 0
Warning message:
In zoo(rval[i], index(x)[i]) :
some methods for \u201czoo\u201d objects do not work if the index entries in \u2018order.by\u2019 are not unique
Indeed, the values are not unique, and there are many values for a distance of 1bp, 2bp... so when I plot a sliding window, it is not a sliding window of basepairs, but rather of values in the vector (the first 10 values of r^2 all correspond to a distance of 1bp).
This causes the plot to still be noise, even with windows of 50.
My question is, how can I get an average of all values of r^2 for distance of 1bp, then average for 2bp... store that in a vector, then plot?
Thank you.
Adrian
Hello, your tool requires phased VCF data. I am not sure how to phase my data, since I am working on non-models.