Hey so here I have an R script I use to make Q-Q plots for GWA SNP P-values with FDR lines. How could I determine with programming which P-values (or SNPs, really) are significant enough to have passed the 0.05 FDR line? I'm just really not sure how to approach this. Example below!
qqunif = function(p, BH=T, MAIN = " ", SUB=" ")
{
nn = length(p)
xx = -log10((1:nn)/(nn+1))
plot( xx, -sort(log10(p)),
main = MAIN, sub= SUB, cex.sub=1.3,
xlab=expression(Expected~~-log[10](italic(p))),
ylab=expression(Observed~~-log[10](italic(p))),
cex.lab=1.0,mgp=c(2,1,0))
abline(0,1,col='gray')
if(BH) ## BH = include Benjamini Hochberg FDR
{
abline(-log10(0.05),1, col='purple',lty=1)
abline(-log10(0.10),1, col='blue',lty=1)
abline(-log10(0.25),1, col='green',lty=1)
legend('topleft', c("FDR = 0.05","FDR = 0.10","FDR = 0.25"),
col=c('purple','blue','green'),lty=c(1,1,1), cex=0.8)
if (BF)
{
abline(h=-log10(0.05/nn), col='black') ## bonferroni
}
}
}
EXAMPLE: I can SEE that there are 25 SNPs that passed the 0.05 FDR line, but how can I get a list of just those 25 SNPs in a new file? Thing is sometimes I work with much larger files. My input files are really just: rs_number pval
Hopefully someone can help me with this