Running survival analysis using Rplink
0
0
Entering edit mode
9.9 years ago
Caragh ▴ 40

Hi all,

I am trying to run a survival analysis on my data looking at SNPs which influence speed to develop the disease of interest. I have seen previous studies which have done this using the R plugin for PLINK but I am having difficulty running it.

I'm not sure how to specify the survival time and onset of the disease within the code. I have looked online for examples but the only one I can find is on the PLINK website which doesn't go into specifics. The code I'm using is as follows:

../plink --bfile testsnps --R cox_script_rplugin.txt --out test_snps_surv --noweb

cox_script_rplugin.txt is as follows:

library(survival)
Rplink <- function(PHENO,GENO,CLUSTER,COVAR)
{
  f1 <- function(s) 
  {
    m <- summary( coxph( Surv( PHENO [,1], COVAR[,2] ) ~ s ) )
    r <- c( m$coef , m$loglik, m$n )
    c( length(r) , r )
  }
  apply( GENO , 2 , f1 )
}

I have set up my fam file so that the first phenotype column is survival time and the 2nd phenotype column is whether or not the individual got the disease.

The results I'm getting out are:

  11   rs4910084    9915879    A NA
  11  rs10500715    9929638    C NA
  11   rs7952455    9936337    A NA
  14   rs7157940   93632946    A NA
  17  rs11078308   15411904    A NA

Does anyone have an example of the script they used for this type of analysis? Any help would be greatly appreciated.

Many thanks,
Caragh

Cox Survival R GWAS PLINK • 4.7k views
ADD COMMENT
1
Entering edit mode

Hi

I'm also new here so I could not confirm what is wrong. Our group has only successfully used R plugin to calculate the allele frequency (another example shown in P-link document) at the moment. I'll try cox regression in the following weeks. But if you don't mind, I could discuss with you first.

Could you please run plink --file mydata --R myscript.R --R-debug and show me the content of mylog.R?

ADD REPLY
0
Entering edit mode

Yes, I would be happy to discuss! The content of mylog.R after debugging was as follows:

n <- 55
PHENO <- c( 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 )
COVAR <- matrix( NA , nrow = n , ncol = 0 , byrow = T)
CLUSTER <- c( 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 )
l <- 5
g <- c( 1, 1, 0, 0, 1, 0, 0, 0, 1, 1, 2, 1, 1, 1, 2, 1, 1, 1, 2, 1, 0, 1, 0, 1, 2, 1, 1, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 2, 2, 0, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 2, 1, 1, 1, 0, 1, 1, 2, 1, 1, 2, 0, 0, 0, 0, 0, 2, 1, 1, 0, 2, 0, 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 2, 2, 1, 1, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 1, 0, 1, 1, 0, 1, 2, 0, 0, 0, 0, 0, 1, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 2, 0, 2, 2, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 2, 0, 0, 0, 0, 1, 0, 0, 1, 0, 0, 2, 1, 1, 1, 0, 2, 2, 2, 1, 2, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 2, 1, 1, 0, 1, 0, 0, 0, 0, 1, 2, 2, 2, 0, 1, 0, 2, 2, 0, 2, 1, 1, 1, 0, 1, 1, 2, 2, 1, 0, 0, 2, 2, 1, 1, 1 )
GENO <- matrix( g , nrow = n ,byrow=T)
GENO[GENO == -1 ] <- NA

library(survival)
Rplink <- function(PHENO,GENO,CLUSTER,COVAR)
{
  f1 <- function(s)
  {
    m <- summary( coxph( Surv( PHENO [,1], COVAR[,2] ) ~ s ) )
    r <- c( m$coef , m$loglik, m$n )
    c( length(r) , r )
  }
  apply( GENO , 2 , f1 )
}
ADD REPLY
0
Entering edit mode

Look at your COVAR <- matrix(NA, nrow = n, ncol = 0, byrow = T) part. No columns inside right?m <- summary(coxph(Surv(PHENO [,1], COVAR[,2] ) ~ s)) the input is empty.

ADD REPLY
0
Entering edit mode

And your PHENO <- c( 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1 ) part. All the individuals you put in are controls? If not, are you coding your case/control as 1/0? P-Link by default is 2/1 if I remember correctly.

ADD REPLY
0
Entering edit mode

No the cases and controls are specified in the fam file (there should be 325 in total) but they're not being translated across for some reason.

ADD REPLY
0
Entering edit mode

I guess the reason you get all NAs because the survival time is empty~ Are you willing to fix this part and see what is going on first?

ADD REPLY
0
Entering edit mode

I have specified the survival time in the fam file (look at the original question) but I can't figure out why it is not translating to the script. Do you know how to specify that?

ADD REPLY
0
Entering edit mode

The fam file is generated from ped file right? Then it should contain the first 6 columns of your ped file. Why do you say that the first phenotype column is survival time and the 2nd phenotype column is whether or not the individual got the disease?

ADD REPLY
0
Entering edit mode

Because you can specify the phenotype you want to use. This can be added to the fam file. Should I input it separately? If so how do I do this?

ADD REPLY
0
Entering edit mode

Hi, I'm currently doing something similar as you are doing. I'm just wondering whether all the result you got were like that?

ADD REPLY
0
Entering edit mode

Yes I'm afraid they were all like this. Have you been able to get it to work?

ADD REPLY

Login before adding your answer.

Traffic: 2015 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6