Hello everyone, how can I validate a list of genes identified by a regularization method?
I have applied a regularization method to identify a list of significant genes, but I want to know how to validate this list.
Note that these genes are enriched in several significant pathways in DAVID. I really appreciate any help!
@Alex Reynolds Thanks for your response.
I already have my significant genes, selected by that method and a cutoff metric.
Now I want to validate these genes, and my question is how to do that.
Can anyone else help me with this question, please?
You have not given much information, so people will unfortunately turn away from your post.
Take a look at material that I have written here:
@Kevin Blighe Sorry about that. My main question is this: I got a list of genes by applying an elastic net method, and now I want to validate these genes. How do I do that?
With the output from the elastic net regression, you can just build a new model (glm or lm) with your final list and then check its ability to predict the end-point via r-squared shrinkage and ROC analysis. There are many other metrics that can be applied. I go over some of this in the posts mentioned above.
@Kevin Blighe What do you mean by end-point, please?
By end-point, I mean the dependent (y) variable. For example, in the formula glm(condition ~ gene1 + covariate), condition is the end-point.
@Kevin Blighe, I read all the materials you recommended, but I still can't find an answer to my question of how to validate a list of genes. Suppose I applied the elastic net method to my data and got 50 genes as output; how do I validate these 50 genes?
Chief, take a look at what I do here: A: How to exclude some of breast cancer subtypes just by looking at gene expressio
When you applied the elastic net regression, I presume that you cross-validated it at the same time using cv.glmnet?
Essentially, with your 50 genes, build a 'final' model via glm() or lm() with all of your genes and then test that via ROC analysis. You can also derive sensitivity / specificity / precision / accuracy:
Assume that your final model is called model, and your data is data:
Sensitivity / Specificity
Determine cost function
Precision
Accuracy
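As a rough illustration of the four steps listed above, here is a minimal base-R sketch. It is not the code from the linked answer: the simulated data, the two-gene formula, the 0.5 classification threshold, the 5-fold split, and the helper name cost are all assumptions made for the example. The AUC is computed through the rank-sum identity so that no extra package is needed.

```r
# Simulated stand-in data (hypothetical): binary end-point, two gene predictors
set.seed(1)
n <- 200
data <- data.frame(gene1 = rnorm(n), gene2 = rnorm(n))
data$condition <- rbinom(n, 1, plogis(1.5 * data$gene1 - 1.0 * data$gene2))

# 'Final' logistic model built from the selected genes
model <- glm(condition ~ gene1 + gene2, data = data, family = binomial)

# Predicted probabilities and class calls at an assumed 0.5 threshold
prob <- predict(model, type = "response")
pred <- as.integer(prob > 0.5)

# Confusion-matrix counts
TP <- sum(pred == 1 & data$condition == 1)
TN <- sum(pred == 0 & data$condition == 0)
FP <- sum(pred == 1 & data$condition == 0)
FN <- sum(pred == 0 & data$condition == 1)

sensitivity <- TP / (TP + FN)
specificity <- TN / (TN + FP)
precision   <- TP / (TP + FP)
accuracy    <- (TP + TN) / n

# AUC via the rank-sum (Mann-Whitney) identity
n1 <- sum(data$condition == 1)
n0 <- n - n1
auc <- (sum(rank(prob)[data$condition == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)

# Cost function (misclassification rate) evaluated by manual 5-fold cross-validation
cost <- function(obs, p) mean(abs(obs - p) > 0.5)
K <- 5
fold <- sample(rep(1:K, length.out = n))
cv_err <- mean(sapply(1:K, function(k) {
  fit <- glm(condition ~ gene1 + gene2, data = data[fold != k, ], family = binomial)
  p <- predict(fit, newdata = data[fold == k, ], type = "response")
  cost(data$condition[fold == k], p)
}))

round(c(sensitivity = sensitivity, specificity = specificity,
        precision = precision, accuracy = accuracy,
        auc = auc, cv_error = cv_err), 3)
```

The first five numbers are computed on the same samples used to fit the model, so they will tend to be optimistic, and more so if the genes were also selected on those samples. The cross-validated error, or better still an independent dataset, gives a more honest estimate.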
Note that 50 genes may be too many for a final model. You may consider reducing this number via stepwise regression (see lecture notes 3, here: https://github.com/kevinblighe/Rtutorials ) or by choosing a higher threshold from elastic net regression.
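One way the stepwise reduction could look is sketched below with step() from base R, which drops terms by AIC. The six simulated genes and the choice of backward elimination are assumptions for the example; they are not taken from the lecture notes.

```r
# Simulated stand-in data (hypothetical): six candidate genes, two truly informative
set.seed(2)
n <- 150
genes <- as.data.frame(matrix(rnorm(n * 6), n, 6))
names(genes) <- paste0("gene", 1:6)
data <- cbind(genes,
              condition = rbinom(n, 1, plogis(2 * genes$gene1 - 2 * genes$gene2)))

# Full model containing every candidate gene
full <- glm(condition ~ ., data = data, family = binomial)

# Backward stepwise elimination by AIC; trace = 0 suppresses the step-by-step log
reduced <- step(full, direction = "backward", trace = 0)

# Genes retained in the reduced model
kept <- attr(terms(reduced), "term.labels")
kept
```

With 50 real genes the same call applies unchanged, although a model with that many predictors may fail to converge on a small sample, in which case raising the elastic net threshold first is the safer route.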