As advised by the Biostars community in the post "Differentially expressed genes machine learning classifer", I have gained a good basic knowledge of machine learning through online courses. My aim is to create a machine learning classifier that can distinguish diseased from healthy conditions, using differentially expressed genes (DEGs) obtained from NCBI GEO as input features. I obtained the DEGs with limma's topTable function. To train my model, do I need to create a new table with samples in rows and DEGs in columns, filled with their expression values from the exprs function, and then add a target column to be predicted?
I want to train the model using the gene expression data of the obtained DEGs. Is that approach correct?
There is no single correct or incorrect approach. You can start from the entire dataset, or you can start from the DEGs only. The choice will depend, in part, on the program that you are aiming to use. As I do not know what you are planning to do, I can only answer generally.
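For what it's worth, the table layout described in the question (samples in rows, DEGs in columns, plus a target column) could be sketched roughly as below. This is only a minimal illustration in plain Python, not a recommended pipeline: the gene names, sample names, expression values, class labels, and the helper `build_feature_table` are all made up for the example, and the nested dictionary simply stands in for a genes-by-samples matrix such as the one exprs returns.

```python
# Hypothetical toy data laid out genes x samples, i.e. the same
# orientation as the matrix returned by exprs() in R.
expression = {
    "GENE_A": {"S1": 5.1, "S2": 5.3, "S3": 8.9, "S4": 9.2},
    "GENE_B": {"S1": 7.0, "S2": 6.8, "S3": 7.1, "S4": 6.9},
    "GENE_C": {"S1": 2.2, "S2": 2.0, "S3": 4.8, "S4": 5.1},
}

# Hypothetical DEG identifiers, e.g. the row names of a topTable result.
degs = ["GENE_A", "GENE_C"]

# Hypothetical class label for each sample (the target to be predicted).
labels = {"S1": "healthy", "S2": "healthy", "S3": "diseased", "S4": "diseased"}


def build_feature_table(expression, degs, labels):
    """Return (header, rows): one row per sample, one column per DEG, target last."""
    header = ["sample"] + list(degs) + ["target"]
    rows = []
    for sample in sorted(labels):
        # Transpose: pick this sample's value for every selected gene.
        values = [expression[gene][sample] for gene in degs]
        rows.append([sample] + values + [labels[sample]])
    return header, rows


header, rows = build_feature_table(expression, degs, labels)
print(header)
for row in rows:
    print(row)
```

The same layout works whether the columns are the DEGs only or every gene in the dataset; only the list passed as `degs` changes.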
https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5290715/ I want to do something like what they have done in this paper. Can you please guide me?
Sorry, I am not your supervisor.
OK, no problem, sir. Thanks for replying.
I can see where you're coming from, but you're going to have to ask pointed questions to get answers here or elsewhere. Don't give up!!