I am new to R and machine learning. I want to build a classifier that distinguishes normal from diseased samples, using differentially expressed genes (DEGs) obtained from GEO microarray datasets as input features. I obtained my DEGs with the limma package. How do I now use these DEGs to train the classifier? Please help.
Why do you want to use ML here? What are some existing methods, and what flaws in them are you trying to solve using ML?
Using machine learning, I want to show that these DEGs can act as biomarkers by distinguishing normal samples from diseased samples.
What is lacking in GSEA etc. that ML can solve? What is your definition of "normal" and "diseased"? What are your DE groups?
I have to design a project related to ML. I took a GEO microarray dataset that contains blood-derived microarray data for control and Parkinson's samples. I have found DEGs using limma and now want to use them for ML classification of control vs. Parkinson's samples.
That seems to be a sub-optimal way of approaching a problem. "Is ML useful here" should be the question. Anyway, like curious says, I'm sure there are a lot of people that have run classifiers on public datasets. Are you doing a toy project or a real one?
It's a real one; ML classification is the first step in it.
I think it is clear what you want to do. The thing is that Biostars is intended for answering specific technical questions rather than guiding you through a topic that you apparently have no background in. I suggest you dive into the available online resources, textbooks and courses at your institution and get a solid foundation first. You will rarely find users online who will provide an end-to-end workflow for you, especially given that you want to develop something on your own.
I don't want an end-to-end workflow. I'd be happy if someone could suggest a particular R package or blog to look at.
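For the workflow described above (DEGs as features for a control vs. Parkinson's classifier), here is a minimal sketch of one thing worth getting right: if the DEGs are picked using all samples and the classifier is then cross-validated on those same samples, the held-out samples have already influenced the feature list, and the accuracy estimate tends to be optimistic. One common remedy is to redo the gene selection inside every fold. The sketch is in plain Python rather than R so that it runs without any packages. Everything in it is an assumption for illustration: the data are synthetic, a Welch t-statistic stands in for limma's moderated t, a nearest-centroid rule stands in for whatever classifier you choose, and all function names are made up. In R, packages such as caret or glmnet are often used for the same pattern.

```python
import math
import random
import statistics


def welch_t(a, b):
    """Welch t-statistic for two lists of expression values (stand-in for limma)."""
    se = math.sqrt(statistics.variance(a) / len(a) + statistics.variance(b) / len(b))
    return 0.0 if se == 0 else (statistics.mean(a) - statistics.mean(b)) / se


def select_top_genes(X, y, k):
    """Return indices of the k genes with largest |t|, using only the samples given."""
    scores = []
    for g in range(len(X[0])):
        ctrl = [row[g] for row, label in zip(X, y) if label == 0]
        dis = [row[g] for row, label in zip(X, y) if label == 1]
        scores.append((abs(welch_t(ctrl, dis)), g))
    scores.sort(reverse=True)
    return [g for _, g in scores[:k]]


def centroid(X, y, label, genes):
    """Mean expression of the selected genes over samples with the given label."""
    rows = [row for row, lab in zip(X, y) if lab == label]
    return [statistics.mean(r[g] for r in rows) for g in genes]


def predict(X_train, y_train, x, genes):
    """Assign x to the class whose centroid is closest (squared Euclidean distance)."""
    best_label, best_dist = None, float("inf")
    for label in (0, 1):
        c = centroid(X_train, y_train, label, genes)
        d = sum((x[g] - ci) ** 2 for g, ci in zip(genes, c))
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


def loocv_accuracy(X, y, k):
    """Leave-one-out CV in which gene selection is repeated on each training fold."""
    correct = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        genes = select_top_genes(X_tr, y_tr, k)  # the held-out sample is never seen here
        correct += predict(X_tr, y_tr, X[i], genes) == y[i]
    return correct / len(X)


def make_toy_data(n_per_class=10, n_genes=50, n_informative=5, shift=3.0, seed=0):
    """Synthetic expression matrix: label 0 = control, label 1 = disease (made-up data)."""
    rng = random.Random(seed)
    X, y = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
            if label == 1:
                for g in range(n_informative):
                    row[g] += shift  # only the first few genes carry signal
            X.append(row)
            y.append(label)
    return X, y


X, y = make_toy_data()
acc = loocv_accuracy(X, y, k=5)
print(f"LOOCV accuracy with in-fold gene selection: {acc:.2f}")
```

With a real GEO dataset, `X` would be the samples-by-genes expression matrix and `y` the control/disease labels. The point of the design is only that `select_top_genes` is called on the training part of each split, never on the full dataset.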
A lot of people have done this kind of thing. One example I remember from a few years back is classifiers for benign vs. malignant thyroid tumors based on RNA-seq. This is a googleable thing.
Coincidentally, I have published in this area via a TCGA re-analysis, though not using machine learning: "Comprehensive transcriptomic analysis of papillary thyroid cancer: potential biomarkers associated with tumor progression".
Yeah, I tried to find the exact paper but couldn't. I think there is a group of companies that do this, though. The idea is something like this: the sample needed for histology is invasive to get, so they just take a little bit of RNA and try to classify the tumor as benign that way. Machine learning is so in vogue that I think people want to use it to check a buzzword box, but I remember thinking this application was kind of neat and made sense.