5 weeks ago
Ahmed
Hi all, I am building a supervised model on DNA methylation data for liver cancer. A common practice in feature selection is to remove highly correlated columns. However, I am concerned that doing this with methylation data could discard important features, which may lead to incorrect or biased predictions.
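For concreteness, here is a minimal sketch of the kind of correlation filter described above. It is not taken from any particular pipeline: the function name `correlated_filter`, the 0.9 threshold, and the synthetic data are all assumptions made for illustration, and it assumes samples in rows, features (e.g. CpG sites) in columns, and no constant columns.

```python
import numpy as np

def correlated_filter(X, threshold=0.9):
    """Greedy filter (hypothetical helper): keep a column only if its
    absolute Pearson correlation with every previously kept column
    is at or below `threshold`. Returns the indices of kept columns."""
    # Feature-by-feature correlation matrix (columns are features).
    corr = np.abs(np.corrcoef(X, rowvar=False))
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return kept

# Synthetic stand-in for methylation values; not real data.
rng = np.random.default_rng(0)
base = rng.uniform(0, 1, size=(50, 3))              # 3 independent columns
noisy_copy = base[:, 0] + rng.normal(0, 0.01, 50)   # near-duplicate of column 0
X = np.column_stack([base, noisy_copy])

kept = correlated_filter(X, threshold=0.9)
print(kept)
```

Note that which member of a correlated pair survives depends only on column order here, not on relevance to the outcome, which is exactly the source of my concern.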
You will need to provide a bit more information about what you are doing and what specifically you want help with.
What platform is your data from (WGBS, methylation array, etc.)?
What structure is your data in - what are the rows / columns?
What are you correlating and why?
What is your model supposed to be used for?