6.4 years ago
Ld_60
Hi everyone,
I am trying to filter out "unimportant" genes from an Affymetrix microarray dataset using gene filtering methods (filtering by coefficient of variation, by intensity values, etc.). As these methods require cutoff values (thresholds) to be specified, I was wondering how I can derive such cutoffs from the data, and whether I should consider the expression values before or after normalization?
Thank you very much for your help.
Is there a specific reason why you feel that filtering should even be performed?
Hi Kevin, thanks for your reply. Indeed, I think filtering the genes is quite important in my case as my goal is to perform classification (predicting recurrence/no recurrence outcomes among cancer patients). So, selecting "potential" genes through the filtering step would be beneficial both computation time-wise and classifier performance-wise.
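A minimal sketch of the kind of unsupervised filtering described above, with cutoffs taken from the data as percentiles rather than fixed by hand. Everything here is an assumption for illustration: the `filter_genes` and `percentile` helpers are hypothetical, the toy values stand in for already-normalized, linear-scale (not log-transformed) intensities, and the 50th/25th percentile choices are arbitrary placeholders, not recommended thresholds.

```python
import statistics


def percentile(values, q):
    """Percentile with linear interpolation; q is in [0, 100]."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no values given")
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def filter_genes(expr, cv_pct=50.0, intensity_pct=25.0):
    """Keep genes whose CV and mean intensity both reach data-derived cutoffs.

    expr maps a gene ID to its expression values across samples
    (assumed positive, linear scale, so that CV = sd / mean is meaningful).
    """
    means = {g: statistics.mean(v) for g, v in expr.items()}
    cvs = {g: statistics.stdev(v) / means[g] for g, v in expr.items()}

    # Cutoffs come from the distribution of the statistics themselves.
    cv_cut = percentile(cvs.values(), cv_pct)
    int_cut = percentile(means.values(), intensity_pct)

    kept = [g for g in expr if cvs[g] >= cv_cut and means[g] >= int_cut]
    return kept, cv_cut, int_cut


# Toy data (hypothetical values, four samples per gene).
expr = {
    "geneA": [100, 102, 98, 101],     # well expressed, nearly constant
    "geneB": [50, 200, 80, 300],      # well expressed, variable
    "geneC": [5, 6, 4, 5],            # very low intensity
    "geneD": [400, 900, 300, 1000],   # highly expressed, variable
}

kept, cv_cut, int_cut = filter_genes(expr)
print(kept, round(cv_cut, 3), round(int_cut, 3))
```

Because the filter uses only the expression values and never the recurrence labels, it does not look at the outcome being predicted; the percentile arguments are the knobs one would tune.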
I see. Yes, filtering is definitely important in that case. You may be interested in what I posted in the following threads:
Classification is a wide area, though, with tonnes of different approaches and methods. Many 'machine learning' methods now exist (I should re-phrase: most have existed for a long time but are only now getting focus), but I prefer the 'old faithful' of just going step by step and working through the data to identify the best predictors. In relation to this, you may be interested in what I wrote here:
Kevin
Many thanks for your response!