Hi all, I'm new both to the forums and to methylation analysis.
I'm currently exploring ChAMP, which seems very useful. I want to impute some missing values in my dataset, which consists of a beta value matrix and a sample sheet, and I'd like to know how champ.impute() works. From the documentation, I guess these are the important arguments: pd=myLoad$pd, k=5, method="Combine".
Looking at the source of champ.impute at https://github.com/Bioconductor-mirror/ChAMP/blob/master/R/champ.impute.R, I gather that myLoad$pd is updated after columns are removed, so the rows of pd need to be in the same order as the columns of my beta value matrix. k sets the number of neighbours impute.knn will use if method is "Combine" or KNN.
Digging deeper into impute.knn(), the documentation says: "For each gene with missing values, we find the k nearest neighbors using a Euclidean metric."
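To check my understanding, here is a toy base-R sketch of that idea. It is not the actual impute.knn code, and knn_impute_rows is a name I made up. As I understand it, the "genes" are the rows of the matrix (probes, in my case), so the neighbours are other rows with a similar profile across samples, and distances are computed only over the columns where the row being imputed is observed.

```r
# Toy sketch of row-wise kNN imputation; NOT the impute package's implementation.
# x: numeric matrix with probes in rows and samples in columns.
knn_impute_rows <- function(x, k = 2) {
  out <- x
  for (i in which(rowSums(is.na(x)) > 0)) {
    obs <- !is.na(x[i, ])  # columns where row i is observed
    # Candidate neighbours: other rows that are complete in those columns
    cand <- setdiff(which(rowSums(is.na(x[, obs, drop = FALSE])) == 0), i)
    if (length(cand) == 0) next  # nothing to borrow from
    # Euclidean distance to each candidate over the observed columns
    d <- sapply(cand, function(j) sqrt(sum((x[j, obs] - x[i, obs])^2)))
    nn <- cand[order(d)][seq_len(min(k, length(cand)))]
    # Fill each missing cell with the mean of the neighbours in that column
    for (j in which(!obs)) {
      out[i, j] <- mean(x[nn, j], na.rm = TRUE)
    }
  }
  out
}

# Made-up beta values: cg1 is missing in sample S3
beta <- rbind(
  cg1 = c(0.10, 0.20, NA,   0.40),
  cg2 = c(0.10, 0.20, 0.30, 0.40),
  cg3 = c(0.12, 0.22, 0.34, 0.42),
  cg4 = c(0.90, 0.80, 0.70, 0.60)
)
colnames(beta) <- paste0("S", 1:4)

imputed <- knn_impute_rows(beta, k = 2)
# cg2 and cg3 are the two closest rows to cg1, so cg1/S3 becomes
# the mean of 0.30 and 0.34, i.e. 0.32
print(imputed["cg1", "S3"])
```

In this toy example the sample sheet plays no role at all: the neighbours are chosen purely from the values in the matrix.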
So the neighbors are computed from the data itself, not taken from a 'phenotype' or 'sample_group' column that I specify in myLoad$pd.
Is there a way to impute using neighbors from the same sample group using ChAMP? Would this be biologically sound, or is the calculated neighbor approach better?
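To make the question concrete, this is roughly what I have in mind: split the columns by sample group and impute each block separately. It is only a sketch under my own assumptions. impute_by_group, row_mean_imputer and knn_imputer are names I made up, the beta values are invented, and the row-mean imputer is just a stand-in so the example runs without Bioconductor installed.

```r
# Hypothetical helper: impute each sample group's columns on their own.
# beta: probes x samples matrix; groups: one label per column of beta;
# impute_fun: any function that takes a matrix and returns it imputed.
impute_by_group <- function(beta, groups, impute_fun) {
  stopifnot(ncol(beta) == length(groups))
  out <- beta
  for (g in unique(groups)) {
    cols <- which(groups == g)
    out[, cols] <- impute_fun(beta[, cols, drop = FALSE])
  }
  out
}

# Stand-in imputer for the demo: replace each NA with its row mean
row_mean_imputer <- function(m) {
  mu <- rowMeans(m, na.rm = TRUE)
  idx <- which(is.na(m), arr.ind = TRUE)
  m[idx] <- mu[idx[, "row"]]
  m
}

# With the Bioconductor 'impute' package installed, this could be passed
# instead (defined here but not called in the demo):
knn_imputer <- function(m) impute::impute.knn(m, k = 5)$data

# Made-up data: three samples in group A, three in group B
beta <- rbind(
  cg1 = c(0.10, 0.20, NA,   0.90, 0.80, 0.70),
  cg2 = c(0.50, 0.55, 0.60, 0.45, 0.50, 0.40)
)
colnames(beta) <- paste0("S", 1:6)
groups <- c("A", "A", "A", "B", "B", "B")

by_group <- impute_by_group(beta, groups, row_mean_imputer)
# cg1/S3 is filled from group A only: mean of 0.10 and 0.20, i.e. 0.15
print(by_group["cg1", "S3"])
```

With my real data I would presumably call something like impute_by_group(myLoad$beta, myLoad$pd$Sample_Group, knn_imputer), assuming the group column is called Sample_Group. One thing I am unsure about is whether small groups leave too few columns for the distance calculation to be meaningful.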