Best way to use champ.impute or impute.knn - neighbor approach?
7.1 years ago
Mathias ▴ 90

Hi all, I'm new to both the forums and methylation analysis.

I'm currently exploring the functions of ChAMP, which seems very useful. I want to impute some missing values in my dataset, which consists of a beta value matrix and an additional sample sheet, and I'd like to know how champ.impute() works. From the documentation, these seem to be the important arguments: pd=myLoad$pd, k=5, method="Combine".

Looking at the source of champ.impute() at https://github.com/Bioconductor-mirror/ChAMP/blob/master/R/champ.impute.R, I gather that myLoad$pd is updated after removing 'valid columns', so the rows of my sample sheet need to be in the same order as the columns of my beta value matrix. k sets the number of neighbours impute.knn() will use if method is "Combine" or "KNN".
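To make the ordering requirement concrete, here is a minimal base R sketch of how the sample sheet could be reordered to follow the beta matrix columns before imputation. The toy data and the column names `Sample_Name` and `Sample_Group` are assumptions for illustration; a real sample sheet may use different names.

```r
# Toy data: 4 probes x 3 samples; all names are made up for illustration.
beta <- matrix(c(0.1, 0.8, NA, 0.5,
                 0.2, 0.7, 0.4, NA,
                 0.3, 0.9, 0.5, 0.6),
               nrow = 4,
               dimnames = list(paste0("cg", 1:4), c("S1", "S2", "S3")))

# Sample sheet deliberately in a different order than the beta columns.
pd <- data.frame(Sample_Name = c("S3", "S1", "S2"),
                 Sample_Group = c("case", "control", "case"),
                 stringsAsFactors = FALSE)

# Reorder the sample sheet so row i describes column i of beta.
idx <- match(colnames(beta), pd$Sample_Name)
stopifnot(!anyNA(idx))  # every beta column must have a sample sheet row
pd_aligned <- pd[idx, , drop = FALSE]
rownames(pd_aligned) <- NULL

# TRUE when the sample sheet rows and beta columns are in the same order.
aligned <- identical(pd_aligned$Sample_Name, colnames(beta))
```

Running a check like this before calling champ.impute() should catch a mismatch early, instead of silently pairing samples with the wrong phenotype rows.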

Digging deeper into impute.knn(), the documentation says: "For each gene with missing values, we find the k nearest neighbors using a Euclidean metric."
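To check my understanding of that sentence, here is a rough base R sketch of row-wise kNN imputation. It is a simplification written for illustration, not the actual impute.knn() code, and the function name `knn_impute_rows` and the toy matrix are hypothetical. As I read the documentation, the neighbors are other rows (genes, so probes in a beta matrix), with distances computed over the columns where the target row is observed.

```r
# Simplified row-wise kNN imputation (illustrative, not impute.knn itself).
knn_impute_rows <- function(x, k = 2) {
  out <- x
  for (i in which(rowSums(is.na(x)) > 0)) {
    miss <- is.na(x[i, ])
    obs <- !miss
    if (!any(obs)) next  # nothing to compare on; leave the row untouched
    # Mean squared difference over the target row's observed columns,
    # ignoring columns that the candidate row is itself missing.
    d <- apply(x[, obs, drop = FALSE], 1,
               function(r) mean((r - x[i, obs])^2, na.rm = TRUE))
    d[i] <- Inf           # a row is never its own neighbor
    d[is.nan(d)] <- Inf   # candidates with no overlapping columns
    nn <- order(d)[seq_len(min(k, nrow(x) - 1))]
    # Fill each missing entry with the mean of the neighbors' values.
    out[i, miss] <- colMeans(x[nn, miss, drop = FALSE], na.rm = TRUE)
  }
  out
}

# Toy beta matrix: cg1 is missing its value in sample S4.
beta <- rbind(cg1 = c(0.10, 0.20, 0.30, NA),
              cg2 = c(0.12, 0.22, 0.28, 0.40),
              cg3 = c(0.11, 0.19, 0.31, 0.44),
              cg4 = c(0.90, 0.85, 0.80, 0.10))
colnames(beta) <- paste0("S", 1:4)

imputed <- knn_impute_rows(beta, k = 2)
```

In this toy case cg2 and cg3 are the two rows closest to cg1, so the missing value is filled with the mean of their S4 values. No sample group information enters the calculation at any point.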

So the neighbors are computed from the data, not taken from a 'phenotype' or 'sample_group' column that I specify in myLoad$pd.

Is there a way to impute using neighbors from the same sample group using ChAMP? Would this be biologically sound, or is the calculated neighbor approach better?
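To illustrate what I mean by imputing within a sample group, here is a minimal base R sketch that splits the beta matrix by group and imputes each block separately. The helper names `impute_within_groups` and `row_mean_impute` are hypothetical, and the row-mean imputer is only a stand-in so the example runs without extra packages; this is not something ChAMP provides as far as I can tell.

```r
# Apply an imputation function separately to each group's columns.
impute_within_groups <- function(beta, groups, impute_fun) {
  stopifnot(length(groups) == ncol(beta))
  out <- beta
  for (g in unique(groups)) {
    cols <- which(groups == g)
    out[, cols] <- impute_fun(beta[, cols, drop = FALSE])
  }
  out
}

# Stand-in imputer: replace each NA with the mean of its row.
# A kNN-based imputer could be passed instead, e.g. (assuming the impute
# package is installed): function(m) impute::impute.knn(m, k = 5)$data
row_mean_impute <- function(m) {
  mu <- rowMeans(m, na.rm = TRUE)
  na_pos <- which(is.na(m), arr.ind = TRUE)
  m[na_pos] <- mu[na_pos[, "row"]]
  m
}

# Toy data: two control samples followed by two case samples.
beta <- rbind(cg1 = c(0.10, NA, 0.80, 0.90),
              cg2 = c(0.20, 0.30, NA, 0.70))
colnames(beta) <- paste0("S", 1:4)
groups <- c("control", "control", "case", "case")

res <- impute_within_groups(beta, groups, row_mean_impute)
```

Here the missing cg1 value in S2 is filled only from the other control sample, and the missing cg2 value in S3 only from the other case sample. One thing I am unsure about is whether using the group label during imputation could make the groups look more distinct than they really are.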

ChAMP impute R • 1.9k views