Best order of imputation and normalization in data preprocessing

4.9 years ago

david.f.stein ▴ 10

I am building a logistic regression classifier using scikit-learn. I have some continuous data with missing values that I would like to impute. Is it considered better practice to impute before or after normalization? I have tried both and have not noticed a difference in my model's performance. However, a colleague suggested that imputation should be performed first, and I can understand their intuition. Does anyone have any insight on this matter? Any literature on this would also be appreciated.
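For concreteness, the two orderings can be sketched roughly as below. This is only a minimal illustration, not the actual setup: the synthetic data, the 10% missingness rate, mean imputation via `SimpleImputer`, standardization via `StandardScaler`, and the names `impute_first` and `scale_first` are all assumptions made for the example.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): 200 samples, 5 continuous features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

# Knock out roughly 10% of the entries at random (assumption).
X_missing = X.copy()
X_missing[rng.random(X.shape) < 0.1] = np.nan

# Order A: impute, then standardize.
impute_first = make_pipeline(
    SimpleImputer(strategy="mean"), StandardScaler(), LogisticRegression()
)
# Order B: standardize, then impute. StandardScaler ignores NaNs when
# fitting and leaves them in place when transforming.
scale_first = make_pipeline(
    StandardScaler(), SimpleImputer(strategy="mean"), LogisticRegression()
)

# Cross-validate both pipelines so every step is fit on training folds only.
scores = {}
for name, pipe in [("impute_first", impute_first), ("scale_first", scale_first)]:
    scores[name] = cross_val_score(pipe, X_missing, y, cv=5).mean()
    print(name, round(scores[name], 3))
```

Wrapping both steps in a pipeline keeps the imputation and scaling statistics from being computed on the held-out folds. With mean imputation specifically, the two orderings differ only by a per-feature rescaling, which may be why the scores look so similar; distance-based imputers could be more sensitive to the order.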

Thanks!

gene genome next-gen • 1.5k views

updated 4.8 years ago by Biostar 20 • written 4.9 years ago by david.f.stein ▴ 10
