Perform imputation of a data frame using k-NN.
knn.impute.Rd
Impute missing values in a data frame using the k-Nearest Neighbour algorithm. A missing entry of a discrete variable is filled with the mode of the neighbours' values; a missing entry of a continuous variable is filled with their median.
Arguments
- data
a numerical matrix.
- k
number of neighbours to use. For categorical variables the imputed value is the mode of the neighbours' values; for continuous variables it is their median. Default: 10.
- cat.var
vector of column indices of the variables to be treated as categorical. Default: all variables.
- to.impute
vector indicating which rows of the dataset are to be imputed. Default: impute all rows.
- using
vector indicating which rows of the dataset are to be used to search for neighbours. Default: use all rows.
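The mode/median rule described above can be illustrated with a small base-R sketch. This is not the implementation behind knn.impute: the helper names `stat_mode` and `impute_row_sketch` are hypothetical, and the distance used here (mean absolute difference over the columns observed in both rows) is an assumption made only to keep the example short. The `k`, `cat.var` and `using` parameters mirror the arguments documented above.

```r
# Hypothetical helper: most frequent non-missing value of a vector.
stat_mode <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA)
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

# Sketch: impute the missing entries of one row from its k nearest rows.
# ASSUMPTION: the distance is the mean absolute difference over columns
# observed in both rows; the real function may use a different metric.
impute_row_sketch <- function(data, row, k, cat.var,
                              using = seq_len(nrow(data))) {
  target <- data[row, ]
  candidates <- setdiff(using, row)
  d <- sapply(candidates, function(i) {
    both <- !is.na(target) & !is.na(data[i, ])
    if (!any(both)) return(Inf)  # nothing to compare: treat as farthest
    mean(abs(target[both] - data[i, both]))
  })
  nn <- candidates[order(d)][seq_len(min(k, length(candidates)))]
  for (j in which(is.na(target))) {
    vals <- data[nn, j]
    # Mode for categorical columns, median for continuous ones.
    target[j] <- if (j %in% cat.var) stat_mode(vals) else median(vals, na.rm = TRUE)
  }
  target
}

# Toy matrix: column 1 is categorical, column 2 is continuous.
toy <- rbind(
  c(1, 10.0),
  c(1, 11.0),
  c(2, 30.0),
  c(1, NA),
  c(NA, 10.5)
)

# Row 4 lacks the continuous value: its two nearest rows hold 10 and 11,
# so the imputed value is their median, 10.5.
print(impute_row_sketch(toy, row = 4, k = 2, cat.var = 1))

# Row 5 lacks the categorical value: both nearest rows hold 1, so the mode is 1.
print(impute_row_sketch(toy, row = 5, k = 2, cat.var = 1))

# Not run (requires the package that provides knn.impute; results may
# differ from the sketch because the distance differs):
# knn.impute(toy, k = 2, cat.var = 1)
```

Restricting `using` to a subset of rows limits where neighbours are searched, which corresponds to the role the `using` argument is described as having above.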