NUMERICAL missing at random
|
The algorithm replaces missing numerical values with the mean.
For Expectation Maximization (EM), the replacement only occurs in columns that are modeled with Gaussian distributions.
|
The algorithm handles missing values naturally as missing at random.
|
The algorithm interprets all missing data as sparse.
|
CATEGORICAL missing at random
|
Generalized Linear Models (GLM), Non-Negative Matrix Factorization (NMF), k-Means, and Support Vector Machine (SVM) replace missing categorical values with the mode.
Singular Value Decomposition (SVD) does not support categorical data.
EM does not replace missing categorical values. EM treats NULLs as a distinct value with its own frequency count.
|
The algorithm handles missing values naturally as missing at random.
|
The algorithm interprets all missing data as sparse.
|
NUMERICAL sparse
|
The algorithm replaces sparse numerical data with zeros.
|
O-Cluster does not support nested data and therefore does not support sparse data. Decision Tree (DT), Minimum Description Length (MDL), and Naive Bayes (NB) replace sparse numerical data with zeros.
|
The algorithm handles sparse data.
|
CATEGORICAL sparse
|
All algorithms except SVD replace sparse categorical data with zero vectors. SVD does not support categorical data.
|
O-Cluster does not support nested data and therefore does not support sparse data. DT, MDL, and NB replace sparse categorical data with the special value DM$SPARSE.
|
The algorithm handles sparse data.
|
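The replacement rules in the table above (mean for missing numerical values, mode for missing categorical values, zeros for sparse numerical data) can be illustrated with a minimal Python sketch. This is not the product's implementation: the function names, the use of None to mark a missing value, and the dictionary representation of a sparse row are all assumptions made for illustration only.

```python
from statistics import mean, mode


def impute_numerical_mean(values):
    """Replace None entries with the mean of the observed values.

    Hypothetical helper: None stands in for a missing (NULL) value.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        # Nothing to compute a mean from; return the column unchanged.
        return list(values)
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def impute_categorical_mode(values):
    """Replace None entries with the most frequent observed value."""
    observed = [v for v in values if v is not None]
    if not observed:
        return list(values)
    fill = mode(observed)
    return [fill if v is None else v for v in values]


def densify_sparse_numerical(row, attributes):
    """Treat attributes absent from a sparse row as zero.

    Hypothetical representation: a sparse row is a dict holding only
    the attributes that are actually present.
    """
    return [row.get(name, 0) for name in attributes]
```

For example, impute_numerical_mean([1.0, None, 3.0]) fills the gap with the mean of 1.0 and 3.0, while densify_sparse_numerical({"x": 2}, ["x", "y"]) fills the absent attribute y with zero.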