
Maximum information preservable by supervised ordinal lumping
Source: R/supervised.R
maximum_mutual_information_ordinal_supervised.Rd

Calculates the lumping of an ordinal categorical covariate that preserves the maximum mutual information between the lumped covariate and a discrete outcome variable.
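
To illustrate the quantity being preserved, here is a minimal base-R sketch of the plug-in mutual information between a discrete covariate and a discrete outcome, before and after one candidate lumping. The helper `mutual_information()`, the toy data, and the use of natural logarithms are assumptions made for illustration; they are not taken from the package and do not show how the function is implemented.

```r
# Hypothetical helper, not part of the package: plug-in mutual information
# (in nats, an assumption) between two discrete vectors of equal length.
mutual_information <- function(x, y) {
  joint <- table(x, y) / length(x)    # empirical joint distribution
  px <- rowSums(joint)                # marginal distribution of x
  py <- colSums(joint)                # marginal distribution of y
  nz <- joint > 0                     # skip empty cells (0 * log 0 treated as 0)
  sum(joint[nz] * log(joint[nz] / outer(px, py)[nz]))
}

# Toy data: an ordinal covariate with four levels and a binary outcome.
x <- c(1, 1, 2, 2, 3, 3, 4, 4)
y <- c(0, 0, 0, 0, 1, 1, 1, 1)

# One candidate ordinal lumping: merge levels {1, 2} and {3, 4}.
lumped <- ifelse(x <= 2, "low", "high")

mi_full   <- mutual_information(x, y)       # information in the unlumped covariate
mi_lumped <- mutual_information(lumped, y)  # information kept after lumping
```

In this toy case the outcome depends only on whether `x` is at most 2, so the two-level lumping keeps all of the information about `y` that the four-level covariate carries.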
Value
A list containing information about the optimal lumping:
- mutual_information
Double representing the mutual information between the lumped and unlumped variables.
- loss
Double representing the amount of entropy lost in the lumping process.
- lumping
Integer vector containing, in order, the points at which the lumped levels are separated. Each lumped level includes its lower bound and excludes its upper bound, so that if a_1, ..., a_k is returned, the lumped levels correspond to the levels [a_1, a_2), ..., [a_(k-1), a_k).
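
As a sketch of how such a breakpoint vector can be read, the base-R example below maps original levels to lumped levels with `findInterval()`, whose default intervals are closed below and open above. The breakpoint values are hypothetical, and the example assumes the original levels are coded as the integers 1, 2, ... with the final breakpoint one past the last level; the package's actual encoding may differ.

```r
# Hypothetical breakpoints a_1, ..., a_k (not output from the package).
# They describe the intervals [1, 3), [3, 4), [4, 6).
lumping <- c(1L, 3L, 4L, 6L)

# Original ordinal levels, assumed to be coded as integers 1..5.
x <- 1:5

# findInterval() assigns each value to the interval [a_i, a_(i+1)) containing it.
lumped <- findInterval(x, lumping)

# Levels 1 and 2 fall in group 1, level 3 in group 2, levels 4 and 5 in group 3.
print(data.frame(level = x, lumped = lumped))
```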
See also
maximum_mutual_information_ordinal() for the unsupervised version.
maximum_mutual_information_ordinal_supervised_continuous() for a version that accepts a continuous outcome.