Calculate the entropy of a vector
entropy(v, pseudo_count = length(v)/1000, na.action = na.fail)
np(v, p = 0.05, pseudo_count = p/5, na.action = na.fail)
modal_category(v, na.action = na.fail)
v: categorical vector
pseudo_count: number of pseudo counts to add on, to stabilize empty categories
na.action: how to handle NA values
p: proportion threshold
entropy: The sample entropy of v.
np: The number of categories whose proportion of the total exceeds p.
modal_category: The modal category of v. Ties are broken by lexicographic order of the factor levels.
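To make the role of pseudo_count concrete, here is a minimal sketch of how a smoothed sample entropy could be computed. It is not the package's source: the name entropy_sketch is hypothetical, and it assumes the pseudo count is added to every level before normalizing and that the natural logarithm is used (the help text states neither).

```r
# Hypothetical re-implementation for illustration only.
entropy_sketch <- function(v, pseudo_count = length(v) / 1000) {
  # table() keeps empty factor levels, so they receive the pseudo count too
  counts <- table(v) + pseudo_count
  probs <- counts / sum(counts)
  # Natural log is an assumption; the documented base is not given
  -sum(probs * log(probs))
}

v2 <- gl(2, 4)
v_empty <- v2[1:4]  # level 2 has no observations

entropy_sketch(v2)                            # two equally likely categories
entropy_sketch(v_empty)                       # finite, because of the pseudo count
entropy_sketch(v_empty, pseudo_count = 0)     # NaN: 0 * log(0) is undefined
```

Under these assumptions, a pseudo count of zero leaves an empty level with probability 0, and 0 * log(0) evaluates to NaN in R, which is why a small positive default is useful.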
v2 = gl(2, 4)
v4 = gl(4, 4)
stopifnot(entropy(v2) < entropy(v4))
v_empty = v2[1:4] # level 2 is now empty
stopifnot(is.finite(entropy(v_empty))) # stays finite thanks to pseudo_count
np(v4, p = .2, pseudo_count = 0)
#> [1] 4
np(v4, p = .25, pseudo_count = 0)
#> [1] 0
np(v4, p = .25, pseudo_count = .0001)
#> [1] 4
modal_category(v4)
#> [1] "1"
modal_category(v4[-1])
#> [1] "2"
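The tie-breaking rule for modal_category can be illustrated with a short sketch. The function name modal_category_sketch is hypothetical and this is not the package's implementation; it assumes the factor levels are stored in sorted order, as they are for gl(), so that taking the first maximum matches lexicographic order.

```r
# Hypothetical sketch of the documented tie-breaking behaviour.
modal_category_sketch <- function(v) {
  counts <- table(v)
  # which.max() returns the first maximum, i.e. the earliest level among ties
  names(counts)[which.max(counts)]
}

v4 <- gl(4, 4)
modal_category_sketch(v4)      # all four levels tie, so the first level wins
modal_category_sketch(v4[-1])  # level 1 now has 3 observations; levels 2 to 4 tie
```

With v4, every level has four observations, so the first level is returned; dropping one observation of level 1 leaves levels 2 through 4 tied, and the earliest of those is returned.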