We’re tickled pink to announce the release of desirability2 (version 0.0.1). You can install it from CRAN with:
install.packages("desirability2")
This blog post will introduce you to the package and desirability functions.
Let’s load some packages!
library(desirability2)
library(dplyr)
library(ggplot2)
Desirability functions are tools that can be used to rank or optimize multiple characteristics at once. They are intuitive and easy to use. There are a few R packages that implement them, including desirability and desiR.
We have a new one, desirability2, whose interface is designed to be used in-line within dplyr pipelines.
Let’s demonstrate that by looking at an application. Suppose we created a classification model and produced multiple metrics that measure how well it classifies new data. In this example, we measured the area under the ROC curve and the binomial log-loss. We investigated 298 different model configurations via tuning.
The results from the tuning process were:
classification_results
## # A tibble: 298 × 5
##    mixture penalty mn_log_loss roc_auc num_features
##      <dbl>   <dbl>       <dbl>   <dbl>        <int>
##  1       0  0.1          0.199   0.869          211
##  2       0  0.0788       0.196   0.870          211
##  3       0  0.0621       0.194   0.871          211
##  4       0  0.0489       0.192   0.872          211
##  5       0  0.0386       0.191   0.873          211
##  6       0  0.0304       0.190   0.873          211
##  7       0  0.0240       0.188   0.874          211
##  8       0  0.0189       0.188   0.874          211
##  9       0  0.0149       0.187   0.874          211
## 10       0  0.0117       0.186   0.874          211
## # ℹ 288 more rows
If we were interested in the best area under the ROC curve:
classification_results |> slice_max(roc_auc, n = 1)
## # A tibble: 1 × 5
##   mixture penalty mn_log_loss roc_auc num_features
##     <dbl>   <dbl>       <dbl>   <dbl>        <int>
## 1   0.222 0.00574       0.185   0.876           86
However, there are different optimal settings when the log-loss is considered:
classification_results |> slice_min(mn_log_loss, n = 1)
## # A tibble: 1 × 5
##   mixture  penalty mn_log_loss roc_auc num_features
##     <dbl>    <dbl>       <dbl>   <dbl>        <int>
## 1       1 0.000853       0.184   0.876          103
Are the two metrics related? Here’s a plot of the data:
classification_results |>
ggplot(aes(roc_auc, mn_log_loss, col = num_features)) +
geom_point(alpha = 1/2)
We colored the points by the number of features used in the model. Fewer predictors are better; we’d like to factor that into the tuning parameter selection.
To optimize them all at once, desirability functions map each metric’s values to be between zero and one (with one being the most desirable). For the ROC scores, a value of 1.0 is best, and we may not want to consider a model with an AUC of less than 0.80. We can use desirability2’s d_max() function to translate these values to desirability:
classification_results |>
  mutate(roc_d = d_max(roc_auc, high = 1, low = 0.8)) |>
  ggplot(aes(roc_auc, roc_d)) +
  geom_line() +
  geom_point() +
  lims(y = 0:1)
Note that all model configurations with ROC AUC scores below 0.80 have zero desirability.
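Under the hood, this follows the classic Derringer–Suich approach to desirability. As a mental model only, here is a minimal sketch of a "larger is better" curve, assuming the standard linear form (illustrative, not desirability2's actual source):

# Sketch of a "larger is better" desirability curve, assuming the standard
# Derringer-Suich linear form (illustrative only, not desirability2's code).
d_max_sketch <- function(x, low, high, scale = 1) {
  # Rescale to [0, 1], clip values outside the [low, high] window, then
  # apply the power transformation.
  z <- pmin(pmax((x - low) / (high - low), 0), 1)
  z^scale
}

d_max_sketch(c(0.75, 0.85, 1.0), low = 0.8, high = 1)
# 0.00 0.25 1.00: anything at or below `low` is completely undesirable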
Since we want to minimize the log-loss, we can use d_min() to create a curve where smaller values are better. For this specification, we’ll use the minimum and maximum values found in the data by setting use_data = TRUE:
classification_results |>
  mutate(
    roc_d = d_max(roc_auc, high = 1, low = 0.8),
    loss_d = d_min(mn_log_loss, use_data = TRUE)
  ) |>
  ggplot(aes(mn_log_loss, loss_d)) +
  geom_line() +
  geom_point() +
  lims(y = 0:1)
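As a quick sanity check, use_data = TRUE should be equivalent to passing the observed range explicitly. This is an assumption about its behavior (that it simply plugs in the minimum and maximum of the input), so it is worth verifying on your own data:

# Hypothetical equivalence check: does use_data = TRUE match an explicit range?
x <- classification_results$mn_log_loss
all.equal(
  d_min(x, use_data = TRUE),
  d_min(x, low = min(x), high = max(x))
)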
Finally, we can factor in the number of features. Arguably, this criterion is more important than the other two; we’ll make its curve nonlinear so that it becomes harder for a configuration to be desirable as the number of features increases. For this, we’ll use the scale argument to d_min(), where values larger than one make the criterion more difficult to satisfy:
classification_results |>
  mutate(
    roc_d = d_max(roc_auc, high = 1, low = 0.8),
    loss_d = d_min(mn_log_loss, use_data = TRUE),
    feat_d = d_min(num_features, low = 0, high = 100, scale = 2)
  ) |>
  ggplot(aes(num_features, feat_d)) +
  geom_line() +
  geom_point() +
  lims(y = 0:1)
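To get a feel for what scale = 2 does, compare a few feature counts by hand. The exact numbers below assume the standard power-transformed linear form, so treat them as illustrative:

# 25, 50, and 75 features, with and without the power transformation.
d_min(c(25, 50, 75), low = 0, high = 100)             # linear: 0.75, 0.50, 0.25
d_min(c(25, 50, 75), low = 0, high = 100, scale = 2)  # squared: ~0.56, 0.25, ~0.06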
It is common to combine these components into a single criterion using their geometric mean. Since the geometric mean multiplies the values, it has the side effect that a configuration with zero desirability on any criterion has an overall desirability of zero.
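To see that veto effect concretely, here is the geometric mean computed by hand in base R (independent of the package):

# The geometric mean is the n-th root of the product of the values.
geo_mean <- function(d) prod(d)^(1 / length(d))
geo_mean(c(0.6, 0.9, 0.8))  # ~0.76
geo_mean(c(0.6, 0.9, 0))    # exactly 0: one hopeless criterion vetoes the configuration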
The d_overall() function computes this overall score and can be used with dplyr’s across() function. Sorting by overall desirability gives us the tuning parameter values (mixture and penalty) that are best for this combination of criteria.
classification_results |>
  mutate(
    roc_d = d_max(roc_auc, high = 1, low = 0.8),
    loss_d = d_min(mn_log_loss, use_data = TRUE),
    feat_d = d_min(num_features, low = 0, high = 100, scale = 2),
    overall = d_overall(across(ends_with("_d")))
  ) |>
  slice_max(overall, n = 5)
## # A tibble: 5 × 9
##   mixture penalty mn_log_loss roc_auc num_features roc_d loss_d feat_d overall
##     <dbl>   <dbl>       <dbl>   <dbl>        <int> <dbl>  <dbl>  <dbl>   <dbl>
## 1   1     0.00924       0.200   0.859           15 0.295  0.815  0.722   0.558
## 2   0.667 0.0117        0.199   0.862           18 0.311  0.827  0.672   0.557
## 3   0.667 0.0149        0.201   0.858           14 0.291  0.802  0.740   0.557
## 4   0.889 0.00924       0.199   0.861           18 0.305  0.825  0.672   0.553
## 5   0.889 0.0117        0.201   0.857           14 0.285  0.801  0.740   0.553
That’s it! That’s the package.