A new version of dials is on CRAN. The package contains basic frameworks for managing tuning parameters for models. This release is a significant update to the package. The major change is that parameter objects are now generated by functions (as opposed to the prototype objects in the previous version). For example, to make a dials object for the number of PCA components in a model:
# previously
pca_comps <- num_comp
# now
pca_comps <- num_comp()
For numeric parameters, the range of values can be set using the first argument:
library(tidymodels)
## ── Attaching packages ──────────────────────────────────────── tidymodels 0.0.2 ──
## ✔ broom 0.5.2 ✔ purrr 0.3.2
## ✔ dials 0.0.3 ✔ recipes 0.1.7
## ✔ dplyr 0.8.3 ✔ rsample 0.0.5
## ✔ ggplot2 3.2.1 ✔ tibble 2.1.3
## ✔ infer 0.4.0.1 ✔ yardstick 0.0.4
## ✔ parsnip 0.0.3.1
## ── Conflicts ─────────────────────────────────────────── tidymodels_conflicts() ──
## ✖ purrr::discard() masks scales::discard()
## ✖ dplyr::filter() masks stats::filter()
## ✖ dplyr::lag() masks stats::lag()
## ✖ ggplot2::margin() masks dials::margin()
## ✖ dials::offset() masks stats::offset()
## ✖ recipes::step() masks stats::step()
num_comp()
## # Components (quantitative)
## Range: [1, ?]
num_comp(range = c(2, 10))
## # Components (quantitative)
## Range: [2, 10]
Sets of tuning parameters can be created and managed:
boosting_set <- param_set(list(trees(), splits = tree_depth(), min_n()))
boosting_set
## Collection of 3 parameters for tuning
##
## id parameter type object class
## trees trees nparam[+]
## splits tree_depth nparam[+]
## min_n min_n nparam[+]
# modifying the parameter range:
boosting_set %>% update(trees = trees(c(100, 1000)))
## Collection of 3 parameters for tuning
##
## id parameter type object class
## trees trees nparam[+]
## splits tree_depth nparam[+]
## min_n min_n nparam[+]
Note that the tree depth parameter has a user-defined id (splits). This can come in handy when there are multiple tuning parameters of the same type. For example, suppose two variables (x1 and x2) were modeled using splines. The flexibility of each could be represented in a parameter set:
splines <- param_set(list(x1_df = deg_free(), x2_df = deg_free()))
splines
## Collection of 2 parameters for tuning
##
## id parameter type object class
## x1_df deg_free nparam[+]
## x2_df deg_free nparam[+]
This version of dials also contains two functions for creating space-filling designs, a technique from statistical experimental design theory. The two functions are grid_max_entropy() and grid_latin_hypercube().
svm_set <- param_set(list(rbf_sigma(), cost()))
set.seed(463)
me_grid <- grid_max_entropy(svm_set, size = 20) %>% mutate(type = "max entropy")
ls_grid <- grid_latin_hypercube(svm_set, size = 20) %>% mutate(type = "latin hypercube")
rn_grid <- grid_random(svm_set, size = 20) %>% mutate(type = "random")
bind_rows(me_grid, ls_grid, rn_grid) %>%
  ggplot(aes(x = cost, y = rbf_sigma)) +
  geom_point() +
  facet_wrap( ~ type) +
  scale_x_log10() +
  scale_y_log10() +
  coord_fixed(ratio = 1/4)
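To give a feel for what a Latin hypercube design is doing, here is a minimal base R sketch of the general idea. It is not how grid_latin_hypercube() is implemented; the helper latin_hypercube_unit() and the log10 ranges below are hypothetical and chosen only for illustration.

```r
# Illustrative only: a basic Latin hypercube sample on the unit cube.
# `latin_hypercube_unit()` is a hypothetical helper, not part of dials.
latin_hypercube_unit <- function(n, d) {
  # For each dimension, shuffle the n strata and draw one uniform value
  # inside each stratum ((k - 1) / n, k / n).
  vapply(
    seq_len(d),
    function(j) (sample(n) - runif(n)) / n,
    numeric(n)
  )
}

set.seed(463)
n <- 20
unit_design <- latin_hypercube_unit(n, 2)

# Map the unit-cube columns onto log10 ranges and back-transform.
# These ranges are assumptions for the sketch, not the dials defaults.
rescale <- function(u, lower, upper) lower + u * (upper - lower)
lh_sketch <- data.frame(
  cost      = 10^rescale(unit_design[, 1], -3, 1),
  rbf_sigma = 10^rescale(unit_design[, 2], -5, 0)
)

# Each column lands in every one of the n strata exactly once.
strata <- apply(unit_design, 2, function(u) sort(ceiling(u * n)))
```

Because every stratum of each parameter is used exactly once, the points cannot all cluster in one region the way a purely random grid sometimes does, which is the behavior the comparison plot is meant to show.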
dials will be central to the upcoming framework for optimizing tuning parameters, so there is much more to come regarding this package.