The tidymodels team recently released new versions of the tune, finetune, and workflowsets packages, and we’re super stoked about it! Each of these three packages facilitates tuning hyperparameters in tidymodels, and their new releases work to make the experience of hyperparameter tuning more joyful.
You can install these releases from CRAN with:
install.packages(c("tune", "workflowsets", "finetune"))
This blog post will highlight some of the new changes in these packages that we’re most excited about.
You can see the full list of changes in the release notes for each package.
A shorthand for fitting the optimal model
In tidymodels, the result of tuning a set of hyperparameters is a data structure describing the candidate models, their predictions, and the performance metrics associated with those predictions. For example, tuning the number of neighbors in a nearest_neighbor() model over a regular grid:
# load the tidymodels packages (tune, parsnip, rsample, etc.)
library(tidymodels)

# tune the `neighbors` hyperparameter
knn_model_spec <- nearest_neighbor("regression", neighbors = tune())

tuning_res <-
  tune_grid(
    knn_model_spec,
    mpg ~ .,
    bootstraps(mtcars, 5),
    control = control_grid(save_workflow = TRUE)
  )
# check out the resulting object
tuning_res
#> # Tuning results
#> # Bootstrap sampling
#> # A tibble: 5 × 4
#>   splits          id         .metrics          .notes
#>   <list>          <chr>      <list>            <list>
#> 1 <split [32/11]> Bootstrap1 <tibble [20 × 5]> <tibble [0 × 3]>
#> 2 <split [32/12]> Bootstrap2 <tibble [20 × 5]> <tibble [0 × 3]>
#> 3 <split [32/11]> Bootstrap3 <tibble [20 × 5]> <tibble [0 × 3]>
#> 4 <split [32/10]> Bootstrap4 <tibble [20 × 5]> <tibble [0 × 3]>
#> 5 <split [32/12]> Bootstrap5 <tibble [20 × 5]> <tibble [0 × 3]>
# examine proposed hyperparameters and associated metrics
collect_metrics(tuning_res)
#> # A tibble: 20 × 7
#>    neighbors .metric .estimator  mean     n std_err .config
#>        <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>
#>  1         2 rmse    standard   3.19      5  0.208  Preprocessor1_Model01
#>  2         2 rsq     standard   0.664     5  0.0861 Preprocessor1_Model01
#>  3         3 rmse    standard   3.13      5  0.266  Preprocessor1_Model02
#>  4         3 rsq     standard   0.678     5  0.0868 Preprocessor1_Model02
#>  5         4 rmse    standard   3.11      5  0.292  Preprocessor1_Model03
#>  6         4 rsq     standard   0.684     5  0.0851 Preprocessor1_Model03
#>  7         5 rmse    standard   3.10      5  0.287  Preprocessor1_Model04
#>  8         5 rsq     standard   0.686     5  0.0839 Preprocessor1_Model04
#>  9         8 rmse    standard   3.08      5  0.263  Preprocessor1_Model05
#> 10         8 rsq     standard   0.689     5  0.0843 Preprocessor1_Model05
#> 11         9 rmse    standard   3.07      5  0.256  Preprocessor1_Model06
#> 12         9 rsq     standard   0.691     5  0.0845 Preprocessor1_Model06
#> 13        10 rmse    standard   3.06      5  0.247  Preprocessor1_Model07
#> 14        10 rsq     standard   0.693     5  0.0837 Preprocessor1_Model07
#> 15        11 rmse    standard   3.05      5  0.241  Preprocessor1_Model08
#> 16        11 rsq     standard   0.696     5  0.0833 Preprocessor1_Model08
#> 17        13 rmse    standard   3.03      5  0.236  Preprocessor1_Model09
#> 18        13 rsq     standard   0.701     5  0.0820 Preprocessor1_Model09
#> 19        14 rmse    standard   3.02      5  0.235  Preprocessor1_Model10
#> 20        14 rsq     standard   0.704     5  0.0808 Preprocessor1_Model10
Given these tuning results, the next steps are to choose the “best” hyperparameters, assign those hyperparameters to the model, and fit the finalized model on the training set. Previously in tidymodels, this has felt like:
# choose a method to define "best" and extract the resulting parameters
best_param <- select_best(tuning_res, "rmse")
# assign those parameters to the model
knn_model_final <- finalize_model(knn_model_spec, best_param)
# fit the finalized model to the training set
knn_fit <- fit(knn_model_final, mpg ~ ., mtcars)
Voilà! knn_fit is a properly resampled model that is ready to predict() on new data:
predict(knn_fit, mtcars[1, ])
#> # A tibble: 1 × 1
#>   .pred
#>   <dbl>
#> 1  22.0
The newest release of tune introduced a shorthand interface for going from tuning results to final fit called fit_best(). The function wraps each of those three functions with sensible defaults to abbreviate the process described above.
knn_fit_2 <- fit_best(tuning_res)
predict(knn_fit_2, mtcars[1, ])
#> # A tibble: 1 × 1
#>   .pred
#>   <dbl>
#> 1  22.0
This function is closely related to the last_fit() function. They both give you access to a workflow fitted on the training data but are situated somewhat differently in the modeling process. fit_best() picks up after a tuning function like tune_grid() to take you from tuning results to a fitted workflow, ready for you to predict and assess further. last_fit() assumes you have made your choice of hyperparameters and finalized your workflow; it then takes you from finalized workflow to fitted workflow and on to performance assessment on the test data. While fit_best() gives you a fitted workflow, last_fit() gives you the performance results. If you want the fitted workflow, you can extract it from the result of last_fit() via extract_workflow().
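For a rough sketch of the last_fit() route, picking up from the finalized knn_model_final above (the initial_split() object here is purely illustrative; in practice you would have created it before tuning):

# create a train/test split for illustration
car_split <- initial_split(mtcars)

# fit the finalized model on the training set and evaluate on the test set
last_fit_res <- last_fit(knn_model_final, mpg ~ ., split = car_split)

# performance metrics estimated with the test set
collect_metrics(last_fit_res)

# the workflow fitted on the training set, if you need it
knn_fit_3 <- extract_workflow(last_fit_res)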
The newest release of the workflowsets package also includes a fit_best() method for workflow set objects. Given a set of tuning results, that method will sift through all of the possible models to find and fit the optimal model configuration.
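As a sketch, where my_workflow_set stands in for a workflow set you have defined (note that save_workflow = TRUE is still needed so that fit_best() can refit the winning configuration):

# tune every preprocessor/model combination in a (hypothetical) workflow set
wf_set_res <-
  workflow_map(
    my_workflow_set,
    resamples = bootstraps(mtcars, 5),
    control = control_grid(save_workflow = TRUE)
  )

# find the best configuration across the whole set and fit it to the training data
best_fit <- fit_best(wf_set_res, metric = "rmse")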
Interactive issue logging
Imagine, in the previous example, we made some subtle error in specifying the tuning process. For example, passing a function to extract elements of the proposed workflows that injects some warnings and errors into the tuning process:
raise_concerns <- function(x) {
  warning("Ummm, wait. :o")
  stop("Eep! Nooo!")
}

tuning_res <-
  tune_grid(
    knn_model_spec,
    mpg ~ .,
    bootstraps(mtcars, 5),
    control = control_grid(extract = raise_concerns)
  )
Warnings and errors can come up in all sorts of places while tuning hyperparameters. Often, with obvious issues, we can raise errors early on and halt the tuning process, but with more subtle concerns, we don’t want to be too restrictive; it’s sometimes better to defer to the underlying modeling packages to decide what’s a dire issue versus something that can be worked around.
In the past, we printed warnings and errors as they occurred, writing context about each issue to the console before logging it in the tuning result. In the above example, this would look like:
! Bootstrap1: preprocessor 1/1, model 1/1 (extracts): Ummm, wait. :o
x Bootstrap1: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): Eep! Nooo!
! Bootstrap2: preprocessor 1/1, model 1/1 (extracts): Ummm, wait. :o
x Bootstrap2: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): Eep! Nooo!
! Bootstrap3: preprocessor 1/1, model 1/1 (extracts): Ummm, wait. :o
x Bootstrap3: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): Eep! Nooo!
! Bootstrap4: preprocessor 1/1, model 1/1 (extracts): Ummm, wait. :o
x Bootstrap4: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): Eep! Nooo!
! Bootstrap5: preprocessor 1/1, model 1/1 (extracts): Ummm, wait. :o
x Bootstrap5: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): Eep! Nooo!
The above messages are super descriptive about where issues occur: they note the resample, the proposed modeling workflow, and the part of the fitting process in which each issue arose. At the same time, they are quite repetitive; if there’s an issue during hyperparameter tuning, it probably occurs in every resample, always in the same place. If we were instead evaluating this model against 1,000 resamples, or there were more than just two issues, this output could become overwhelming very quickly.
The new releases of our tuning packages include tools to determine which tuning issues are unique and, for each unique issue, print its message only once while maintaining a dynamic count of how many times it has occurred. With the new tune release, the same output would look like:
#> → A | warning: Ummm, wait. :o
#> → B | error: Eep! Nooo!
#> There were issues with some computations A: x5 B: x5
This interface is hopefully less overwhelming for users. When the messages attached to these issues aren’t enough to debug the problem, the complete set of information about them lives inside the tuning result object and can be retrieved with collect_notes(tuning_res). To turn off the interactive logging, set the verbose control option to TRUE.
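For example, reusing the objects from above:

# retrieve the complete set of notes, one row per issue per resample
collect_notes(tuning_res)

# opt out of the interactive logging with the `verbose` control option
tuning_res <-
  tune_grid(
    knn_model_spec,
    mpg ~ .,
    bootstraps(mtcars, 5),
    control = control_grid(extract = raise_concerns, verbose = TRUE)
  )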
Speedups
Each of these three releases, as well as releases of core tidymodels packages they depend on like parsnip, recipes, and hardhat, include a plethora of changes meant to optimize computational performance. Especially for modeling practitioners who work with many resamples and/or small data sets, our modeling workflows will feel a whole lot snappier:
With 100-row training data sets, the time to resample models with tune and friends has been at least halved. These releases are the first iteration of a set of changes to reduce the evaluation time of tidymodels code, and users can expect further optimizations in coming releases! See this post on my blog for more information about those speedups.
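If you’d like to get a sense of the difference on your own machine, a rough timing sketch like the one below is one way to do it; the 100-row data set is hypothetical, and exact timings will vary with hardware and package versions.

# build a small, 100-row data set for illustration
small_cars <- mtcars[rep(1:32, length.out = 100), ]

# time a resampling run; compare under old and new package versions
system.time(
  fit_resamples(
    linear_reg(),
    mpg ~ .,
    bootstraps(small_cars, 50)
  )
)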
Bonus points
Although they’re smaller in scope, we wanted to highlight two additional developments in tuning hyperparameters with tidymodels.
Workflow set support for tidyclust
The recent tidymodels package tidyclust introduced support for fitting and tuning clustering models in tidymodels. That package’s function tune_cluster() is now an option for tuning in workflow_map(), meaning that users can fit sets of clustering models and preprocessors using workflow sets. These changes further integrate the tidyclust package into the tidymodels framework.
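As a minimal sketch, assuming tidyclust is loaded and using a couple of hypothetical recipes on mtcars:

library(tidyclust)

# a k-means specification with a tunable number of clusters
kmeans_spec <- k_means(num_clusters = tune())

# pair the clustering model with two preprocessing recipes
cluster_recipe <- recipe(~ ., data = mtcars)

cluster_set <-
  workflow_set(
    preproc = list(
      raw = cluster_recipe,
      normalized = cluster_recipe %>% step_normalize(all_numeric_predictors())
    ),
    models = list(kmeans = kmeans_spec)
  )

# tune every combination with tune_cluster() via workflow_map()
cluster_res <-
  workflow_map(
    cluster_set,
    fn = "tune_cluster",
    resamples = bootstraps(mtcars, 5),
    grid = 5
  )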
Refined retrieval of intermediate results
As a fail-safe for cases of interrupted tuning, uncaught tuning errors, or simply forgetting to assign tuning results to an object, tune stores the most recent tuning result in the object .Last.tune.result.
# be a silly goose and forget to assign results
tune_grid(
  knn_model_spec,
  mpg ~ .,
  bootstraps(mtcars, 5),
  control = control_grid(save_workflow = TRUE)
)
#> # Tuning results
#> # Bootstrap sampling
#> # A tibble: 5 × 4
#>   splits          id         .metrics          .notes
#>   <list>          <chr>      <list>            <list>
#> 1 <split [32/11]> Bootstrap1 <tibble [18 × 5]> <tibble [0 × 3]>
#> 2 <split [32/14]> Bootstrap2 <tibble [18 × 5]> <tibble [0 × 3]>
#> 3 <split [32/13]> Bootstrap3 <tibble [18 × 5]> <tibble [0 × 3]>
#> 4 <split [32/12]> Bootstrap4 <tibble [18 × 5]> <tibble [0 × 3]>
#> 5 <split [32/11]> Bootstrap5 <tibble [18 × 5]> <tibble [0 × 3]>
# all is not lost!
.Last.tune.result
#> # Tuning results
#> # Bootstrap sampling
#> # A tibble: 5 × 4
#>   splits          id         .metrics          .notes
#>   <list>          <chr>      <list>            <list>
#> 1 <split [32/11]> Bootstrap1 <tibble [18 × 5]> <tibble [0 × 3]>
#> 2 <split [32/14]> Bootstrap2 <tibble [18 × 5]> <tibble [0 × 3]>
#> 3 <split [32/13]> Bootstrap3 <tibble [18 × 5]> <tibble [0 × 3]>
#> 4 <split [32/12]> Bootstrap4 <tibble [18 × 5]> <tibble [0 × 3]>
#> 5 <split [32/11]> Bootstrap5 <tibble [18 × 5]> <tibble [0 × 3]>
# assign to object after the fact
res <- .Last.tune.result
These three releases introduce support for the .Last.tune.result object in more settings and refine support in existing implementations.
Acknowledgements
Thanks to @walrossker, @Freestyleyang, and @Jeffrothschild for their contributions to these packages since their last releases.
Happy modeling, y’all!