The tidymodels team recently released new versions of the tune, finetune, and workflowsets packages, and we’re super stoked about it! Each of these three packages facilitates tuning hyperparameters in tidymodels, and their new releases work to make the experience of hyperparameter tuning more joyful.
You can install these releases from CRAN with:
install.packages(c("tune", "workflowsets", "finetune"))
This blog post will highlight some of the new changes in these packages that we’re most excited about.
You can see the full list of changes in the release notes for each package.
A shorthand for fitting the optimal model
In tidymodels, the result of tuning a set of hyperparameters is a data structure describing the candidate models, their predictions, and the performance metrics associated with those predictions. For example, tuning the number of neighbors in a nearest_neighbor() model over a regular grid:
# load the tidymodels packages (tune, parsnip, rsample, etc.)
library(tidymodels)

# tune the `neighbors` hyperparameter
knn_model_spec <- nearest_neighbor("regression", neighbors = tune())

tuning_res <-
  tune_grid(
    knn_model_spec,
    mpg ~ .,
    bootstraps(mtcars, 5),
    control = control_grid(save_workflow = TRUE)
  )
# check out the resulting object
tuning_res
#> # Tuning results
#> # Bootstrap sampling
#> # A tibble: 5 × 4
#>   splits          id         .metrics          .notes
#>   <list>          <chr>      <list>            <list>
#> 1 <split [32/11]> Bootstrap1 <tibble [20 × 5]> <tibble [0 × 3]>
#> 2 <split [32/12]> Bootstrap2 <tibble [20 × 5]> <tibble [0 × 3]>
#> 3 <split [32/11]> Bootstrap3 <tibble [20 × 5]> <tibble [0 × 3]>
#> 4 <split [32/10]> Bootstrap4 <tibble [20 × 5]> <tibble [0 × 3]>
#> 5 <split [32/12]> Bootstrap5 <tibble [20 × 5]> <tibble [0 × 3]>
# examine proposed hyperparameters and associated metrics
collect_metrics(tuning_res)
#> # A tibble: 20 × 7
#>    neighbors .metric .estimator  mean     n std_err .config
#>        <int> <chr>   <chr>      <dbl> <int>   <dbl> <chr>
#>  1         2 rmse    standard   3.19      5  0.208  Preprocessor1_Model01
#>  2         2 rsq     standard   0.664     5  0.0861 Preprocessor1_Model01
#>  3         3 rmse    standard   3.13      5  0.266  Preprocessor1_Model02
#>  4         3 rsq     standard   0.678     5  0.0868 Preprocessor1_Model02
#>  5         4 rmse    standard   3.11      5  0.292  Preprocessor1_Model03
#>  6         4 rsq     standard   0.684     5  0.0851 Preprocessor1_Model03
#>  7         5 rmse    standard   3.10      5  0.287  Preprocessor1_Model04
#>  8         5 rsq     standard   0.686     5  0.0839 Preprocessor1_Model04
#>  9         8 rmse    standard   3.08      5  0.263  Preprocessor1_Model05
#> 10         8 rsq     standard   0.689     5  0.0843 Preprocessor1_Model05
#> 11         9 rmse    standard   3.07      5  0.256  Preprocessor1_Model06
#> 12         9 rsq     standard   0.691     5  0.0845 Preprocessor1_Model06
#> 13        10 rmse    standard   3.06      5  0.247  Preprocessor1_Model07
#> 14        10 rsq     standard   0.693     5  0.0837 Preprocessor1_Model07
#> 15        11 rmse    standard   3.05      5  0.241  Preprocessor1_Model08
#> 16        11 rsq     standard   0.696     5  0.0833 Preprocessor1_Model08
#> 17        13 rmse    standard   3.03      5  0.236  Preprocessor1_Model09
#> 18        13 rsq     standard   0.701     5  0.0820 Preprocessor1_Model09
#> 19        14 rmse    standard   3.02      5  0.235  Preprocessor1_Model10
#> 20        14 rsq     standard   0.704     5  0.0808 Preprocessor1_Model10
Given these tuning results, the next steps are to choose the “best” hyperparameters, assign those hyperparameters to the model, and fit the finalized model on the training set. Previously in tidymodels, this has felt like:
# choose a method to define "best" and extract the resulting parameters
best_param <- select_best(tuning_res, "rmse")
# assign those parameters to the model
knn_model_final <- finalize_model(knn_model_spec, best_param)
# fit the finalized model to the training set
knn_fit <- fit(knn_model_final, mpg ~ ., mtcars)
Voilà! knn_fit is a properly resampled model that is ready to predict() on new data:
predict(knn_fit, mtcars[1, ])
#> # A tibble: 1 × 1
#>   .pred
#>   <dbl>
#> 1  22.0
The newest release of tune introduced a shorthand interface for going from tuning results to final fit called fit_best(). The function wraps each of those three functions with sensible defaults to abbreviate the process described above.
knn_fit_2 <- fit_best(tuning_res)
predict(knn_fit_2, mtcars[1, ])
#> # A tibble: 1 × 1
#>   .pred
#>   <dbl>
#> 1  22.0
This function is closely related to the last_fit() function. They both give you access to a workflow fitted on the training data but are situated somewhat differently in the modeling process. fit_best() picks up after a tuning function like tune_grid() to take you from tuning results to a fitted workflow, ready for you to predict and assess further. last_fit() assumes you have made your choice of hyperparameters and finalized your workflow; it then takes you from finalized workflow to fitted workflow and on to performance assessment on the test data. While fit_best() gives you a fitted workflow, last_fit() gives you the performance results. If you want the fitted workflow, you can extract it from the result of last_fit() via extract_workflow().
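For a rough sketch of the last_fit() route, picking up from the finalized knn_model_final above (the initial_split() object here is purely illustrative; in practice you would have created it before tuning):

# create a train/test split for illustration
car_split <- initial_split(mtcars)

# fit the finalized model on the training set and evaluate on the test set
last_fit_res <- last_fit(knn_model_final, mpg ~ ., split = car_split)

# performance metrics estimated with the test set
collect_metrics(last_fit_res)

# the workflow fitted on the training set, if you need it
knn_fit_3 <- extract_workflow(last_fit_res)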
The newest release of the workflowsets package also includes a fit_best() method for workflow set objects. Given a set of tuning results, that method will sift through all of the possible models to find and fit the optimal model configuration.
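As a sketch, where my_workflow_set stands in for a workflow set you have defined (note that save_workflow = TRUE is still needed so that fit_best() can refit the winning configuration):

# tune every preprocessor/model combination in a (hypothetical) workflow set
wf_set_res <-
  workflow_map(
    my_workflow_set,
    resamples = bootstraps(mtcars, 5),
    control = control_grid(save_workflow = TRUE)
  )

# find the best configuration across the whole set and fit it to the training data
best_fit <- fit_best(wf_set_res, metric = "rmse")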
Interactive issue logging
Imagine, in the previous example, we made some subtle error in specifying the tuning process. For example, passing a function to extract elements of the proposed workflows that injects some warnings and errors into the tuning process:
raise_concerns <- function(x) {
  warning("Ummm, wait. :o")
  stop("Eep! Nooo!")
}

tuning_res <-
  tune_grid(
    knn_model_spec,
    mpg ~ .,
    bootstraps(mtcars, 5),
    control = control_grid(extract = raise_concerns)
  )
Warnings and errors can come up in all sorts of places while tuning hyperparameters. Often, with obvious issues, we can raise errors early on and halt the tuning process, but with more subtle concerns, we don’t want to be too restrictive; it’s sometimes better to defer to the underlying modeling packages to decide what’s a dire issue versus something that can be worked around.
In the past, we printed warnings and errors as they occurred, writing context about each issue to the console before logging it in the tuning result. In the above example, this would look like:
! Bootstrap1: preprocessor 1/1, model 1/1 (extracts): Ummm, wait. :o
x Bootstrap1: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): Eep! Nooo!
! Bootstrap2: preprocessor 1/1, model 1/1 (extracts): Ummm, wait. :o
x Bootstrap2: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): Eep! Nooo!
! Bootstrap3: preprocessor 1/1, model 1/1 (extracts): Ummm, wait. :o
x Bootstrap3: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): Eep! Nooo!
! Bootstrap4: preprocessor 1/1, model 1/1 (extracts): Ummm, wait. :o
x Bootstrap4: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): Eep! Nooo!
! Bootstrap5: preprocessor 1/1, model 1/1 (extracts): Ummm, wait. :o
x Bootstrap5: preprocessor 1/1, model 1/1 (extracts): Error in extractor(object): Eep! Nooo!
The above messages are super descriptive about where issues occur: they note the resample, the proposed modeling workflow, and the part of the fitting process in which each issue arose. At the same time, they are quite repetitive; if there’s an issue during hyperparameter tuning, it probably occurs in every resample, always in the same place. If we were instead evaluating this model against 1,000 resamples, or there were more than just two issues, this output could become overwhelming very quickly.
The new releases of our tuning packages include tools to determine which tuning issues are unique and, for each unique issue, print its message only once while maintaining a dynamic count of how many times it has occurred. With the new tune release, the same output would look like:
#> → A | warning: Ummm, wait. :o
#> → B | error: Eep! Nooo!
#> There were issues with some computations A: x5 B: x5
This interface is hopefully less overwhelming for users. When the messages attached to these issues aren’t enough to debug the problem, the complete set of information about them lives inside the tuning result object and can be retrieved with collect_notes(tuning_res). To turn off the interactive logging, set the verbose control option to TRUE.
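For example, reusing the objects from above:

# retrieve the complete set of notes, one row per issue per resample
collect_notes(tuning_res)

# opt out of the interactive logging with the `verbose` control option
tuning_res <-
  tune_grid(
    knn_model_spec,
    mpg ~ .,
    bootstraps(mtcars, 5),
    control = control_grid(extract = raise_concerns, verbose = TRUE)
  )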
Speedups
Each of these three releases, as well as releases of core tidymodels packages they depend on like parsnip, recipes, and hardhat, include a plethora of changes meant to optimize computational performance. Especially for modeling practitioners who work with many resamples and/or small data sets, our modeling workflows will feel a whole lot snappier:
With 100-row training data sets, the time to resample models with tune and friends has been at least halved. These releases are the first iteration of a set of changes to reduce the evaluation time of tidymodels code, and users can expect further optimizations in coming releases! See this post on my blog for more information about those speedups.
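If you’d like to get a sense of the difference on your own machine, a rough timing sketch like the one below is one way to do it; the 100-row data set is hypothetical, and exact timings will vary with hardware and package versions.

# build a small, 100-row data set for illustration
small_cars <- mtcars[rep(1:32, length.out = 100), ]

# time a resampling run; compare under old and new package versions
system.time(
  fit_resamples(
    linear_reg(),
    mpg ~ .,
    bootstraps(small_cars, 50)
  )
)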
Bonus points
Although they’re smaller in scope, we wanted to highlight two additional developments in tuning hyperparameters with tidymodels.
Workflow set support for tidyclust
The recent tidymodels package tidyclust introduced support for fitting and tuning clustering models in tidymodels. That package’s function tune_cluster() is now an option for tuning in workflow_map(), meaning that users can fit sets of clustering models and preprocessors using workflow sets. These changes further integrate the tidyclust package into the tidymodels framework.
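As a minimal sketch, assuming tidyclust is loaded and using a couple of hypothetical recipes on mtcars:

library(tidyclust)

# a k-means specification with a tunable number of clusters
kmeans_spec <- k_means(num_clusters = tune())

# pair the clustering model with two preprocessing recipes
cluster_recipe <- recipe(~ ., data = mtcars)

cluster_set <-
  workflow_set(
    preproc = list(
      raw = cluster_recipe,
      normalized = cluster_recipe %>% step_normalize(all_numeric_predictors())
    ),
    models = list(kmeans = kmeans_spec)
  )

# tune every combination with tune_cluster() via workflow_map()
cluster_res <-
  workflow_map(
    cluster_set,
    fn = "tune_cluster",
    resamples = bootstraps(mtcars, 5),
    grid = 5
  )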
Refined retrieval of intermediate results
As a fail-safe for cases of interrupted tuning, uncaught tuning errors, or simply forgetting to assign tuning results to an object, tune stores the most recent tuning result in the object .Last.tune.result.
# be a silly goose and forget to assign results
tune_grid(
  knn_model_spec,
  mpg ~ .,
  bootstraps(mtcars, 5),
  control = control_grid(save_workflow = TRUE)
)
#> # Tuning results
#> # Bootstrap sampling
#> # A tibble: 5 × 4
#>   splits          id         .metrics          .notes
#>   <list>          <chr>      <list>            <list>
#> 1 <split [32/11]> Bootstrap1 <tibble [18 × 5]> <tibble [0 × 3]>
#> 2 <split [32/14]> Bootstrap2 <tibble [18 × 5]> <tibble [0 × 3]>
#> 3 <split [32/13]> Bootstrap3 <tibble [18 × 5]> <tibble [0 × 3]>
#> 4 <split [32/12]> Bootstrap4 <tibble [18 × 5]> <tibble [0 × 3]>
#> 5 <split [32/11]> Bootstrap5 <tibble [18 × 5]> <tibble [0 × 3]>
# all is not lost!
.Last.tune.result
#> # Tuning results
#> # Bootstrap sampling
#> # A tibble: 5 × 4
#>   splits          id         .metrics          .notes
#>   <list>          <chr>      <list>            <list>
#> 1 <split [32/11]> Bootstrap1 <tibble [18 × 5]> <tibble [0 × 3]>
#> 2 <split [32/14]> Bootstrap2 <tibble [18 × 5]> <tibble [0 × 3]>
#> 3 <split [32/13]> Bootstrap3 <tibble [18 × 5]> <tibble [0 × 3]>
#> 4 <split [32/12]> Bootstrap4 <tibble [18 × 5]> <tibble [0 × 3]>
#> 5 <split [32/11]> Bootstrap5 <tibble [18 × 5]> <tibble [0 × 3]>
# assign to object after the fact
res <- .Last.tune.result
These three releases introduce support for the .Last.tune.result object in more settings and refine support in existing implementations.
Acknowledgements
Thanks to @walrossker, @Freestyleyang, and @Jeffrothschild for their contributions to these packages since their last releases.
Happy modeling, y’all!