workflows 0.2.0

We’re excited to announce the release of workflows 0.2.0. workflows is a tidymodels package for bundling a model specification from parsnip with a preprocessor, such as a formula or recipe. Bundling the two streamlines the model fitting process and combines nicely with tune for hyperparameter tuning.

You can install it from CRAN with:

install.packages("workflows")

library(workflows)
library(parsnip)

Adding variables to a workflow

The main change in this release of workflows is the introduction of a new preprocessor method: add_variables(). This adds a third method to specify model terms, in addition to add_formula() and add_recipe().

add_variables() has a tidyselect interface, where outcomes are specified using bare column names, followed by predictors.

linear_spec <- linear_reg() %>%
  set_engine("lm")

wf <- workflow() %>%
  add_model(linear_spec) %>%
  add_variables(outcomes = mpg, predictors = c(cyl, disp))

wf

#> ══ Workflow ════════════════════════════════════════════════════════════════════
#> Preprocessor: Variables
#> Model: linear_reg()
#> 
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> Outcomes: mpg
#> Predictors: c(cyl, disp)
#> 
#> ── Model ───────────────────────────────────────────────────────────────────────
#> Linear Regression Model Specification (regression)
#> 
#> Computational engine: lm

model <- fit(wf, mtcars)
mold <- pull_workflow_mold(model)

mold$predictors

#> # A tibble: 32 x 2
#>      cyl  disp
#>    <dbl> <dbl>
#>  1     6  160 
#>  2     6  160 
#>  3     4  108 
#>  4     6  258 
#>  5     8  360 
#>  6     6  225 
#>  7     8  360 
#>  8     4  147.
#>  9     4  141.
#> 10     6  168.
#> # … with 22 more rows

mold$outcomes

#> # A tibble: 32 x 1
#>      mpg
#>    <dbl>
#>  1  21  
#>  2  21  
#>  3  22.8
#>  4  21.4
#>  5  18.7
#>  6  18.1
#>  7  14.3
#>  8  24.4
#>  9  22.8
#> 10  19.2
#> # … with 22 more rows
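Because the interface is tidyselect-based, selection helpers should work in place of explicit column names. The sketch below is illustrative rather than part of the release notes: it assumes tidyselect 1.0.0 or later (where helpers such as starts_with() are available without attaching tidyselect), and the names helper_wf and helper_fit are made up for this example.

```r
library(workflows)
library(parsnip)

# Select predictors with a tidyselect helper instead of naming each column.
# `helper_wf` and `helper_fit` are hypothetical names used only here.
helper_wf <- workflow() %>%
  add_model(linear_reg() %>% set_engine("lm")) %>%
  add_variables(outcomes = mpg, predictors = c(cyl, starts_with("d")))

helper_fit <- fit(helper_wf, mtcars)

# The mold records which columns the selection resolved to.
helper_predictors <- names(pull_workflow_mold(helper_fit)$predictors)
helper_predictors
```

Inspecting the mold after fitting is a quick way to confirm that a helper picked up the columns you expected.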

Columns selected by outcomes are removed from the data before predictors is evaluated, which means that formula specifications like y ~ . can be easily reproduced as:

workflow() %>%
  add_variables(mpg, everything())

#> ══ Workflow ════════════════════════════════════════════════════════════════════
#> Preprocessor: Variables
#> Model: None
#> 
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> Outcomes: mpg
#> Predictors: everything()
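One way to check this equivalence is to fit the same model through both preprocessors and compare the coefficients. This is a minimal sketch, assuming an all-numeric data set such as mtcars, where the formula method has nothing to expand or encode; the object names (lm_spec, variables_fit, formula_fit) are illustrative.

```r
library(workflows)
library(parsnip)

lm_spec <- linear_reg() %>%
  set_engine("lm")

# Outcome mpg, every remaining column as a predictor.
variables_fit <- workflow() %>%
  add_model(lm_spec) %>%
  add_variables(mpg, everything()) %>%
  fit(mtcars)

# The same terms, specified with a formula.
formula_fit <- workflow() %>%
  add_model(lm_spec) %>%
  add_formula(mpg ~ .) %>%
  fit(mtcars)

# Pull out the underlying lm objects and compare their coefficients.
variables_coefs <- coef(pull_workflow_fit(variables_fit)$fit)
formula_coefs <- coef(pull_workflow_fit(formula_fit)$fit)

all.equal(variables_coefs, formula_coefs)
```

With factor or Date columns the two approaches would be expected to diverge, since the formula method preprocesses those columns and add_variables() does not.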

Importantly, add_variables() doesn’t do any preprocessing to your columns whatsoever. This is in contrast to add_formula(), which uses the standard model.matrix() machinery from R, and add_recipe(), which will recipes::prep() the recipe for you. add_variables() is especially useful when you aren’t using a recipe, but you do have S3 columns that you don’t want run through model.matrix() for fear of losing the S3 class, like with Date columns.

library(modeltime)

arima_spec <- arima_reg() %>%
  set_engine("arima")

df <- data.frame(
  y = sample(5),
  date = as.Date("2019-01-01") + 0:4
)

wf <- workflow() %>%
  add_variables(y, date) %>%
  add_model(arima_spec)

arima_model <- fit(wf, df)

#> frequency = 1 observations per 1 day


arima_model

#> ══ Workflow [trained] ══════════════════════════════════════════════════════════
#> Preprocessor: Variables
#> Model: arima_reg()
#> 
#> ── Preprocessor ────────────────────────────────────────────────────────────────
#> Outcomes: y
#> Predictors: date
#> 
#> ── Model ───────────────────────────────────────────────────────────────────────
#> Series: outcome 
#> ARIMA(0,0,0) with non-zero mean 
#> 
#> Coefficients:
#>         mean
#>       3.0000
#> s.e.  0.6325
#> 
#> sigma^2 estimated as 2.5:  log likelihood=-8.83
#> AIC=21.66   AICc=27.66   BIC=20.87

mold <- pull_workflow_mold(arima_model)
mold$predictors

#> # A tibble: 5 x 1
#>   date      
#>   <date>    
#> 1 2019-01-01
#> 2 2019-01-02
#> 3 2019-01-03
#> 4 2019-01-04
#> 5 2019-01-05
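For contrast, the following base R sketch shows what happens to a Date column when it goes through model.matrix() directly. It is only an illustration of the class loss described above, not the exact code path add_formula() takes; dates_df and design are hypothetical names, and the y values are arbitrary.

```r
# A small data frame with a Date column, similar to the one used above.
dates_df <- data.frame(
  y = c(3, 1, 4, 2, 5),
  date = as.Date("2019-01-01") + 0:4
)

# model.matrix() builds a numeric design matrix, so the Date class is dropped
# and the column becomes a plain number of days since 1970-01-01.
design <- model.matrix(y ~ date, data = dates_df)

class(dates_df$date)
class(design[, "date"])
```

A model that needs real dates, like the ARIMA specification above, would receive plain numbers from this design matrix, which is the situation add_variables() avoids.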

Tune

Workflows created with add_variables() do not work with the current CRAN version of tune (0.1.1). The development version of tune does support them, and you can install it until a new version of tune reaches CRAN.

devtools::install_github("tidymodels/tune")

Acknowledgements

Thanks to the three contributors who helped with this version of workflows: @EmilHvitfeldt, @mdancho84, and @RaviHela!