Note: The following has been adapted from a section of the forthcoming second edition of R for Data Science that had to be removed due to length limitations.
Pipes
R 4.1.0 introduced a native pipe operator, |>. As described in the R News:
R now provides a simple native forward pipe syntax |>. The simple form of the forward pipe inserts the left-hand side as the first argument in the right-hand side call. The pipe implementation as a syntax transformation was motivated by suggestions from Jim Hester and Lionel Henry.
The behaviour of the native pipe is by and large the same as that of the %>% pipe provided by the magrittr package. Both operators (|> and %>%) let you “pipe” an object forward to a function or call expression, thereby allowing you to express a sequence of operations that transform an object.
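As a minimal sketch of that idea (the input vector is made up for illustration), a piped chain and the equivalent nested calls compute the same value:

```r
# Illustrative input; any numeric vector would do.
x <- c(1, 4, 9)

# Nested calls read inside-out.
nested <- sum(sqrt(x))

# The base pipe (R >= 4.1.0) reads left to right:
# x |> sqrt() is parsed as sqrt(x), and the result is piped on to sum().
piped <- x |> sqrt() |> sum()

identical(nested, piped)
```

The same chain could be written with %>% after loading magrittr; for simple cases like this the two operators are interchangeable.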
To learn more about the basic utility of pipes, see The pipe section of R for Data Science.
Luckily there’s no need to commit entirely to one pipe or the other — you can use the base pipe for the majority of cases where it’s sufficient and use the magrittr pipe when you really need its special features.
|> vs. %>%
While |> and %>% behave identically for simple cases, there are a few crucial differences. These are most likely to affect you if you’re a long-term user of %>% who has taken advantage of some of the more advanced features. But they’re still good to know about even if you’ve never used %>% because you’re likely to encounter some of them when reading wild-caught code.
- By default, the pipe passes the object on its left-hand side to the first argument of the function on the right-hand side. %>% allows you to change the placement with a . placeholder. For example, x %>% f(1) is equivalent to f(x, 1) but x %>% f(1, .) is equivalent to f(1, x). R 4.2.0 added a _ placeholder to the base pipe, with one additional restriction: the argument has to be named. For example, x |> f(1, y = _) is equivalent to f(1, y = x).
- The |> placeholder is deliberately simple and can’t replicate many features of the %>% placeholder: you can’t pass it to multiple arguments, and it doesn’t have any special behavior when the placeholder is used inside another function. For example, df %>% split(.$var) is equivalent to split(df, df$var), and df %>% {split(.$x, .$y)} is equivalent to split(df$x, df$y).
  With %>%, you can use . on the left-hand side of operators like $, [[, and [, so you can extract a single column from a data frame with (e.g.) mtcars %>% .$cyl. The base pipe gained support for this feature in R 4.3.0. For the special case of extracting a column out of a data frame, you can also use dplyr::pull(): mtcars |> pull(cyl).
- %>% allows you to drop the parentheses when calling a function with no other arguments; |> always requires the parentheses.
- %>% allows you to start a pipe with . to create a function rather than immediately executing the pipe; this is not supported by the base pipe.
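The differences listed above can be tried side by side. This is a sketch rather than a definitive reference: it assumes the magrittr package is installed and that you are running R 4.3.0 or later (the _ placeholder needs 4.2.0, and _$cyl needs 4.3.0). The helper f() is hypothetical and exists only to make the argument order visible.

```r
library(magrittr)  # assumed to be installed; provides %>%

# Hypothetical helper: pastes its two arguments so their order is visible.
f <- function(a, y) paste(a, y)

# Placeholder position.
first_arg  <- "x" %>% f(1)         # same as f("x", 1)
dot_placed <- "x" %>% f(1, .)      # same as f(1, "x")
base_named <- "x" |> f(1, y = _)   # same as f(1, y = "x"); _ must be named

# Placeholder nested inside another call (magrittr only).
by_cyl  <- mtcars %>% split(.$cyl)               # split(mtcars, mtcars$cyl)
mpg_cyl <- mtcars %>% { split(.$mpg, .$cyl) }    # split(mtcars$mpg, mtcars$cyl)

# Extracting a column.
cyl_magrittr <- mtcars %>% .$cyl
cyl_base     <- mtcars |> _$cyl    # requires R >= 4.3.0

# Parentheses: optional with %>%, required with |>.
total_magrittr <- c(1, 2, 3) %>% sum
total_base     <- c(1, 2, 3) |> sum()

# Starting a pipe with . builds a function instead of running the pipe.
sum_sqrt <- . %>% sqrt() %>% sum()
sum_sqrt(c(1, 4, 9))
```

Note that a script containing _ will fail to parse on versions of R that predate the placeholder, so the base-pipe lines cannot simply be skipped at run time on older installations.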
Using the native pipe in packages
Because the native pipe wasn’t introduced until R 4.1.0, code using |> in function reference examples or vignettes will not work on older versions of R, where it is not valid syntax. This is a problem for the tidyverse because our versioning policies mean that our packages need to work on R 3.5.0 and later.
Does this mean that you need to increase the minimum R version your package depends on in order to use |>? Not necessarily: there are two techniques we can use to keep vignettes and examples working.
For example, the base pipe is used in purrr 1.0.0. As can be seen in the source for the “purrr <-> base R” vignette, certain code chunks are evaluated conditionally based on the version of R being used. The setup chunk for the vignette includes the line modern_r <- getRversion() >= "4.1.0". The result is then used in the eval chunk option to determine whether or not a code chunk that relies on “modern R” syntax should be run.
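The gating logic can be sketched in plain R. The first assignment is the setup line quoted above; everything else, including the chunk body and the parse()/eval() emulation, is an illustrative assumption and not a copy of the purrr source.

```r
# From the vignette's setup chunk: TRUE on R 4.1.0 or later.
modern_r <- getRversion() >= "4.1.0"

# In the R Markdown source, a chunk that uses |> would carry the chunk option
#   eval = modern_r
# so that knitr only runs it when modern_r is TRUE.

# Emulating the same idea outside knitr: keep the pipe code as text and only
# parse and evaluate it when the running version of R supports the syntax.
chunk_code <- "c(1, 4, 9) |> sqrt() |> sum()"  # hypothetical chunk body
result <- if (modern_r) eval(parse(text = chunk_code)) else NULL
```

Keeping the code as text matters because the failure on old versions of R happens at parse time, before any run-time check could take effect.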
The other place we use the base pipe is in examples. To disable these we use a bit of a hack that requires three files: configure, cleanup, and tools/examples.R. The basic idea is that on versions of R before 4.1.0 we re-define the \examples{} tag to display an informative message but not run the code; this ensures that R CMD check continues to work even on older versions of R.