We are pleased to announce that forcats 0.4.0 is now on CRAN. The forcats package provides a suite of useful tools that solve common problems with factors in R. This version benefited from the hard work of contributors new and old at our first tidyverse dev day. For a complete set of changes, please see the release notes.
To install the latest version, run:
install.packages("forcats")
As always, attach the package with:
library(forcats)
New functions
fct_cross() creates a new factor containing the combined levels from two or more input factors, similar to base::interaction().
fruit <- factor(c("apple", "kiwi", "apple", "apple"))
colour <- factor(c("green", "green", "red", "green"))
fct_cross(fruit, colour)
#> [1] apple:green kiwi:green apple:red apple:green
#> Levels: apple:green apple:red kiwi:green
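The main practical difference from base::interaction() is how unobserved combinations are handled. The sketch below reuses the fruit and colour factors from above and assumes the keep_empty argument described in the fct_cross() documentation; it only counts levels, since the ordering of the levels may differ between versions.

```r
library(forcats)

fruit  <- factor(c("apple", "kiwi", "apple", "apple"))
colour <- factor(c("green", "green", "red", "green"))

# base::interaction() keeps all 2 x 2 level combinations by default,
# including "kiwi:red", which never occurs in the data.
base_cross <- interaction(fruit, colour, sep = ":")
nlevels(base_cross)

# fct_cross() keeps only the combinations that are actually observed.
observed <- fct_cross(fruit, colour)
nlevels(observed)

# keep_empty = TRUE (assumed argument) retains the unobserved combinations too.
all_combos <- fct_cross(fruit, colour, keep_empty = TRUE)
nlevels(all_combos)
```

Dropping empty combinations by default tends to be convenient for plotting and tabulation, where unused levels would otherwise show up as empty groups.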
fct_lump_min() preserves levels that appear at least min times and lumps the rest into "Other". It can also be used with weights, supplied through the w argument.
x <- factor(letters[rpois(50, 3)])
fct_lump_min(x, min = 10)
#> [1] Other b Other b Other Other Other b Other Other b
#> [12] Other Other Other b Other b Other Other b b Other
#> [23] Other Other b b Other Other Other Other Other b Other
#> [34] Other Other b Other Other Other Other Other Other Other Other
#> [45] Other b Other b
#> Levels: b Other
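Because the example above draws random data without a seed, its exact output will vary from run to run. As a small deterministic sketch of how the w argument changes the result, the following uses a made-up factor and made-up weights (both are illustrative assumptions, not data from the package):

```r
library(forcats)

x <- factor(c("a", "a", "a", "b", "b", "c"))

# Unweighted: a appears 3 times, b twice, c once,
# so only c falls below min = 2 and is lumped into "Other".
unweighted <- fct_lump_min(x, min = 2)
table(unweighted)

# Hypothetical weights: the single "c" observation carries a weight of 5.
w <- c(1, 1, 1, 1, 1, 5)

# Weighted totals are a = 3, b = 2, c = 5,
# so only b falls below min = 3 and is lumped into "Other".
weighted <- fct_lump_min(x, min = 3, w = w)
table(weighted)
```

With weights, min is compared against the sum of w within each level rather than the number of observations.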
fct_match() tests for the presence of levels in a factor, providing a safer alternative to %in% by throwing an error when there are unexpected levels.
table(fct_match(gss_cat$marital, c("Married", "Divorced")))
#>
#> FALSE TRUE
#> 7983 13500
table(gss_cat$marital %in% c("Maried", "Davorced"))
#>
#> FALSE
#> 21483
table(fct_match(gss_cat$marital, c("Maried", "Davorced")))
#> Error: Levels not present in factor: "Maried", "Davorced"
Other improvements
- fct_relevel() can now relevel factors using a function that is passed the current levels.
  f <- factor(c("a", "b", "c", "d"), levels = c("b", "c", "d", "a"))
  fct_relevel(f, sort)
  #> [1] a b c d
  #> Levels: a b c d
  fct_relevel(f, rev)
  #> [1] a b c d
  #> Levels: a d c b
- as_factor() now has a numeric method which orders factors in numeric order, unlike the other methods which default to order of appearance.
  y <- c("1.1", "11", "2.2", "22")
  as_factor(y)
  #> [1] 1.1 11 2.2 22
  #> Levels: 1.1 11 2.2 22
  z <- as.numeric(y)
  as_factor(z)
  #> [1] 1.1 11 2.2 22
  #> Levels: 1.1 2.2 11 22
- fct_inseq() reorders the levels of a factor by their numeric value, when the levels can be interpreted as numbers.
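fct_inseq() is the only item in this list without an example, so here is a minimal sketch. The factor f is made up for illustration, and its levels are set explicitly so the starting order does not depend on locale.

```r
library(forcats)

# A factor whose levels are numbers stored as text, in a non-numeric order.
f <- factor(c("10", "2", "1", "2"), levels = c("10", "2", "1"))
levels(f)

# fct_inseq() reorders the levels by their numeric value; the data are unchanged.
g <- fct_inseq(f)
levels(g)
```

This is mainly useful when numeric codes have been read in as strings, where alphabetical ordering would place "10" before "2".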
Thanks to Emily Robinson, forcats also has a new introductory vignette.
Acknowledgements
We’re grateful for the 35 people who contributed to this release: @ahaque-utd, @AmeliaMN, @ashiklom, @batpigandme, @billdenney, @brianwdavis, @corybrunson, @dalewsteele, @ewenharrison, @grayskripko, @gtm19, @hack-r, @hadley, @huftis, @isteves, @jimhester, @jonocarroll, @jrosen48, @jthomasmock, @kbodwin, @mdjeric, @orchid00, @richierocks, @robinsones, @rosedu1, @RoyalTS, @russHyde, @Ryo-N7, @s-fleck, @seaaan, @spedygiorgio, @tslumley, @xuhuizhang, @zhiiiyang, and @zx8754.