readr 1.4.0 is now available on CRAN! Learn more about readr at https://readr.tidyverse.org. Detailed notes are always in the change log.
The readr package makes it easy to get rectangular data out of comma-separated (csv), tab-separated (tsv) or fixed-width files (fwf) and into R. It is designed to flexibly parse many types of data found in the wild, while still failing cleanly when data unexpectedly changes. If you are new to readr, the best place to start is the data import chapter in R for Data Science.
Install readr with
install.packages("readr")
And load it with
library(tidyverse)
#> ── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
#> ✔ ggplot2 3.3.2 ✔ purrr 0.3.4
#> ✔ tibble 3.0.3 ✔ dplyr 1.0.2
#> ✔ tidyr 1.1.2 ✔ stringr 1.4.0
#> ✔ readr 1.4.0 ✔ forcats 0.5.0
#> ── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
#> ✖ dplyr::filter() masks stats::filter()
#> ✖ dplyr::lag() masks stats::lag()
Breaking changes
Argument name consistency
The first argument to all of the write_() functions, like write_csv(), had previously been path. However, the first argument to all of the read_() functions is file. As of readr 1.4.0, the first argument to both read_() and write_() functions is file, and path is now deprecated.
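As a minimal sketch of the renamed argument, the following writes and re-reads a file using file = in both calls. The temporary file and the use of the built-in mtcars data are assumptions made purely for illustration.

```r
library(readr)

# Hypothetical output location; any writable path would do.
out <- tempfile(fileext = ".csv")

# `file` is now the first argument of the write functions,
# matching the read functions.
write_csv(mtcars, file = out)

# Read the data back using the same argument name.
# col_types = cols() lets readr guess the types without printing them.
cars <- read_csv(file = out, col_types = cols())
```

Because path is deprecated rather than removed, existing calls that name it explicitly are the ones worth updating to file.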
NaN behavior
Some floating point operations can produce a NaN value, e.g. 0 / 0. Previously write_csv() would always output NaN values as NaN, and this could not be controlled by the write_csv(na=) argument. Now NaN is written the same way as NA, and its output can be controlled by that argument. This is a breaking change in that the same code would produce different output, but it should be rare in practice.
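One way to see the new behavior is with format_csv(), which returns the CSV text as a string rather than writing a file. This is only a sketch: the data frame and the "missing" placeholder are made up for illustration, and under the behavior described above both the NaN and the NA row should be written as the na string.

```r
library(readr)

# Hypothetical data: one ordinary value, one NaN (from 0 / 0), one NA.
df <- data.frame(x = c(1, 0 / 0, NA))

# Render the data frame as CSV text, using a custom string for missing values.
csv_text <- format_csv(df, na = "missing")
cat(csv_text)
```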
New features
Generate column specifications from datasets
Using as.col_spec() on any data.frame or tibble object will now generate a column specification with the column types in the data.
library(palmerpenguins)
spec <- as.col_spec(penguins)
spec
#> cols(
#> species = col_factor(levels = c("Adelie", "Chinstrap", "Gentoo"), ordered = FALSE, include_na = FALSE),
#> island = col_factor(levels = c("Biscoe", "Dream", "Torgersen"), ordered = FALSE, include_na = FALSE),
#> bill_length_mm = col_double(),
#> bill_depth_mm = col_double(),
#> flipper_length_mm = col_integer(),
#> body_mass_g = col_integer(),
#> sex = col_factor(levels = c("female", "male"), ordered = FALSE, include_na = FALSE),
#> year = col_integer()
#> )
You can also convert the column specifications to a condensed textual representation with as.character()
as.character(spec)
#> [1] "ffddiifi"
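A generated specification can also be passed back to a read function through col_types, so that a file written from a data frame is parsed with the same column types. The sketch below assumes a small made-up data frame and a temporary file; it is one possible use of the feature, not the only one.

```r
library(readr)

# Hypothetical example data with integer, double, and character columns.
df <- data.frame(
  id = 1:3,
  score = c(1.5, 2.5, 3.5),
  label = c("a", "b", "c"),
  stringsAsFactors = FALSE
)

# Derive the column specification from the data and its compact form.
spec <- as.col_spec(df)
compact <- as.character(spec)

out <- tempfile(fileext = ".csv")
write_csv(df, file = out)

# Reuse the generated specification when reading the file back,
# so no type guessing is needed.
df2 <- read_csv(file = out, col_types = spec)
```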
Writing end of line characters
Write functions now take an eol argument to allow control of the end of line characters. Previously readr only supported using a single newline (\n) character. You can now specify any number of characters, though the Windows carriage return and line feed sequence (\r\n) is by far the most common alternative.
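For instance, a file with Windows-style line endings could be written as sketched below. The data frame and temporary file are assumptions for illustration; the raw text is read back with base R only to inspect the line endings.

```r
library(readr)

# Hypothetical data and output location.
df <- data.frame(x = 1:2)
out <- tempfile(fileext = ".csv")

# Write with "\r\n" line endings instead of the default "\n".
write_csv(df, file = out, eol = "\r\n")

# Read the raw text back so the line endings can be checked.
txt <- readChar(out, nchars = file.size(out), useBytes = TRUE)
```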
cli package is now used for messages
The cli package is now used for messages. The most prominent place you will notice this is when printing the column specifications. Previously these functions used message(), which in RStudio prints the text in red.
While cli still uses message objects, they will now be more naturally colored, which hopefully will make them easier to read.
Rcpp dependency removed
The Rcpp dependency has been removed in favor of cpp11. Compiling readr should now take less time and use less memory.
Acknowledgements
As usual, there were many additional changes and bugfixes included in this release; see the change log for details.
Thank you to the 132 contributors who made this release possible by opening issues or submitting pull requests: @adamroyjones, @aetiologicCanada, @ailich, @antoine-sachet, @archenemies, @ashuchawla, @Athanasiamo, @bastianilso, @batpigandme, @Ben-Cox, @bergen288, @boshek, @bovender, @bransonf, @brianrice2, @briatte, @c30saux, @cboettig, @cderv, @cdhowe, @ceresek, @charliejhadley, @chipkoziara, @cwolk, @damianooldoni, @dan-reznik, @DanielleQuinn, @DarwinAwardWinner, @dhmontgomery, @djbirke, @dkahle, @dmitrienka, @dmurdoch, @dpprdan, @dwachsmuth, @EarlGlynn, @edo91, @ellessenne, @Fernal73, @firasm, @fjuniorr, @frahimov, @frousseu, @GegznaV, @georgevbsantiago, @geotheory, @greg-minshall, @hadley, @hidekoji, @huashan, @ifendo, @ijlyttle, @isaactpetersen, @jangorecki, @jdblischak, @jemunro, @jennahamlin, @jesse-ross, @jimhester, @jmarshallnz, @jmcloughlin, @jmobrien, @jnolis, @jokedurnez, @jpwhitney, @jssa98, @juangomezduaso, @junqi108, @JustGitting, @jxu, @kainhofer, @katgit, @kbzsl, @keesdeschepper, @kiernann, @knausb, @krlmlr, @kvittingseerup, @lambdamoses, @leopoldsw, @lsaravia, @MihaiBabiac, @mkearney, @mlaunois, @mmuurr, @moodymudskipper, @MZellou, @nacnudus, @natecobb, @NFA, @NikKrieger, @njtierney, @nogeel, @orderlyquant, @oscci, @Ozan147, @pcgreen7, @perog, @phil-grayson, @pralitp, @psychelzh, @QuLogic, @r2evans, @Rajesh-Ramasamy, @ralsouza, @rcragun, @romainfrancois, @salim-b, @sfrenk, @Shians, @shrektan, @skaltman, @sonhan18, @StevenMMortimer, @thays42, @ThePrez, @tmalsburg, @TrentLobdell, @ttimbers, @vnijs, @wch, @we-hop, @wehopkins, @wibeasley, @wolski, @wwgordon, @xianwenchen, @xiaodaigh, @xinyue-li, @yutannihilation, @Zack-83, and @zenggyu.