lubridate 1.7.0

  tidyverse, lubridate

  Vitalie Spinu

We are pleased to announce that lubridate 1.7.0 is now on CRAN! For a complete set of changes please see the release NEWS.

Lubridate is a package that makes working with date-time and time-span objects easier. It provides fast and user-friendly parsing of date-time strings, extraction and updating of the components of date-time objects (years, months, days, etc.) and algebraic manipulation of date-time and time-span objects.

Here is a brief walk-through of the prominent new features in 1.7.0.

Built-in CCTZ and much faster update and force_tz

As of this version, lubridate relies on Google’s CCTZ library for date-time updates and time-zone manipulation. As a result, force_tz, update, the rounding functions and a range of arithmetic operations on time spans are now considerably faster.
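If you want to see what this means on your own machine, a rough timing sketch along the following lines should do. The input vector is synthetic and the elapsed times will vary with hardware and lubridate version, so no expected numbers are shown.

```r
library(lubridate)

# Synthetic input: one million instants, one second apart, in UTC.
x <- ymd_hms("2017-01-01 00:00:00") + 0:999999

# Time two of the operations that now go through CCTZ.
t_force  <- system.time(forced  <- force_tz(x, tzone = "America/New_York"))
t_update <- system.time(updated <- update(x, minute = 10, second = 3))

# Elapsed wall-clock seconds; machine dependent.
print(t_force[["elapsed"]])
print(t_update[["elapsed"]])
```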

Vectorized time-zone manipulation

Often date-time data comes with heterogeneous time-zones. For example, you might have a character vector of date-times and a separate vector of time-zones. In such cases you can parse your date-time vector in one time zone (UTC) and then “enforce” the heterogeneous time-zones with force_tzs.

x <- ymd_hms(c("2009-08-01 00:00:00", "2009-08-01 00:00:00"))
tzs <- c("America/New_York", "Europe/Amsterdam")

force_tzs(x, tzones = tzs)
##=> [1] "2009-08-01 04:00:00 UTC" "2009-07-31 22:00:00 UTC"

force_tzs(x, tzones = tzs, tzone_out = "America/New_York")
##=> [1] "2009-08-01 00:00:00 EDT" "2009-07-31 18:00:00 EDT"

Note that the first force_tzs call produced a vector of instants in the UTC time zone. In the second call we specified the desired time zone of the output vector with the tzone_out argument. This is needed because R’s date-time vectors cannot represent heterogeneous time zones. For the same reason, with_tzs, the vectorized counterpart of with_tz, does not exist. Instead, local_time should cover the with_tzs use case in most situations.
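As a small sketch of the "separate vector of time zones" scenario, the snippet below takes a hypothetical data frame (the `events` name and its columns are made up for illustration) and checks that force_tzs agrees with applying force_tz to each element by hand.

```r
library(lubridate)

# Hypothetical input: local wall-clock timestamps plus a per-row time zone.
events <- data.frame(
  local = c("2009-08-01 00:00:00", "2009-08-01 00:00:00"),
  tz    = c("America/New_York", "Europe/Amsterdam"),
  stringsAsFactors = FALSE
)

# Parse everything as UTC first, then reinterpret each clock time in its own zone.
parsed  <- ymd_hms(events$local)
instant <- force_tzs(parsed, tzones = events$tz)

# Element-wise equivalent with the scalar force_tz, for comparison.
manual <- c(
  as.numeric(force_tz(parsed[1], "America/New_York")),
  as.numeric(force_tz(parsed[2], "Europe/Amsterdam"))
)

# Both approaches should describe the same instants.
all(as.numeric(instant) == manual)
```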

The new local_time function retrieves the clock time of day of the input vector in the specified time zones. It returns a difftime object and is vectorized over both the date-time and the time-zone arguments:

x <- ymd_hms(c("2009-08-01 01:02:03", "2009-08-01 10:20:30"))
local_time(x, units = "hours")
##=> Time differences in hours
##=> [1]  1.034167 10.341667

x <- ymd_hms(c("2009-08-01 00:00:00", "2009-08-01 00:00:00"))
tzs <- c("America/New_York", "Europe/Amsterdam")
local_time(x, tzs, units = "hours")
##=> Time differences in hours
##=> [1] 20  2
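One way local_time can stand in for a with_tzs-style workflow is filtering by local clock time. The sketch below flags instants that fall within 09:00 to 17:00 local time; the time zones and the "business hours" window are illustrative assumptions.

```r
library(lubridate)

# The same UTC instant observed in two different places.
x   <- ymd_hms(c("2009-08-01 14:00:00", "2009-08-01 14:00:00"))
tzs <- c("America/New_York", "Asia/Tokyo")

# Local clock time in hours since local midnight (10 in New York, 23 in Tokyo).
lt <- as.numeric(local_time(x, tzs, units = "hours"))

# TRUE where the instant falls inside 09:00-17:00 local time.
in_hours <- lt >= 9 & lt < 17
in_hours
##=> [1]  TRUE FALSE
```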

New cutoff_2000 parameter in parsing functions

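The lower-level lubridate parser functions (parse_date_time2, fast_strptime) now accept a cutoff_2000 parameter that determines whether a two-digit yy year is parsed into the 20th or the 21st century. Two-digit years up to and including cutoff_2000 are read as 20yy; larger values are read as 19yy.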

x <- c("50-01-01", "70-01-01")
parse_date_time2(x, "ymd")
##=> [1] "2050-01-01 UTC" "1970-01-01 UTC"
parse_date_time2(x, "ymd", cutoff_2000 = 30)
##=> [1] "1950-01-01 UTC" "1970-01-01 UTC"

By default cutoff_2000 is 68 to comply with R’s strptime function.

This feature was not propagated to the higher-level parsing functions because base strptime, on which those occasionally rely, doesn’t support this option.
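The same parameter works with fast_strptime. A minimal sketch, using lt = FALSE so that the result is POSIXct and an arbitrary cutoff of 49 chosen for illustration:

```r
library(lubridate)

x <- c("50-01-01", "70-01-01")

# Default cutoff (68): "50" is read as 2050, "70" as 1970.
default_years <- year(fast_strptime(x, "%y-%m-%d", lt = FALSE))
default_years
##=> [1] 2050 1970

# With cutoff_2000 = 49, "50" exceeds the cutoff and is read as 1950.
custom_years <- year(fast_strptime(x, "%y-%m-%d", lt = FALSE, cutoff_2000 = 49))
custom_years
##=> [1] 1950 1970
```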

Functions wday and month are now localized

When label = TRUE, the wday and month extractors return localized strings:

Sys.setlocale(locale = "zh_CN.utf8")
##=> [1] "LC_CTYPE=zh_CN.utf8;LC_NUMERIC=C;LC_TIME=zh_CN.utf8;LC_COLLATE=zh_CN.utf8;LC_MONETARY=zh_CN.utf8;LC_MESSAGES=C;LC_PAPER=en_GB.UTF-8;LC_NAME=C;LC_ADDRESS=C;LC_TELEPHONE=C;LC_MEASUREMENT=en_GB.UTF-8;LC_IDENTIFICATION=C"
wday(now(), label = TRUE)
##=> [1] 三
##=> Levels: 日 < 一 < 二 < 三 < 四 < 五 < 六
month(now(), label = TRUE)
##=> [1] 11月
##=> 12 Levels: 1月 < 2月 < 3月 < 4月 < 5月 < 6月 < 7月 < 8月 < ... < 12月

Please note that, because the labels depend on the locale, comparing against them is not recommended: code that works on your machine might not work on a computer with a different locale. Use the numeric output instead.
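A short sketch of the difference, using a fixed date so the result is reproducible. week_start is passed explicitly here so the numeric value does not depend on session options.

```r
library(lubridate)

x <- ymd("2017-11-15")  # a Wednesday

# Fragile: only TRUE in an English locale.
# wday(x, label = TRUE) == "Wed"

# Portable: numeric comparison, with Sunday counted as day 1.
is_wednesday <- wday(x, week_start = 7) == 4
is_november  <- month(x) == 11

c(is_wednesday, is_november)
##=> [1] TRUE TRUE
```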

Several functions gained the week_start parameter

Functions that depend on the week-start convention (wday, wday<-, floor_date, ceiling_date and round_date) now accept a week_start argument, which defaults to getOption("lubridate.week.start", 7).

x <- today()
wday(x, label = TRUE, abbr = FALSE)
##=> [1] 星期三
##=> 7 Levels: 星期日 < 星期一 < 星期二 < 星期三 < 星期四 < ... < 星期六
wday(x)
##=> [1] 4
wday(x, week_start = 2)
##=> [1] 2
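The same argument changes where the rounding functions place week boundaries. A minimal sketch with a fixed date, passing week_start explicitly in both calls:

```r
library(lubridate)

x <- ymd("2017-11-15")  # a Wednesday

# Weeks starting on Sunday: floor to the preceding Sunday.
sunday_start <- floor_date(x, "week", week_start = 7)
sunday_start
##=> [1] "2017-11-12"

# Weeks starting on Monday: floor to the preceding Monday.
monday_start <- floor_date(x, "week", week_start = 1)
monday_start
##=> [1] "2017-11-13"
```

To change the convention for a whole session rather than per call, set the option once, for example options(lubridate.week.start = 1).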

Rounding to fraction of a second

R’s date-time format (POSIXct) is accurate up to microseconds, so it makes sense to round to fractions of a second. Unfortunately, R currently prints fractional seconds incorrectly (it truncates the digits instead of rounding them), which can lead to confusion:

## print fractional seconds
options(digits.secs=6, digits = 13)

x <- ymd_hms("2009-08-03 12:01:59.031")
rx <- round_date(x, ".01sec")
cx <- ceiling_date(x, ".01sec")

c(rx, cx)
##=> [1] "2009-08-03 14:01:59.02 CEST" "2009-08-03 14:01:59.03 CEST"
as.double(c(rx, cx))
##=> [1] 1249300919.03 1249300919.04

So fractional rounding and ceiling do work; just don’t be taken aback by the incorrect printing ;)
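If you need to confirm a rounded value without relying on the printed timestamp, one option is to inspect the underlying number of seconds. This is a sketch only; the extra round() call is there to hide floating-point noise.

```r
library(lubridate)

x  <- ymd_hms("2009-08-03 12:01:59.031")
rx <- round_date(x, ".01sec")
cx <- ceiling_date(x, ".01sec")

# Fraction of a second past the whole second, rounded to two decimals.
frac <- function(t) round(as.numeric(t) - floor(as.numeric(t)), 2)

frac(rx)
##=> [1] 0.03
frac(cx)
##=> [1] 0.04
```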

New epiweek and epiyear functions

The new functions epiyear and epiweek implement the US CDC version of epidemiological weeks and years. They follow the same rules as isoweek and isoyear, but with the week starting on Sunday. In other parts of the world the convention is to start epidemiological weeks on Monday, which is then the same as for isoweek and isoyear.
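The difference shows up around week boundaries. As a small illustration, take a Sunday at the turn of the year:

```r
library(lubridate)

x <- ymd("2017-01-01")  # a Sunday

# ISO weeks start on Monday, so this Sunday closes the last ISO week of 2016.
iso <- c(isoyear(x), isoweek(x))
iso
##=> [1] 2016   52

# US CDC weeks start on Sunday, so the same day opens week 1 of 2017.
epi <- c(epiyear(x), epiweek(x))
epi
##=> [1] 2017    1
```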