We’re happy to announce the release of processx 3.9.0. processx is an R package to run and manage system processes.
You can install it from CRAN with:
install.packages("processx")
This blog post discusses the major new features in processx 3.9.0. You can see a full list of changes in the release notes.
Pipelines
The new pipeline class lets you connect two or more processes with kernel-level pipes, exactly like a Unix shell pipeline (cmd1 | cmd2 | cmd3): data flows directly between child processes without passing through R.
pl <- pipeline$new(
list(c("sort"), c("uniq", "-c"), c("sort", "-rn")),
stdin = "|",
stdout = "|"
)
pl$write_input("banana\napple\nbanana\norange\napple\nbanana\n")
pl$close_input()
#> NULL
pl$read_all_output_lines()
#> [1] " 3 banana" " 2 apple" " 1 orange"
pl$wait()
pl$get_exit_statuses()
#> [[1]]
#> [1] 0
#>
#> [[2]]
#> [1] 0
#>
#> [[3]]
#> [1] 0
The pipeline$new() constructor takes a list of character vectors, one per command, along with the usual stdin, stdout, and stderr arguments. These apply to the ends of the pipeline: stdin connects to the first process and stdout reads from the last, while the stderr setting applies to every process in the pipeline.
The key benefit over calling run() in sequence is efficiency: intermediate data never materialises in R. A pipeline processing gigabytes of log lines uses the same small kernel buffers as a shell pipeline would.
Because each step in the pipeline is a regular process object under the hood, you can access individual processes via $get_processes() — useful for reading per-process stderr or checking exit codes when a stage fails.
pipeline works on Unix and Windows and is currently experimental: the API may still change.
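As a rough sketch of how the exit statuses might be used to locate a failing stage: failed_stages() below is a hypothetical helper, not part of processx, and it assumes $get_exit_statuses() returns a list with one integer (or NULL) per stage, as in the output shown earlier. The usage part also assumes $get_processes() returns the process objects as a list in stage order.

```r
# Hypothetical helper (not part of processx): return the positions of
# pipeline stages whose exit status is non-zero or unavailable.
failed_stages <- function(statuses) {
  codes <- vapply(
    statuses,
    function(s) if (is.null(s)) NA_integer_ else as.integer(s),
    integer(1)
  )
  which(is.na(codes) | codes != 0L)
}

# Usage sketch; assumes processx >= 3.9.0 and a Unix-like system.
if (requireNamespace("processx", quietly = TRUE) &&
    utils::packageVersion("processx") >= "3.9.0" &&
    .Platform$OS.type == "unix") {
  pl <- processx::pipeline$new(
    list(c("sort"), c("uniq", "-c")),
    stdin = "|",
    stdout = "|"
  )
  pl$write_input("b\na\nb\n")
  pl$close_input()
  pl$read_all_output_lines()
  pl$wait()
  bad <- failed_stages(pl$get_exit_statuses())
  # Assumption: stages listed in `bad` can be inspected with
  # pl$get_processes()[bad].
}
```

Keeping the status check in a plain function makes it easy to reuse in scripts that should stop as soon as any stage fails.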
Pseudo-terminal support
processx::run(pty = TRUE)
Many command-line tools behave differently when their output is not connected to a terminal: they disable colour, turn off progress bars, or buffer output more aggressively. The pty = TRUE option runs a process inside a pseudo-terminal so it sees a real terminal — colour and interactive behaviour included.
run() now supports pty = TRUE directly:
out <- run("ls", c("--color", path.expand("~/works/processx")), pty = TRUE)
cat(out$stdout)
#> DESCRIPTION NAMESPACE README.md inst tests
#> LICENSE NEWS.md _pkgdown.yml man tools
#> LICENSE.md R air.toml processx.Rproj vignettes
#> Makefile README.Rmd codecov.yml src
When pty = TRUE, stderr is merged into stdout (the result’s $stderr is always NULL), because a PTY has a single stream. You can also supply a file path as stdin; its contents are fed to the process via the PTY master, followed by an EOF signal.
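One way to see the difference a PTY makes is to ask a shell whether its standard output is a terminal. The sketch below assumes a Unix-like system with sh available; is_tty_probe is an illustrative name, and because the exact line endings returned under a PTY are not asserted here, the output is trimmed before printing.

```r
# Shell snippet: prints "tty" if stdout is a terminal, "pipe" otherwise.
is_tty_probe <- c("-c", "if [ -t 1 ]; then echo tty; else echo pipe; fi")

# Assumes processx >= 3.9.0 (for run(pty = TRUE)) and a Unix-like system.
if (requireNamespace("processx", quietly = TRUE) &&
    utils::packageVersion("processx") >= "3.9.0" &&
    .Platform$OS.type == "unix") {
  # Default: stdout is a pipe back to R.
  plain <- processx::run("sh", is_tty_probe)
  print(trimws(plain$stdout))

  # With a pseudo-terminal the same test should see a terminal instead.
  in_pty <- processx::run("sh", is_tty_probe, pty = TRUE)
  print(trimws(in_pty$stdout))
}
```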
Windows support
processx 3.9.0 adds support for pseudo-terminals (PTYs) on Windows, starting from Windows 10 version 1809. The Windows implementation uses the ConPTY API (CreatePseudoConsole), loaded dynamically so processx continues to load on older Windows and emits a clear error if pty = TRUE is requested on an unsupported version.
Other improvements
New process cleanup article
A new article, Process cleanup, documents all five mechanisms processx provides for ensuring subprocesses don’t outlive their intended scope:
- Explicit cleanup with on.exit(): always deterministic.
- Automatic cleanup on garbage collection (cleanup = TRUE, the default).
- Process-tree cleanup (cleanup_tree = TRUE).
- Linux parent-death signal (linux_pdeathsig): Linux only, handles R crashes.
- Supervisor process (supervise = TRUE): all platforms, handles R crashes.
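The first mechanism in the list, explicit cleanup, might look like the following minimal sketch. It is not taken from the article itself; with_sleeper() is a hypothetical function name, and the example assumes a Unix-like system where sleep is available.

```r
# Start a long-running child, use it, and make sure it is killed on exit.
with_sleeper <- function() {
  p <- processx::process$new("sleep", "100", cleanup_tree = TRUE)
  # Runs whether the function returns normally or signals an error.
  on.exit(p$kill(), add = TRUE)
  p$is_alive()
}

if (requireNamespace("processx", quietly = TRUE) &&
    .Platform$OS.type == "unix") {
  with_sleeper()
}
```

Registering the on.exit() handler immediately after creating the process keeps the window in which an error could leak the child as small as possible.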
Death signal support on Linux
On Linux, you can now tell the kernel to deliver a signal to the child process automatically if the parent R process exits — even if R crashes. Set linux_pdeathsig = TRUE to send SIGTERM, or pass an integer signal number directly:
p <- process$new("sleep", "100", linux_pdeathsig = TRUE)
This is useful when you want child processes to clean up after an R crash, without the overhead of running a supervisor. The argument is silently ignored on macOS and Windows.
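For the integer form, a minimal sketch: it assumes a Linux host and processx 3.9.0 or later, and takes the signal number from the tools::SIGKILL constant in R's tools package rather than hard-coding it.

```r
# Signal number to deliver to the child when the parent R process dies.
sig <- tools::SIGKILL

if (requireNamespace("processx", quietly = TRUE) &&
    utils::packageVersion("processx") >= "3.9.0" &&
    identical(Sys.info()[["sysname"]], "Linux")) {
  # Ask the kernel to send SIGKILL to the child if this R process exits.
  p <- processx::process$new("sleep", "100", linux_pdeathsig = sig)
  p$kill()  # tidy up the example process
}
```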
Record the time when a process exits
process$get_end_time() returns the time when the process exited as a POSIXct, or NULL if it is still running. This makes it straightforward to measure wall-clock duration without having to record timestamps yourself:
p <- process$new("sleep", "1")
p$wait()
p$get_end_time() - p$get_start_time()
#> Time difference of 1.010295 secs
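Since get_end_time() returns NULL for a process that is still running, a small wrapper can fall back to the current time. process_elapsed() is a hypothetical helper written for this post, not a processx function.

```r
# Hypothetical helper: elapsed wall-clock time of a processx process.
# While the process is still running, get_end_time() is NULL, so the
# current time is used as the end point instead.
process_elapsed <- function(p) {
  end <- p$get_end_time()
  if (is.null(end)) end <- Sys.time()
  difftime(end, p$get_start_time(), units = "secs")
}
```

Calling process_elapsed(p) on the finished process from the example above should give the same duration as the manual subtraction.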
Append stdout/stderr to files
process$new() and run() now support ">>" as a prefix for stdout and stderr file paths to append output instead of truncating the file:
log <- tempfile()
run("echo", args = "first line", stdout = log)
#> $status
#> [1] 0
#>
#> $stdout
#> NULL
#>
#> $stderr
#> [1] ""
#>
#> $timeout
#> [1] FALSE
run("echo", args = "second line", stdout = paste0(">>", log))
#> $status
#> [1] 0
#>
#> $stdout
#> NULL
#>
#> $stderr
#> [1] ""
#>
#> $timeout
#> [1] FALSE
readLines(log)
#> [1] "first line" "second line"
This is handy when you run the same process repeatedly and want to accumulate output in a single log file.
Binary standard output and error
run() and process$new() now support encoding = "binary" to capture raw bytes. In binary mode, run() returns stdout and stderr as raw vectors, and process$read_output() / process$read_error() return raw vectors rather than character strings. All bytes are preserved exactly, including null bytes and non-UTF-8 sequences.
result <- run("cat", args = "/bin/ls", encoding = "binary")
typeof(result$stdout)
#> [1] "raw"
length(result$stdout)
#> [1] 154624
Two new methods, process$read_output_bytes() and process$read_error_bytes(), and the conn_read_bytes() function, provide direct access to raw bytes from processx connections.
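To check the "preserved exactly" claim on your own system, a round-trip like the following sketch may help. It assumes a Unix-like system with cat available and processx 3.9.0 or later; the byte values are arbitrary and chosen to include a null byte and a byte that is not valid UTF-8.

```r
# Write a few awkward bytes to a file: a NUL and a byte that is not valid UTF-8.
src <- tempfile()
bytes <- as.raw(c(0x61, 0x00, 0xff, 0x62))
writeBin(bytes, src)

if (requireNamespace("processx", quietly = TRUE) &&
    utils::packageVersion("processx") >= "3.9.0" &&
    .Platform$OS.type == "unix") {
  res <- processx::run("cat", src, encoding = "binary")
  # Compare what came back through the pipe with what was written to disk.
  print(identical(res$stdout, bytes))
}
```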
Acknowledgements
Thanks to everyone who contributed to processx 3.9.0 through code, issues, testing, and feedback:
@advieser, @cderv, @chwpearse, @HenrikBengtsson, @king-of-poppk, @r2evans, @sckott, @sda030, @stupidpupil, and @Yunuuuu.