plumber2 0.2.0

We’re stoked to announce the release of plumber2 0.2.0. plumber2 is a package for creating webservers in R based on either an annotation-based or programmatic workflow. It is the successor to the plumber package, which has empowered the R community for 10 years and allowed its members to share their R-based functionality with their organizations and the world.

You can install it from CRAN with:

pak::pak("plumber2")

This release brings a bunch of new features as well as some tangible performance improvements. The headlining features are OpenTelemetry (OTEL) support and support for authentication, both of which we will dive into below. At the end we also provide a grab-bag of miscellaneous improvements for your enjoyment.

You can see a full list of changes in the release notes.

library(plumber2)

OTEL support

We have been hard at work adding support for OpenTelemetry (OTEL) to our tools to allow easy instrumentation across our offerings; see e.g. the shiny blog post announcing support for it there. If you do not know what OTEL is, here is a short introduction to the subject:

OTEL describes itself as “high-quality, ubiquitous, and portable telemetry to enable effective observability”. In simpler terms, OpenTelemetry is a set of tools, APIs, and SDKs that help you collect and export telemetry data (like traces, logs, and metrics) from your applications. This data provides insights into how your applications are performing and behaving in real-world scenarios.

It captures three key types of data:

Traces: These show the path of a request through your application.
Logs: These are detailed event records that capture what happened at specific moments.
Metrics: These are numerical measurements over time, like how many users are connected or how long outputs take to render.

These data types were standardized under the OTEL project, which is supported by a large community and many companies. The goal is to provide a consistent way to collect and export observability data, making it easier to monitor and troubleshoot applications.

OTEL is vendor-neutral, meaning you can send your telemetry data to various local backends like Jaeger, Zipkin, Prometheus, or cloud-based services like Grafana Cloud, Logfire, and Langfuse. This flexibility means you’re not locked into any particular monitoring solution.

While that may be somewhat of a mouthful, the tl;dr is that with OTEL you can capture what goes on in your application and use a variety of services to explore this data. This is especially useful for code that is meant to be deployed and thus not readily available for introspection.

A great thing about OTEL is that traces are linked across applications. If you have multiple linked microservices based on plumber2, then you can follow a request trace as it travels between the different APIs. The same goes for a shiny app that calls into a plumber2 API or the other way around. As we build out support across our tools this benefit will only become more pronounced.

OTEL in plumber2

While OTEL is integrated into plumber2, it is not activated by default. To set it up you need the otel and otelsdk packages installed and configured:

pak::pak(c("otel", "otelsdk"))

Configuration is completely code free and based on environment variables. You can, for example, add the lines below to your .Renviron file to set up OTEL with Logfire:

# Enable OpenTelemetry by setting Collector environment variables
OTEL_TRACES_EXPORTER=http
OTEL_LOGS_EXPORTER=http
OTEL_LOG_LEVEL=debug
OTEL_METRICS_EXPORTER=http

OTEL_EXPORTER_OTLP_ENDPOINT="https://logfire-us.pydantic.dev"
OTEL_EXPORTER_OTLP_HEADERS="Authorization=<your-write-token>"

You can verify that everything is set up by calling otel::is_tracing_enabled() which should return TRUE in that case.
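Before starting the API it can also help to confirm that the environment variables were actually picked up by the R session. The helper below is a minimal sketch: check_otel_env() is a hypothetical name and not part of plumber2 or otel, and it only reports which of the variables from the .Renviron example are set to a non-empty value.

```r
# Hypothetical helper: report which OTEL-related environment variables are set
check_otel_env <- function(
  vars = c(
    "OTEL_TRACES_EXPORTER",
    "OTEL_LOGS_EXPORTER",
    "OTEL_METRICS_EXPORTER",
    "OTEL_EXPORTER_OTLP_ENDPOINT",
    "OTEL_EXPORTER_OTLP_HEADERS"
  )
) {
  values <- Sys.getenv(vars, unset = NA_character_)
  # TRUE for variables that are set to a non-empty value
  stats::setNames(!is.na(values) & nzchar(values), vars)
}

check_otel_env()

# If the otel package is installed, ask it directly as well
if (requireNamespace("otel", quietly = TRUE)) {
  otel::is_tracing_enabled()
}
```

Remember that changes to .Renviron only take effect after restarting the R session.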

OTEL has an extensive list of semantic conventions for telemetry of various domains so that information is captured in a standardised way. plumber2 adheres to the HTTP server conventions and supports all the required and most of the recommended trace attributes and metrics.

Within a plumber2 API, a trace span is started the moment a request is received. The span is populated with the following information:

http.request.method: The method of the request (e.g. GET, POST, etc)
url.path: The exact path requested
url.scheme: The protocol used for the request
http.route: The route pattern of the last of the route handlers the request went through
network.protocol.name: The internal protocol used. Always http
network.protocol.version: The version of the protocol. Always 1.1
server.port: The port the server is listening on. Can be used to distinguish multiple concurrent servers
url.query: The querystring of the request
client.address: The IP address the request comes from
server.address: The address the request was sent to
user_agent.original: The user agent of the client sending the request
http.request.header.<header-name>: The value of header-name in the request. E.g. http.request.header.date will contain the value of the Date header

Once the request has been handled it will further append the following information:

http.response.status_code: The status code of the response
http.response.header.<header-name>: The value of header-name in the response. E.g. http.response.header.content-type will contain the value of the Content-Type header

In addition to the trace attributes above, a number of OTEL metrics are also recorded:

http.server.request.duration: The duration of the request handling, from when the request is received until the response is sent back
http.server.active_requests: The number of active requests being handled at the given time
http.server.request.body.size: The size of the request body
http.server.response.body.size: The size of the response body

As a child of this parent span each handler in your API will also initiate a span with the following attributes:

routr.route: The path pattern of the handler. This will be recorded in the routr representation which uses :param instead of {param} format (e.g. users/:username instead of users/{username})
routr.path.param.<param-name>: The value of the param-name path parameter. E.g. a request for users/thomas will get a routr.path.param.username attribute with the value thomas for the route users/{username}.
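To make the difference between the two path notations concrete, here is a small base R sketch that rewrites the {param} form into the :param form. to_routr_pattern() is a hypothetical helper written for illustration; it is not how plumber2 or routr performs the conversion, and it ignores any additional path syntax those packages may support.

```r
# Hypothetical helper: rewrite "{param}" placeholders to the ":param" form
to_routr_pattern <- function(path) {
  gsub("\\{([^}]+)\\}", ":\\1", path, perl = TRUE)
}

to_routr_pattern("users/{username}")
# → "users/:username"
```

This is the form to search for when filtering handler spans by routr.route in your telemetry backend.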

Any span you initiate inside a handler will become a child of the handler span and through that be linked to the parent request span.
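As a rough sketch of what such a custom span could look like, the handler below starts its own span with otel::start_local_active_span() from the otel package. The /summary path and the span name are made up for illustration, and the sketch assumes the span is ended automatically when the handler function exits.

```r
#* Summarise the mtcars data
#*
#* @get /summary
function() {
  # Assumption: the span is activated for this function call and ends on exit,
  # making it a child of the handler span created by plumber2
  otel::start_local_active_span("compute summary")

  list(mean_mpg = mean(mtcars$mpg))
}
```

Wrapping the slow parts of a handler in their own spans makes it easier to see where the time of a request is actually spent.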

As you can see, the integration provides extensive information for you to use when figuring out what is going on in your application. On top of that, you can also use OTEL for logging by setting logger_otel as your logger:

api() |> 
  api_logger(logger_otel)

This ensures that the logs from errors, warnings, etc. all end up in the same place as your other recordings and are linked to the exact request that gave rise to them.

We truly believe extensive OTEL support across the ecosystem will be a game changer for deployed R code and we can’t wait for our users to take advantage of it!

Auth support

The second headliner is support for various authentication schemes out of the box. This comes courtesy of the fireproof package, which provides an auth plugin for fiery.

Setting up authentication is twofold: creating guards and attaching guards to routes.

First, you need to define one or more guards to use. A guard is an adaptation of a specific authentication scheme, such as OAuth. Currently, fireproof supports the Basic and Bearer HTTP authorization schemes, a custom key-based scheme, as well as OAuth 2.0 and OpenID Connect. Setting up a guard can be done both programmatically and with annotations:

# Programmatic
api <- api() |> 
  api_auth_guard(
    guard = fireproof::guard_key(
      key_name = "X-API-KEY",
      validate = "MY_VERY_SECRET_KEY"
    ),
    name = "key_guard"
  )

# Annotation

#* @authGuard key_guard
fireproof::guard_key(
  key_name = "X-API-KEY",
  validate = "MY_VERY_SECRET_KEY"
)

Both of these pieces of code yield the same result. Your API now has a guard registered under the name key_guard which will (if called upon) check a request for the existence of a cookie named X-API-KEY with the value MY_VERY_SECRET_KEY.

Secondly, your handlers can now integrate the guards to protect access to the requested path. Again, this can be done both programmatically and with annotations, and will generally be handled when the request handler is created:

# Programmatic
api |> 
  api_get(
    path = "/admin",
    function(...) {
      # whatever you wish to protect
    },
    auth_flow = key_guard
  )

# Annotation

#* An example endpoint with auth
#* 
#* @get /admin
#* @auth key_guard
function(...) {
  # whatever you wish to protect
}

Again, both code chunks achieve the same thing. They set up the endpoint to require the key_guard to be passed before further handling takes place.

Multiple guards and requirements

The previous section demonstrates the most basic authentication setup, as it only uses the key guard, which is the simplest guard to configure. We can imagine a situation where we want to allow users to either log in with a username and password or authorize with both a key and a Google login. This requires defining multiple guards, which can be done in sequence:

#* @authGuard key
fireproof::guard_key(
  key_name = "X-API-KEY",
  validate = "MY_VERY_SECRET_KEY"
)
#* @authGuard basic
fireproof::guard_basic(
  validate = function(username, password) {
    username == "thomas" && password == "xrCy45rWrgwq"
  }
)
#* @authGuard google
fireproof::guard_google(
  redirect_url = "https://example.com/auth",
  client_id = "MY_APP_ID",
  client_secret = "SUCHASECRET"
)

We now have 3 guards (of dubious quality) that we can attach to our handler. How do we capture the requirement that either the basic guard passes or both the key and google guards pass? Simple, with a logical expression:

#* An example endpoint with auth
#* 
#* @get /admin
#* @auth basic || (key && google)
function(...) {
  # whatever you wish to protect
}

The names of the guards act as booleans and can be composed with the basic boolean operators (||, &&, and parentheses for grouping). The combinations are endless!
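To illustrate the semantics of such an expression, the base R sketch below evaluates the same flow against a set of per-guard outcomes. flow_passes() is a hypothetical helper and says nothing about how plumber2 or fireproof evaluates guards internally; it only shows which combinations of passing guards satisfy the expression.

```r
# The flow from the annotation, captured as an unevaluated expression
flow <- quote(basic || (key && google))

# Hypothetical helper: evaluate a flow against named per-guard outcomes
flow_passes <- function(flow, outcomes) {
  isTRUE(eval(flow, outcomes))
}

flow_passes(flow, list(basic = FALSE, key = TRUE, google = TRUE))   # → TRUE
flow_passes(flow, list(basic = FALSE, key = TRUE, google = FALSE))  # → FALSE
flow_passes(flow, list(basic = TRUE, key = FALSE, google = FALSE))  # → TRUE
```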

Scopes

Sometimes you need more granularity in your authentication. Some users may only be allowed to read resources while others may both read and write. This could be solved with multiple guards, but that quickly becomes unwieldy. Instead you can set scope requirements on an endpoint. Guards can then grant scopes to a user in their validate function by returning a character vector instead of a boolean, like this:

#* @authGuard basic
fireproof::guard_basic(
  validate = function(username, password) {
    if (username == "guest") {
      return("read")
    }
    if (username == "thomas" && password == "xrCy45rWrgwq") {
      return(c("read", "write"))
    }
    FALSE
  }
)

#* Read the calendar entries
#* 
#* @get /calendar
#* @auth basic
#* @authScope read
#* 
function(...) {
  # return calendar entries
}

#* Add a new calendar entry
#* 
#* @post /calendar
#* @auth basic
#* @authScope write
#* 
function(...) {
  # update the calendar
}
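The relationship between granted and required scopes can be sketched in plain R. The validate function is the one from the guard above; has_scopes() is a hypothetical helper that assumes every scope required by an endpoint must be among those granted to the user. It is an illustration of the idea, not the check plumber2 or fireproof performs.

```r
# The validate function from the basic guard
validate <- function(username, password) {
  if (username == "guest") {
    return("read")
  }
  if (username == "thomas" && password == "xrCy45rWrgwq") {
    return(c("read", "write"))
  }
  FALSE
}

# Hypothetical helper: does the validation result cover the required scopes?
# Assumption: all required scopes must be present among the granted ones
has_scopes <- function(result, required) {
  is.character(result) && all(required %in% result)
}

has_scopes(validate("guest", ""), "read")                # → TRUE
has_scopes(validate("guest", ""), "write")               # → FALSE
has_scopes(validate("thomas", "xrCy45rWrgwq"), "write")  # → TRUE
has_scopes(validate("someone", "else"), "read")          # → FALSE
```

Under this reading, the guest user could call the GET endpoint but would be rejected by the POST endpoint.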

The authentication that can be integrated is very flexible and will only grow as more guards are added to fireproof.

Other news

Annotation for datastores

While datastores through the firesale package were supported upon release, they could only be set up programmatically. This has now been corrected with the addition of the @datastore tag. It works like this:

#* @datastore my_store
storr::driver_environment()

The my_store following the tag is optional and gives the name of the datastore (defaults to datastore). Below the block you provide a storr driver and then you are good to go.

Authentication requires a datastore in order to work as it facilitates persistent session login. Below, you can see an annotation implementation of a single guard that leverages a storr datastore.

#* @datastore ds
storr::driver_environment()

#* @authGuard github
fireproof::guard_github(
  redirect_url = "https://example.com/auth",
  client_id = "MY_APP_ID",
  client_secret = "SUCHASECRET"
)

#* Get a summary of your github commit history
#* 
#* @auth github
function(ds) {
  github_token <- ds$session$github$token$access_token
  # Use the access token to fetch commit history and do some fun things
}

More powerful report support

The report endpoint has gotten even more powerful in this release in a number of ways:

Report endpoints can now be added programmatically as well using api_report()
There is now support for Quarto documents using the Jupyter engine
OpenAPI documentation is now generated automatically for the report and incorporates the standard annotation known from request handler blocks.
Parameterised reports now have their parameters type checked and cast based on the type of the default values or on explicit type specification in the @param tags.
You can now request specific named output formats through the /{output_format} subpath. This is in addition to the content negotiation already available. E.g. /report/revealjs will request the revealjs format of the report served at /report.
Caches can now be user specific if the rendering includes information specific to the user requesting it
Caches can now be cleared using a DELETE request
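The idea of casting report parameters based on the type of their default values can be sketched in base R. cast_params() below is a hypothetical helper for illustration only, not the implementation used by plumber2: it assumes each incoming value is a single string and fails when a value cannot be converted to the type of the corresponding default.

```r
# Hypothetical helper: cast incoming string values to the type of the defaults
cast_params <- function(values, defaults) {
  for (name in intersect(names(values), names(defaults))) {
    target <- class(defaults[[name]])[1]
    cast <- suppressWarnings(methods::as(values[[name]], target))
    # A failed conversion yields NA, which we treat as a type error
    if (is.na(cast)) {
      stop("Parameter `", name, "` cannot be converted to ", target)
    }
    values[[name]] <- cast
  }
  values
}

defaults <- list(n = 10L, title = "Report", detailed = FALSE)
params <- cast_params(list(n = "25", detailed = "TRUE"), defaults)
str(params)
```

Here n ends up as the integer 25 and detailed as the logical TRUE, while a value such as n = "many" would raise an error.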

Thank you

I want to say thanks to everyone who has given plumber2 a spin. It takes some time to reach maturity when replacing a decade-old package, and every test spin brings more insight. With the addition of OTEL integration and auth support, plumber2 has now reached the feature set I was planning for during the initial development, and the next phase will be about refinement, performance, and bug fixes. Your input and experiences will be critical there.