A middleware architecture for Cloudflare Workers

An architecture that enables you to create modular API behaviour that runs in response to the HTTP method and more complex matching of the request path, all inside your Cloudflare Workers.

October 17th, 2020

TL;DR

This is the fourth post in a series focussing on the use of Cloudflare Workers as APIs. In this post, a middleware architecture, similar to frameworks like express, is presented that enables you to create modular API behaviour that runs in response to the method and more complex matching of the path from the HTTP request, all inside your Cloudflare Workers. The post culminates with a video demo of the API middleware in action.

The challenge

In the previous post, I showed you how to encapsulate the API logic into an abstraction that represents a RESTful resource, with minimal integration into the runtime to enable flexibility and portability. Higher-level API management layers, gateways and frameworks typically allow you to associate HTTP methods and path matching patterns with specific handling code. For example, I might say: for all GET requests that match the path pattern /account/:id, send the request to the getAccount function, where the function will be provided with the raw HTTP request and some convenient way to access matched and parsed parameters.

While the simplicity of the Workers routing is great for enabling use cases like zero downtime deployments, it’s not great for mapping HTTP requests to API endpoints – but Cloudflare Workers wasn’t designed to be an API gateway. You can use a wildcard (*) in matching patterns but only at the beginning of the hostname and the end of the path, and there's no support for parameter placeholders. So the following are ok:

*api.somewhere.com/account*
api.somewhere.com/account/something*

But these aren't:

api.somewhere.com/account/*/something
api.somewhere.com/account/:id/something

The last example above is a valid route; it just won't do what you're probably trying to do, i.e. use :id as a placeholder for any value and provide that value in an easily accessible way in the Worker.

Also, note in the valid examples that the pattern doesn't include the trailing slash of the path before the wildcard; this is so the pattern still matches requests to the root of that path/resource (with or without the trailing slash).

The solution is not so different from how you might do it in other execution contexts, such as containers, if you aren't using an API gateway: middleware. That is, you move the HTTP request handling logic into your code, as you do with frameworks like express:

const express = require('express')
const app = express()

app.get('/account/:id', getAccount)

function getAccount(req, res) {
  const id = req.params.id
  ...
}

The above code configures the express middleware to run the getAccount function on GET requests for paths that match /account/:id (where :id is a placeholder for an arbitrary value).

I'll show you how to implement similar functionality in the RESTful resource abstractions for your API in Cloudflare Workers.

Middleware

For this article, I'm going to define middleware as discrete software modules organised in ordered sequence within a request/response architecture such that each module has the opportunity to inspect and modify the request and/or response before passing them on to the next module until the sequence is completed. Middleware is often visualised conceptually as follows:

Conceptual architecture of middleware

It isn't often clear from the conceptual view that each middleware has the opportunity to complete the sequence by sending a response. For instance, a middleware that is handling authentication might return a response with a 401 status code when authentication fails, before any other middleware is executed. It can be helpful to visualise it as a pipeline where each middleware can return a response or move on to the next middleware until the pipeline completes:

Middleware architecture as a pipeline

I'm going to implement the middleware architecture as a pipeline of modules you can run based on matching the method and path from the HTTP request. For each API resource you'll be able to configure multiple middleware pipelines to handle different combinations of methods and paths:

Middleware architecture as a series of pipelines in a resource

Let me show you how I did it.

Implementation

There's a repo with the code created for this post. I use snippets of the code in this post for brevity and to convey the concept, but I'll link to the original file when I do this.

Framework

I've created a lightweight middleware framework based around a Middleware object. The object exposes an on method that creates a MiddlewarePipeline object per unique combination of HTTP method and path (or route):

class Middleware {
  ...
  on({ route = requiredParam('route'), method = requiredParam('method') } = {}) {
    let pipeline = pipelineFromRouteAndMethod(this.pipelines, route, method)
    if (!pipeline) {
      pipeline = {
        route,
        method,
        middleware: new MiddlewarePipeline(),
      }
      this.pipelines.push(pipeline)
    }
    return pipeline.middleware
  }
  ...
}

In the original file, there are utility functions that help store and resolve the correct pipeline based on the HTTP request. Once the middleware pipelines are configured, you run the middleware:
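
To give a flavour of those utilities, here's a hypothetical sketch of one of them (the name matches the snippet above, but the body is an assumption, not the repo's code): it looks up an existing pipeline whose route and method both match the one being registered, so on only creates a new pipeline when the combination is new.

```javascript
// Find an existing pipeline for this route/method combination, or undefined.
function pipelineFromRouteAndMethod(pipelines, route, method) {
  return pipelines.find((p) => p.route === route && p.method === method)
}
```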

class Middleware {
  ...
  run(request, response) {
    const pipelineResult = pipelineFromRequest(this.pipelines, this.routeMatcher, request)

    if (pipelineResult.found) {
      request.routeComponents = pipelineResult.pipeline.routeComponents
      return pipelineResult.pipeline.middleware.run(request, response)
    }

    if (pipelineResult.reason === 'method') {
      return response.notAllowed()
    }

    return response.notFound()
  }
  ...
}

The run method processes the request and attempts to resolve a pipeline from the method and path. If it finds one, it runs it with any parameters extracted from the path based on the route spec. If there's no match, it attempts to figure out why and returns an appropriate response: either not found, or not allowed for the specified method.
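
A hypothetical sketch of pipelineFromRequest illustrates that distinction (the repo's implementation may differ; the routeMatcher.match interface here is an assumption): match on the path first, then on the method, so the caller can tell a missing route apart from an unsupported method.

```javascript
// Resolve a pipeline from the request's path and method, reporting why
// resolution failed so the caller can choose between 404 and 405.
function pipelineFromRequest(pipelines, routeMatcher, request) {
  const path = new URL(request.url).pathname
  const matched = pipelines
    .map((p) => ({ pipeline: p, routeComponents: routeMatcher.match(p.route, path) }))
    .filter((m) => m.routeComponents)

  // no pipeline matched the path at all
  if (matched.length === 0) return { found: false, reason: 'route' }

  // the path matched, but not with this HTTP method
  const hit = matched.find((m) => m.pipeline.method === request.method)
  if (!hit) return { found: false, reason: 'method' }

  return {
    found: true,
    pipeline: { ...hit.pipeline, routeComponents: hit.routeComponents },
  }
}
```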

The real work of the middleware is done in the MiddlewarePipeline class. For that I found an implementation of a minimal middleware pattern, which I've adapted and specialised for my use case:

class MiddlewarePipeline {
  use(module) {
    this.run = ((pipeline) => (request, response, next) =>
      pipeline(request, response, () => {
        try {
          module.apply(this, [
            request,
            response,
            next ? next.bind.apply(next, [request, response]) : null,
          ])
        } catch (err) {
          log.error(`Error while running middleware pipeline`, err)
          response.error()
        }
      }))(this.run)

    return this
  }

  async run(request, response, last) {
    last.apply(this, [request, response])
  }
}

It's fairly dense due to the functional concepts it uses, but in essence it exposes a use method that returns the pipeline itself, so that method chaining can be used to specify the modules you want to run in an ordered sequence.

Each subsequent module is assigned to the previous module's next parameter, which it should invoke if and when the execution should proceed to the next module in the sequence. The last module in the sequence won't have the next parameter and should terminate the execution with an appropriate response.

Each module is passed the request and response objects that are initially passed into the run method, which is what gets called from the run method in the Middleware object.

Each module in the middleware pipeline can modify the request and response objects, and indeed return a response before moving on to the next module.
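
To make the chaining concrete, here's a self-contained sketch of the same pattern with two inline modules (the module bodies are illustrative, not code from the repo): each use() wraps the previous run, so modules execute in registration order and only advance when next() is called.

```javascript
class MiddlewarePipeline {
  use(module) {
    this.run = ((pipeline) => (request, response, next) =>
      pipeline(request, response, () => {
        module.apply(this, [
          request,
          response,
          next ? next.bind.apply(next, [request, response]) : null,
        ])
      }))(this.run)
    return this
  }

  async run(request, response, last) {
    last.apply(this, [request, response])
  }
}

// Record the execution order to show the chaining behaviour.
const trace = []

new MiddlewarePipeline()
  .use((request, response, next) => {
    trace.push('auth') // simulates a check that passed, so continue
    next()
  })
  .use((request, response) => {
    trace.push('read') // last module: no next, so it terminates the pipeline
    response.sent = true
  })
  .run({}, {})

// trace is now ['auth', 'read']
```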

Asynchronous data flow

Speaking of the response, there's some work to do around managing responses in the middleware framework. I want to ensure the data flow through the middleware architecture is asynchronous, and I also need to consider the substitutable execution context discussed in my previous post. Here's how the data flows:

Data flow through middleware architecture

When the request arrives at the Worker it will be sent to the API resource based on the top-level route matching in Cloudflare. The result from the API resource needs to be a Fetch Response that gets sent to the calling HTTP client.

To keep things asynchronous, the API resource will return a Promise that will always resolve. Inside the promise, the middleware will be configured and run, culminating in a response being sent via a HttpResponse abstraction. This abstraction will resolve the promise using a respond method that's available on the Execution Context (which of course is substitutable based on where it's running), and that will send a response back to the client.
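
To make that shape concrete, here's a minimal sketch of such an HttpResponse, with plain objects standing in for real Fetch Responses and the Execution Context's respond method elided (the method names mirror those used elsewhere in this post, but the bodies are assumptions):

```javascript
class HttpResponse {
  constructor({ request, route, resolve } = {}) {
    this.request = request
    this.route = route
    this.resolve = resolve
  }

  // Each terminating method builds a response and resolves the resource's
  // promise with it, which is what ultimately replies to the client.
  send(status, body) {
    const response = { status, body }
    if (this.resolve) this.resolve(response)
    return response
  }

  ok(body) { return this.send(200, body) }
  unauthorised() { return this.send(401) }
  notFound() { return this.send(404) }
  notAllowed() { return this.send(405) }
  error() { return this.send(500) }
}
```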

Bringing it all together

Let's look at the account API resource to see how this all comes together:

async function resource(event) {
  return new Promise((resolve) => {
    try {
      const basePath = 'account'
      const response = new HttpResponse({ request: event.request, route: basePath, resolve })

      const middleware = new Middleware({
        routeMatcher: new RouteMatcher(basePath),
      })

      middleware
        .on({ method: HttpMethod.GET, route: '/:id' })
        .use(require('../middleware/debug'))
        .use(require('../middleware/auth'))
        .use(require('../middleware/account/validate'))
        .use(require('../middleware/account/read'))

      middleware
        .on({ method: HttpMethod.POST, route: '/:id' })
        .use(require('../middleware/debug'))
        .use(require('../middleware/auth'))
        .use(require('../middleware/account/validate'))
        .use(require('../middleware/account/write'))

      middleware.run(event.request, response)
    } catch (err) {
      log.error('error processing account resource', err)
    }
  })
}

As you can see, the API resource creates and returns a Promise that only has a resolve callback. Within the promise it creates a HttpResponse and configures it with the original Worker request, the base route for the API resource (account) and the promise's resolve callback.

It then sets up the Middleware with a RouteMatcher, which is an abstraction used to encapsulate the route matching implementation - in this case the path-parser package.
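
Because the implementation is encapsulated, you could swap path-parser for anything that satisfies the same contract. Here's a hypothetical, regex-based sketch of what a RouteMatcher needs to provide (the repo delegates this to path-parser; this class is an illustrative stand-in): match a path against a route spec and extract any :param placeholders as route components.

```javascript
class RouteMatcher {
  constructor(basePath) {
    this.basePath = basePath
  }

  // Returns an object of matched route components, or null if no match.
  match(route, path) {
    const names = []
    // Turn '/account/:id' into the pattern '/account/([^/]+)', capturing
    // each placeholder name as we go.
    const pattern = `/${this.basePath}${route}`.replace(/:([^/]+)/g, (_, name) => {
      names.push(name)
      return '([^/]+)'
    })
    // Allow an optional trailing slash, as discussed earlier in the post.
    const result = new RegExp(`^${pattern}/?$`).exec(path)
    if (!result) return null
    return names.reduce((components, name, i) => {
      components[name] = result[i + 1]
      return components
    }, {})
  }
}
```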

The Middleware is then configured with 2 pipelines via the on method, one each for the GET and POST requests that match the route account/:id (note the account part of the route is implicit from the base path configured in the route matcher).

Each pipeline has several modules configured using the use method. For this example, I've configured the following modules:

  • debug: outputs some debug logs if the DEBUG environment variable is defined
  • auth: simulates authenticating the user
  • account/validate: simulates validating the request
  • account/read: simulates reading and returning some data for the account id
  • account/write: simulates writing data for the account id

The final step is to run the middleware, passing in the Worker request and the response abstraction. It will then inspect the request, determine which pipeline (if any) to run, and respond accordingly.

Modules

So what does a module look like? Let's look at the auth module to understand how a module fits into the architecture:

module.exports = async (request, response, next) => {
  try {
    if (request.headers.get('unauthorised') === 'true') {
      log.info(`request unauthorised`)
      response.unauthorised()
    } else {
      log.info(`request authorised`)
      next()
    }
  } catch (err) {
    log.error('error processing auth middleware', err)
    response.error()
  }
}

It's simply a JavaScript module that takes a request, response and the next module as parameters.

In this example, I'm simulating an authorisation process that will succeed and call the next module unless you pass an unauthorised header, in which case it will send an unauthorised response (401) via the data flow explained above.

Let's look at the last module in the pipeline configured for the GET method (account/read):

module.exports = async (request, response) => {
  try {
    log.info(`reading data...`)

    const dummyData = {
      id: request.routeComponents.id,
    }

    response.ok(dummyData)
  } catch (err) {
    log.error('error processing account/read middleware', err)
    response.error()
  }
}

In this example, I'm simulating reading some data and returning a response. In the response, I'm simply returning the id that was matched and parsed as a route component.

Notice that a next parameter isn't included; that's because this module assumes it should be the last module run in a pipeline, and it terminates the pipeline by returning an ok response with the data.

You could also write this module so it doesn't assume it's the last module, you'd just need to agree on how to pass the data via the response to the next module, and then a subsequent module in the pipeline can return the response:

 middleware
   .on({ method: HttpMethod.GET, route: '/:id' })
   ...
   .use(require('../middleware/account/read'))
   .use((request, response) => response.ok(response.data))

Notice the last use creates an inline module that returns the response using the data that was passed in a data field.
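
Sketched out, that variant might look like the following (the data field and the inline responder are illustrative conventions, not code from the repo):

```javascript
// The read module stashes its result on the response instead of sending it,
// then hands control to the next module via next().
const read = async (request, response, next) => {
  response.data = { id: request.routeComponents.id }
  next()
}

// A terminating module that sends whatever an earlier module left in data.
const respond = (request, response) => response.ok(response.data)
```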

Demo

Watch a demo of the API middleware in action:

There you go, that's how you can run a middleware architecture to handle different combinations of HTTP methods and paths for your API requests.

The repo for this post works out of the box as long as you configure it with your Cloudflare account details, as per the README.

Coming next

In the next post, I'll be looking at how we can improve operations by adding observability to Workers. In this particular case I use AWS CloudWatch Logs and Metrics, but the approach can apply to any remote observability tooling.

Make sure you check out the other posts in this series:

  1. Delivering APIs at the edge with Cloudflare Workers
  2. Blue / Green deployments for Cloudflare Workers
  3. Enhancing the development experience for Cloudflare Workers
  4. A middleware architecture for Cloudflare Workers
  5. API observability for Cloudflare Workers