RetroFeed Work Log #4 - Middleware, Views, Sessions, Secrets

So far we’ve setup a NPM package and setup NestJs for normal web programming. In this instalment of the RetroFeed we’ll continue with the standard web application setup. We’ll speak about the common middleware we used, about sessions, and a we’ll close with some design thoughts.

The current file structure is more or less:

.
├── Dockerfile
├── config
│   ├── env.local
│   └── env.test
├── knexfile.js
├── migrations #hidden
├── package.json
├── scripts # hidden
├── secrets.tar.enc
├── src
│   ├── auth
│   │   └── auth.ts
│   ├── controllers
│   │   ├── app
│   │   │   ├── admin.hbs
│   │   │   ├── app.ts
│   │   │   ├── home.hbs
│   │   │   └── layout.hbs
│   │   └── tech
│   │       └── status.ts
│   ├── index.ts
│   ├── infra
│   │   ├── common.ts
│   │   ├── config.ts
│   │   ├── db-conn.ts
│   │   ├── request-id-middleware.ts
│   │   ├── request-time-middleware.ts
│   │   ├── request-version-middleware.ts
│   │   └── session-middleware.ts
│   └── services
│       └── user
│           ├── entities.ts
│           ├── events.ts
│           └── service.ts
├── test # hidden
└── test-e2e #hidden

The studied version is v0.0.4. It has the whole set of auth related changes as well. But I’ll cover those in the next post. With regards to the tree above, I’ve hidden a bunch of files and directories which aren’t relevant to the discussion at hand. In any case, you can see that the src tree has become much beefier.

Express and NestJs come prepacked with a lot of goodies. However, since you can build both microservices and regular webapps with them, many things are left optional. So the first thing we must do is set up a bunch of must haves for regular web app development atop the scaffolding we have from the previous post. Luckily, the common form they’ll take is that of a middleware applied to either all of the routes or almost all of them, or some global switch telling NestJs to do something.

Right off the bat, in src/index.ts, there’s a bunch of extra configuration in the boostrap function.

async function bootstrap() {
    const app = await NestFactory.create(MainModule);
    const config = app.get(Config);
    app.setBaseViewsDir(join(__dirname, "controllers/app"));
    app.setViewEngine("hbs");
    app.use(helmet());
    app.use(compression());
    app.useGlobalFilters(new ViewAuthFailedFilter());
    await app.listen(config.port);
}

The first thing that’s setup is the view engine. Any of the ones available for Express are usable, and I’ve chosen Handlebars. There isn’t any particularly strong reason, and in fact most any one could do. The project won’t need massive complexity here. I’ll cover more in a bit about the “rendering philosophy”.

The other two lines are for setting up helmet with default settings and enabling compression. The first one is particularly important, as it provides a baseline of browser oriented security measures - DNS prefetch control, frameguard, HSTS etc. As the project progresses, I’ll enable more of them as well, though for some the cost-benefit analysis doesn’t look so good for such a small project like this one (public key pinning for example). The second middleware is self-explanatory - it enables response compression.

Another simple and nice middleware to have is the request id one. This attaches a globally unique identifier string to the request-response pair. The id is available as res.requestId to the application. It is also attached to the response as the X-RetroFeed-RequestId header. This is very useful for logging and for later analysis, as it can be used as a primary key for the log entry. Application-level logs, external calls etc. can also make use of the same id, so all the actions the application does can be grouped together later. There wasn’t a good TypeScript one to use, so I just added a small one in src/infra/request-id-middleware.ts.

In the same vein there is the request time middleware at src/infra/request-time-middleware.ts. This only exposes a req.requestTime field for the application, as the Date header already serves this purpose for external uses. I plan to do something akin to event sourcing, and having a common time for all mutations resulting from a request is something this will help achieve.

The last small middleware is the application version one at src/infra/request-version-middleware.ts. This simply attaches the X-RetroFeed-Version header to the response. For the LOCAL and TEST environments this is simply set to latest, but the staging and live environments will receive something like v0.1.4 or whatever the actual version of the application is. The value comes from the config and the build&deployment process will be responsible for properly setting it in production.

Things start to get complicated when setting up sessions. Again, in the spirit of using as much as possible from the ecosystem, I’m using the express-session middleware with the connect-session-knex store. As mentioned, we’ll use knex as a micro-ORM, and it’s good to piggyback on all the infrastructure for it for sessions. The rather big configuration is in src/infra/session-middleware.ts. express-session works the way you’d imagine - it sets a cookie on responses which then gets transmitted through the usual mechanisms. This cookie is signed by a secret we need to provide (see below for secrets management), and it contains an identifier sid, which is used to lookup “session-level” information from the store. The auth system we’ll speak about in the next post will use this mechanism to great effect. For our purposes the sessions never expire[1] and are more-or-less immutable after creation. The cookie is also http only, secure in non-local environments and has a SameSite=Lax setting.

In the migrations/0002.create-sessions-table.js migration we setup the following auth.sessions table which is used by the knex store. It’s structure is:

CREATE TABLE auth.sessions (
    sid VARCHAR(255) PRIMARY KEY,
    sess JSON NOT NULL,
    expired TIMESTAMP WITH TIME ZONE NOT NULL
);
CREATE UNIQUE INDEX sessions_expired ON auth.sessions(expired);

This is a technical table, rather than an entities one. Therefore, we don’t have a session_events or any extra stuff on top of it.[2]

The helmet, compression and our small custom middleware are applied for all the routes globally. Session and auth related ones aren’t needed globally, or are needed in different fashions. For example, we don’t care about sessions for the /status call. Or for API calls we don’t want to create a session, but rather assume it exists. To that end there’s four types of routes we have:

Application routes - these generate the UI of the application in one way or another. The various feeds we’ll have also enter here. We want all the middleware here, and sessions can be created.
API routes - these are for the browser-side SPA code to interact with the backend. We want all the middleware here, and we assume a session exists and won’t ever create one by hand. On the assumption that a human user must first visit a page (and get a session), and then API calls would be made.
Integration routes - these are things like /robots.txt handlers, various domain checks, browser-config.xml, webhooks, auth handlers etc. We want all the middleware here, but the sessions one.
Technical routes - status checks, developer admin panels, debug pages, API interaction pages etc. Again, we want all the middleware here, but the session one. These should have limited access from the outside world.

For comparison of the effects, a GET / used to return something basic like:

HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 19
Content-Type: text/html; charset=utf-8
Date: Sat, 11 Aug 2018 12:29:10 GMT
ETag: W/"13-yJNY7FvAzljr3cH9QOT6Sna+sPs"
X-Powered-By: Express

Hello World - LOCAL

And now it produces:

HTTP/1.1 200 OK
Connection: keep-alive
Content-Length: 232
Content-Type: text/html; charset=utf-8
Date: Sat, 11 Aug 2018 12:30:02 GMT
ETag: W/"e8-+wYItcFKggKlbJpYhiJU65HnWwk"
Strict-Transport-Security: max-age=15552000; includeSubDomains
Vary: Accept-Encoding
X-Content-Type-Options: nosniff
X-DNS-Prefetch-Control: off
X-Download-Options: noopen
X-Frame-Options: SAMEORIGIN
X-RetroFeed-RequestId: y1orkfd2JbhCKZnWr4xfP0dB6tIQVvHt
X-RetroFeed-Version: latest
X-XSS-Protection: 1; mode=block
set-cookie: retrofeed.session=s%3A8ra0QxZqCmJJV2PuiPzHJlZ78profFCu.A2uIakPpIRqU9aLfur41V%2FHqN5Ec57w3FXI38pzvuWM; Domain=localhost; Path=/; Expires=Tue, 10 Aug 2060 12:30:02 GMT; HttpOnly; SameSite=Lax

<!doctype html>
...rest of content

When setting up the middleware I used the express middleware list to get a feel for the common ones I shouldn’t miss out. For the whole project I want to avoid reinventing the wheel as much as possible, and having these lists of middleware/best practices/common approaches is really helpful.

To close this post, I’ll deal with three misc topics. But they’re interesting things to deal with sooner rather than later.

First of - environments. Most code will be run in multiple locations. At least the developer’s machine, and the place where it’s supposed to finally work (a factory, a car, a web server etc). I’ve chosen to define four environments, which are:

LOCAL - the developer’s own machine, while they’re building things locally.
TEST - the Travis machine/job which runs tests & does deployments.
STAGING - one or more machines on GCP, not accessible to the public, for large scale testing, messing around etc. A smaller-scale clone of the LIVE environment.
LIVE - one or mor emachines on GCP, accessible to the public and storing the real user data. This is where the magic happens.

I refer to the latter two environments as production, and the former two as development. src/infra/common.ts contains code for dealing with this in the application. The $ENV environment variable defines the current config, and this is set externally. It’s assumed LOCAL unless otherwise specified, or it’s set in Travis’ configuration, or by the deployment procedure on each machine. It also appears in the config files so it’s accessible to the application via src/infra/config.ts.

Speaking of configs, they live in the config/env.$ENV files, and as mentioned scripts/setup-env.sh takes care of selecting the right one. In this version there’s also config.$ENV.secrets files, which hold those settings which should not be public - API keys, secret tokens, passwords etc. Since the repository is public, these files are not committed. Instead a secrets.tar.enc file is commited, which contains them. As the name suggests this is an encrypted archive. I use Travis’ file encryption functionality for this. Travis holds the key, and the build script will use it to decrypt the archive at build time, and use it’s contents to run tests and make the appropriate deployments with the correct environments. Currently I hold a copy of the unencrypted files, and any other committers would have to get one as well. If changes are made to the secrets file, the yarn prepare:config command should be run so the non-local environments get the changes as well.

Finally, let’s speak about classical rendering. Single Page Application’s are all the rage now, and you’d think it’s enough to push a scaffold index.html page somewhere and a bunch of .js/.css files and you can call it a day. Sadly, in my experience, that’s a recipe for a lot of grief. First off, there’s a bunch of things which SPAs can’t cover. Everything which could be called “web integration” fits this bill (robots.txt, humans.txt, sitemap.xml, RSS feeds etc). Second, auth flows almost always require a bunch of redirects, webhooks might need non-JS content etc. Third, you might need to have a server process a bundled JS file before sending it to a client. For example, to insert API keys, to add session information, to do server-side rendering etc. So it pays off to have a good classical server-side rendering system setup, and this almost always means a view engine[4]. I’m not advocating we go to the days of doing most of the UI work in them, cause React/Vue/Angular have their virtues, and they help build better applications overall[3]. However, I’ve a feeling that this app won’t require the full might of React - just React and a router, but none of that Redux stuff. So a happy middle between server side rendering and medium-sized chunks of the UI in React looks the best to me. A page reload on big transitions is a decent price to pay.

Anyway, that’s it for this post. It again turned out bigger than expected, but now we’re definitely out of the just setting up phase, and more on the road to something tangible.

Checkout all the articles in the series here.

[1] They expire technically in 2060 or some far off year. The purpose of sessions as used here is to group the interactions of a particular user across all time. Extra processing will be added later, if needed, to split this into an actual UX interaction session. Things are also kept simple with regards to authentication if we assume sessions, and hence the authenticated state never expire automatically once set. Otherwise we (1) need to add all sort of error handling code to deal with session expiration between successive actions or (2) accept a poor experience for users some of the time, as actions fail because their session expires.

[2] This is a made-up separation of mine, though it probably does have a name. Tables like users, feeds, cars, books represent entities, and it’s good practice to have a $entity_events table with the log of all the changes for that entity. It’s a lightweight form of event sourcing. Other tables exist only for bookkeeping, or for analytics, or for plain-old non-entity work (join tables?). I call the first ones entity tables, while the latter ones technical tables.

[3] Unless you really like page reloads because of a small form action.

[4] Server-side rendering doesn’t have to be limited to this of course. You’re basically emitting a byte-stream on a socket. You can use string concatenation, various pages solutions (PHP, JSP, ASP etc). But I think it’s really no contest that the templating engine solutions like Handlebars, Django’s views etc. are much more palatable.