Our infrastructure

There's no sensationalist title today. In case you're just joining us, by the way, I'm James, founder of Sticky, reality's application layer. With my personal blog Exploring the unabstract hot off the press, I had a sudden urge to share my reality: to be, in my own words, 'unabstract'. So what better opportunity than to talk about the reality of Sticky: the infrastructure that makes it all possible?

I can start by sharing something a little secret with you: a slide from our fundraising deck.

"We think about ourselves as an OS"

Quite simply, we designed our infrastructure to make that slide true.


The principles behind our infrastructure are inspired by the 'Twelve Factors'. It's in my top recommended reading, sort of like a 'Developer Island Discs' pick (sorry). We think these are the most important factors.

Infrastructure as code/disposability/pets vs cattle

Setting up infrastructure by clicking through the ugly AWS interface may seem the easiest way to get up and running (and it probably is), but what happens when those services fail? Much better to have one command that redeploys the entire stack to a new environment, or replaces broken production at 3am On A Saturday Club Night. We treat infrastructure like cattle, not pets: when one gets ill, you get another one.

Pets vs Cattle
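To make "infrastructure as code" concrete, here's a minimal docker-compose sketch. The service names, images, and ports are illustrative assumptions, not our real stack definition; the point is that the whole stack lives in version control and can be brought up anywhere with a single `docker-compose up -d`.

```yaml
# Hypothetical stack definition: the entire environment is declared in one file,
# so a broken host can be replaced by re-running one command on a fresh machine.
version: "2"
services:
  api:
    image: gcr.io/example-project/api:abc1234  # immutable, pre-built image
    ports:
      - "8080:8080"
  mail:
    image: gcr.io/example-project/mail:abc1234
```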

Specifically, all of our services run in Docker containers on Rancher. We can restart and roll back bad releases through Docker commands in seconds, and spin up a new Rancher host in under a minute.
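A rollback, at its simplest, is pointing the service back at the previous image tag. A sketch of the kind of commands involved (container names and tags here are made up for illustration; in practice Rancher drives this through its own upgrade/rollback machinery):

```shell
# Stop the bad release and start the previous known-good image.
# (Names and tags are illustrative, not our real ones.)
docker stop api && docker rm api
docker run -d --name api -p 8080:8080 gcr.io/example-project/api:previous-good-sha
```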

Build, release, run

We make it impossible to change code at runtime by building services independently of where they run. Specifically, we push built Docker images to Google Container Registry on GCP and run them on Rancher on DigitalOcean behind a managed load balancer.
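The build step looks roughly like this (the project and service names are assumptions): each image is tagged with the commit it was built from, pushed to Google Container Registry, and never modified afterwards.

```shell
# Build once, tag immutably with the git SHA, push to GCR.
# The same image then runs unchanged in every environment.
SHA=$(git rev-parse --short HEAD)
docker build -t "gcr.io/example-project/api:${SHA}" .
docker push "gcr.io/example-project/api:${SHA}"
```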

Stateless processes

Our backend services are stateless, Dockerised node.js applications that talk over HTTP (the classic microservice approach). They can be freely restarted and scaled as they don't write to disk or have any shared memory.
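Because the processes are stateless, scaling is simply running more of them. A hedged example with Compose (the `--scale` flag exists in docker-compose 1.13+; the service name is illustrative):

```shell
# Three identical api containers; any of them can serve any request,
# because no state lives in process memory or on local disk.
docker-compose up -d --scale api=3
```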

Dev/production parity

Our dev and production environments are almost identical, and we can run all services locally as Docker containers. We use ngrok to test things like Apple Pay, which need to run from a secure origin (HTTPS web page). Ngrok also allows third parties to send real webhooks to our dev environments. We use ngrok commercially (we pay for it) since every service has an ngrok tunnel.
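For anyone unfamiliar with ngrok: it tunnels a local port to a public HTTPS origin, which is what makes secure-origin features and inbound webhooks testable locally. A typical invocation, assuming the service listens on port 3000:

```shell
# Expose a local dev service on a public HTTPS origin so Apple Pay
# and third-party webhooks can reach it.
ngrok http 3000
```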


An essential part of our vision is that any company could build Stickyretail. I took this from some documentation we're working on but haven't yet published:

Stickyretail is our flagship app. We often make the analogy that Stickyretail is to Sticky what Microsoft Office is to Windows: it's a great example of what can be built on Windows, and anyone could build something like Office given the time and energy. Microsoft built Office so someone else didn't have to, and because it added value to Windows, but anyone could have built and sold it, and added value to Windows too.
In the same way that Office is just a set of apps that run on Windows, Stickyretail is just an app that runs on Sticky. We're not in a privileged position for having built Stickyretail - you could build it too, with React and our SDK. There's no 'special case' or anything undocumented that makes Stickyretail possible. You are as enabled as we are.
Building Stickyretail in this way was very important to us. It made us ensure the SDK was good enough to build serious, enterprise-grade applications, and it helped us design a great developer experience. We were developers in our own ecosystem, or put another way, we ate our own dog food. And we made it delicious.

The consequence of eating our own dogfood in this 'unprivileged' way is that we have to be super strict on our infrastructure's "seams". If the boundary of our frontends and our SDK is blurred, it becomes difficult for anyone to build a frontend like ours because part of the SDK lives in code they don't have, and we fail in our mission.


I really love seeing other companies' "flows". They remind me of the classic interview question "What happens when I type google.com into my web browser?". With a background in C and programming for ARM chips, I always entertained that question by saying there is a syscall and a keyboard interrupt etc etc etc… that surely annoyed some of the hiring managers. Anyway. What happens when someone uses Stickyretail? What is the "flow"? I spent a couple of hours drawing it in Sketch.

Infrastructure flow diagram

What happens next?

We're agile with a lower- and an upper-case A, which means retaining optionality in our infrastructure as well as keeping costs low. Sometimes you have to make compromises to achieve this. One compromise we have made is that our API instances share a single thread between authentication/"regular HTTP requests" and compute-heavy/"blocking" work like querying events and stringifying large JSON responses. We're planning on abstracting authentication into a third microservice (like "Mail" and "Geolocation"), which will allow us to introduce more publicly routable services besides "API". The first of those will be an event sourcing service, possibly not even written in JavaScript! There really are exciting times ahead.