Entropic Thoughts

Deploying a Single-Binary Haskell Web App

Since I now write Haskell for a paycheck, I wanted to refresh my Yesod knowledge. Yesod is a production-grade web framework, and there’s one web site I’ve long wanted to have: Fisher’s Fountain. Fisher’s Fountain is the ideal project for a toe-dipping three-evening hack because it does not talk to external collaborators, but it is also not entirely static content.

This is a significant step. Note that aside from this, every side project I have announced in the past six years (Flightle, FlowRatio, Engineering Enigmas, Tagnostic, Precel, xmr.html, the Ship Investor game, the Basean Loom) has been either

This is not accidental. I hate having to actively maintain deployments of side projects. By making projects only of those two types of implementations, I don’t sign up to monitor the uptime of anything beyond nginx, and nginx never goes down.

In that light, it is clear that Fisher’s Fountain represents a significant breach of policy: it requires monitoring the uptime of a separate server process. But! It is just a single binary written in Haskell with no complicated moving parts, so it should be about as reliable as nginx.[1]

[1] The attentive will discover that there is nothing about Fisher’s Fountain that couldn’t be made in client-side JavaScript. This is a feature: I wanted to learn Yesod, but if it ever becomes too much to maintain the deployment, it can be replaced with JavaScript in the space of an evening.

Challenges in achieving single-binary deployment

Although Haskell is a compiled language and the default build step results in a single binary for a Yesod site, it is not a deployable single binary, because it may still depend on libraries and resources on the machine where it was built.

To get to a point where deployment consists only of copying over a binary and restarting a process, there were a few more steps needed beyond the default.

  • We need to compile external resources such as typefaces into the binary.
  • We need to statically link against a different libc.

Other good practices for deploying web services include:

  • Switching the Haskell server to listen only on localhost.
  • Telling the Yesod site what its root url is when it sits behind a reverse proxy.

We’ll look at some of the details involved. The code for the site is 100 lines of Haskell, so you can check that out if you want to follow along more actively.[2]

[2] The linked file is 200 lines, but half of that is stylesheets and base html template.

Embedding static content

Fisher’s Fountain depends on a typeface with specific measurements in order to align nicely on the pages. Distributing the typeface separately along with the binary would complicate deployment, because then we need to care for search paths and file system privileges.

Fortunately, Yesod offers the ability to compile resources like typefaces into the binary. Building resources into the binary is called “embedding static content” in Yesod-speak. To do it, we first need a Template Haskell call

mkEmbeddedStatic False "embeddedLib" [embedDirAt "" "lib"]

This creates a bunch of variables that hold references to all the static content under the lib directory. We will need a function that gets us this static content, but since it is a constant embedding with no fancy logic, we can make that a constant function.

_embeddedLib = const embeddedLib

The initial Template Haskell call also created a subsite for serving the embedded static content, and we install that under a route:

mkYesod "App" [parseRoutes|
/ HomeR GET
/lib StaticLibR EmbeddedStatic _embeddedLib
/unif UnifR GET
/pois/#Double PoisR GET
|]

We then tell Yesod how it can get the static content.

instance Yesod App where
  addStaticContent = embedStaticContent _embeddedLib StaticLibR Right

And with this we can finally reference our resources from Shakespeare templates, such as this line in a Lucius stylesheet:

src: url('@{StaticLibR font_LinLibertine_R_woff}') format('woff');
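
The identifier font_LinLibertine_R_woff is generated by the Template Haskell call from the file’s path below lib. As a rough model, assuming the generator replaces every character that is not an ASCII letter or digit with an underscore (an assumption that matches the example above, not something taken from the yesod-static source), the mapping could be sketched as follows. The function name routeIdentifier and the path font/LinLibertine_R.woff are hypothetical.

```haskell
import Data.Char (isAlphaNum, isAscii)

-- Hypothetical model of how a path below lib/ turns into an identifier:
-- every character that is not an ASCII letter or digit becomes '_'.
routeIdentifier :: FilePath -> String
routeIdentifier = map replace
  where
    replace c
      | isAscii c && isAlphaNum c = c
      | otherwise = '_'

main :: IO ()
main = putStrLn (routeIdentifier "font/LinLibertine_R.woff")
-- prints: font_LinLibertine_R_woff
```

If a generated name is ever in doubt, the compiler will say so: referencing a missing identifier from a template is a compile-time error rather than a broken link at runtime.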

It took a little time to figure out how all this was pieced together, and serving embedded content from Haskell comes at a small performance penalty compared to streaming it from disk with nginx (I think?), but in this case it is well worth not having to bother with external files when maintaining the deployment.

Statically linking against musl

Although the regular cabal build command produces a single binary for a Yesod site, when I tried running that binary on the server it complained about glibc on the machine being two versions off. This is apparently a common problem with Haskell binaries – or indeed anything that links against glibc.

Three alternatives immediately came to mind:

  • Downgrade glibc on my machine to match that of the server,
  • Upgrade glibc on the server to match that of my machine, or
  • Compile the code on the server.

All of these are bad. The first two come with ridiculous maintenance demands – what, am I going to keep glibc versions in sync across my machines and servers forevermore? The third could sort of work, but the server is a tiny vps and compiling anything on it takes ages.

I looked into statically linking[3] against glibc, but it turns out to be really hard to do that.[4] I asked for suggestions in #haskell and got a few alternatives, but the one that seemed easiest was statically linking against a different libc: musl. There is a project that aims to help with that, but even easier for me was copying the relevant parts of their Earthfile to create a project-specific Dockerfile to build and statically link against musl.

[3] Static linking means that a library is bundled with the binary, rather than the binary depending on a library being installed on the machine it’s going to run on.

[4] Something about some parts of glibc necessarily remaining dynamic even under static linking. From what I gathered, the same might be true of musl, only less apparently so.

In order for it to run properly under Fedora 40, podman needed to be instructed to work with SELinux rather than against it. The following line accomplished that:

podman run --security-opt label=disable -d -v "$PWD":/mnt:z muslghc

The container used to sit on an infinite sleep after it was done building, to give the user time to copy the binary off of it with a podman cp command,[5] but I received a comment suggesting a much simpler solution: since the working directory is mounted in the container anyway, we can just let the build script on the container copy the binary into the correct directory as the last build step. Makes things significantly less brittle.[6] This binary, being statically linked against a libc designed for it, runs just fine on both my machine and the server alike.

[5] I even tried automating this by tailing container logs.

[6] The same comment suggested mounting a few more local folders to keep a cabal cache between container runs, meaning it does not have to rebuild the whole project from scratch each time it runs. If I performed this production build more often, I’d probably adopt that also.

Configure Warp to listen on localhost only

At this point, if we had copied the binary to the server and started it, the site would have been accessible on the port given.[7] But for security reasons, we want to limit the number of services that are directly exposed to the internet, so we want the Haskell application to only listen on localhost. Then we’ll use nginx to reverse proxy traffic to it.

[7] Well, would have been if there weren’t a firewall in the way.

Some context might help here. A Yesod application consists of three layers:

  • At the edge against the outside world, there is a web server.
  • In a middle layer is the Web Application Interface (wai).
  • Inside that is the Yesod site.

In other words, the web server does not directly serve a Yesod site. Rather, the Yesod site is turned into a wai application, and that is what the web server serves.[8]

[8] This is a recurring trend. Django started out as an interface to abstract over http and then grew into a web framework. Yesod started out as a web framework and then grew an offshoot called wai which is an interface to abstract over http.

The web server often used with wai applications is the rather robust and fast Warp server. All Yesod code examples end with a call to the warp function which will bundle a Yesod site into a wai application and then serve it with Warp.

The warp function does not allow selecting which ip address the server should listen on. In order to configure that, we need to use wai functions to configure and start the application. That means we need to first convert the Yesod site to a wai application, and then configure the wai application to listen to the correct ip address and port. This happens thusly:

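import qualified Network.Wai.Handler.Warp as W

main :: IO ()
main = do
  -- Convert the Yesod site into a plain wai application.
  app <- toWaiApp App
  -- Serve it with Warp, bound to the loopback interface only.
  flip W.runSettings app
    . W.setHost "127.0.0.1"
    . W.setPort 8840
    $ W.defaultSettings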

With that change, we can still curl the site from the server itself, but no longer access it from the internet.

To get access from the internet, we configure nginx to reverse proxy to it.

location /fountain/ {
  proxy_pass http://127.0.0.1:8840/;
}

Set site root URL differently in production

There’s one more problem, though: in nginx, we placed the site under a subdirectory /fountain, but the Yesod site believes it sits on the top level of the server.[9] We need to instruct Yesod that it has a different root url when it is running in production.

[9] It tries to be clever about this, but since it indeed does sit at the top level of its own server, it cannot tell how people reached it through nginx.

To do this, we first need to add a field to the foundation type that holds the root url for the site.

data App = App
  { _approot :: Text
  }

Then we tell Yesod that we have configured a static root url and how it can be extracted from the foundation type.

instance Yesod App where
  approot = ApprootMaster _approot

When the site starts up, this field can be populated from the environment, with a fallback if it is not specified:[10]

main = do
  approot_env <- maybe "" T.pack <$> lookupEnv "YESOD_APPROOT"
  app <- toWaiApp App
    { _approot = if T.null approot_env then "http://127.0.0.1:8840" else approot_env
    }

[10] Note that the fallback here is meant to coincide with the specific listening ip and port configured, but they are not guaranteed to be the same. However, they are very close together in the code so I don’t think it will be a great burden to update in both locations at once. It would be trivial to dry this code but at the time I wasn’t sure if it was the right approach at all so I didn’t do it then. The next time I’m making changes to that part of the code I might.
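
The concern about keeping the fallback in sync with the listening ip and port could be addressed by deriving both from the same constants. The following is only a sketch of that idea, not the code the site uses: the names listenHost, listenPort, defaultApproot and resolveApproot are hypothetical, and it works on String to stay free of dependencies (the result would go through T.pack before being stored in _approot).

```haskell
import System.Environment (lookupEnv)

-- Hypothetical shared constants; the Warp settings would use these too.
listenHost :: String
listenHost = "127.0.0.1"

listenPort :: Int
listenPort = 8840

-- Fallback root url, derived from the listening address.
defaultApproot :: String
defaultApproot = "http://" ++ listenHost ++ ":" ++ show listenPort

-- An unset or empty variable falls back to the default.
resolveApproot :: Maybe String -> String
resolveApproot (Just root) | not (null root) = root
resolveApproot _ = defaultApproot

main :: IO ()
main = do
  root <- resolveApproot <$> lookupEnv "YESOD_APPROOT"
  putStrLn root
```

Keeping the decision in a pure function also makes the fallback behaviour easy to check in GHCi without touching the environment.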

Now we have the ability to use a different root url in production. To do this, we set the YESOD_APPROOT environment variable. Doing so is easy to forget, so we’ll take this opportunity to create a systemd service unit. We create /etc/systemd/system/fishers-fountain.service with these contents:

[Unit]
Description=Fisher's Fountain Warp server

[Service]
ExecStart=/home/kqr/fisherfountain
Environment="YESOD_APPROOT=https://xkqr.org/fountain"

[Install]
WantedBy=multi-user.target

Now we can systemctl restart fishers-fountain when we have copied over a new version of the binary, and it will stop the application and set the right environment before starting it again.

Trailing slash shenanigans when reverse proxying

The root url specified in the environment variable does not end with a slash. The location block in nginx does. The proxy_pass directive also does. This specific combination of trailing slashes is important – get it wrong, and links will break. I understand trailing slashes are meaningful in these contexts, but I have to admit I don’t fully understand why this specific combination is correct for this situation. Fortunately, there are only 2³ combinations, so it’s not too difficult to try them all and hope that one works.
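
A plausible explanation, sketched as a small model rather than taken from nginx or Yesod source code: when proxy_pass includes a URI part (here the trailing slash), nginx replaces the part of the request path that matched the location with that URI, and Yesod renders links by appending a path that starts with a slash to the root url. The functions proxyRewrite and renderLink below are hypothetical stand-ins for those two behaviours.

```haskell
import Data.List (stripPrefix)

-- Model of nginx when proxy_pass carries a URI part: the prefix that
-- matched the location is replaced by that URI.
proxyRewrite :: String -> String -> String -> Maybe String
proxyRewrite location upstreamUri requestPath =
  (upstreamUri ++) <$> stripPrefix location requestPath

-- Model of link rendering: root url followed by a path starting with '/'.
renderLink :: String -> String -> String
renderLink approot routePath = approot ++ routePath

main :: IO ()
main = do
  -- The combination described above.
  print (proxyRewrite "/fountain/" "/" "/fountain/unif")
  putStrLn (renderLink "https://xkqr.org/fountain" "/unif")
  -- Two of the broken variants: a doubled slash on the way in or out.
  print (proxyRewrite "/fountain" "/" "/fountain/unif")
  putStrLn (renderLink "https://xkqr.org/fountain/" "/unif")
```

In this model the working combination yields /unif on the way in and https://xkqr.org/fountain/unif on the way out, while the two variants each produce a doubled slash. Dropping the slash from proxy_pass removes the URI part altogether, in which case nginx forwards the path unchanged, /fountain prefix included, and the Yesod routes would not recognise it.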