Deploying a Single-Binary Haskell Web App
Since I now write Haskell for a paycheck, I wanted to refresh my Yesod knowledge. Yesod is a production-grade web framework, and there’s one web site I’ve long wanted to have: Fisher’s Fountain. Fisher’s Fountain is the ideal project for a toe-dipping three-evening hack because it does not talk to external collaborators, but it is also not entirely static content.
This is a significant step. Note that aside from this, every side project I have announced in the past six years (Flightle, FlowRatio, Engineering Enigmas, Tagnostic, Precel, xmr.html, the Ship Investor game, the Basean Loom) has been either
- run-your-own-command, or
- entirely client-side JavaScript.
This is not accidental. I hate having to actively maintain deployments of side projects. By making projects only of those two types of implementations, I don’t sign up to monitor the uptime of anything beyond nginx, and nginx never goes down.
In that light, it is clear that Fisher’s Fountain represents a significant breach of policy: it requires monitoring the uptime of a separate server process. But! It is just a single binary written in Haskell with no complicated moving parts, so it should be about as reliable as nginx. (The attentive will discover that there is nothing about Fisher’s Fountain that couldn’t be made in client-side JavaScript. This is a feature: I wanted to learn Yesod, but if it ever becomes too much to maintain the deployment, it can be replaced with JavaScript in the space of an evening.)
Challenges in achieving single-binary deployment
Although Haskell is a compiled language and the default build step results in a single binary for a Yesod site, it is not a deployable single binary, because it may still depend on libraries and resources on the machine where it was built.
To get to a point where deployment consists only of copying over a binary and restarting a process, there were a few more steps needed beyond the default.
- We need to compile external resources such as typefaces into the binary.
- We need to statically link against a different libc.
Other good practices for deploying web services include:
- Switching the Haskell server to listen only on localhost.
- Telling the Yesod site what its root url is when it sits behind a reverse proxy.
We’ll look at some of the details involved. The code for the site is 100 lines of Haskell, so you can check that out if you want to follow along more actively. (The linked file is 200 lines, but half of that is stylesheets and the base html template.)
Embedding static content
Fisher’s Fountain depends on a typeface with specific measurements in order to align nicely on the pages. Distributing the typeface separately along with the binary would complicate deployment, because then we need to care for search paths and file system privileges.
Fortunately, Yesod offers the ability to compile resources like typefaces into the binary. Building resources into the binary is called “embedding static content” in Yesod-speak. To do it, we first need a Template Haskell call
mkEmbeddedStatic False "embeddedLib" [embedDirAt "" "lib"]
This creates a bunch of variables that hold references to all the static content under the lib directory. We will need a function that gets us this static content, but since it is a constant embedding with no fancy logic, we can make that a constant function.
_embeddedLib = const embeddedLib
The initial Template Haskell call also created a subsite for serving the embedded static content, and we install that under a route:
mkYesod "App" [parseRoutes|
/ HomeR GET
/lib StaticLibR EmbeddedStatic _embeddedLib
/unif UnifR GET
/pois/#Double PoisR GET
|]
We then tell Yesod how it can get the static content.
instance Yesod App where
  addStaticContent = embedStaticContent _embeddedLib StaticLibR Right
And with this we can finally reference our resources from Shakespeare templates, such as this line in a Lucius stylesheet:
src: url('@{StaticLibR font_LinLibertine_R_woff}') format('woff');
It took a little time to figure out how all this was pieced together, and serving embedded content from Haskell comes at a small performance penalty compared to streaming it from disk with nginx (I think?), but in this case it is well worth not having to bother with external files when maintaining the deployment.
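The identifier font_LinLibertine_R_woff in the stylesheet line is one of the variables generated by the Template Haskell call. The names appear to be derived from each file's path relative to the embedded directory, with every character that is not an ASCII letter or digit replaced by an underscore. The sketch below models that rule in plain Haskell. Note that embeddedName is a hypothetical helper written for illustration, not a yesod-static function, and the relative path font/LinLibertine_R.woff is an assumption about where the typeface sits under lib.

```haskell
import Data.Char (isAlphaNum, isAscii, isDigit, isUpper, toLower)

-- | Approximate the variable name generated for an embedded file, given
-- its path relative to the embedded directory.
-- Hypothetical helper: it models the naming rule, it is not library code.
embeddedName :: FilePath -> String
embeddedName path =
  case map sanitise path of
    [] -> []
    name@(c : cs)
      | isDigit c -> '_' : name      -- identifiers cannot start with a digit
      | isUpper c -> toLower c : cs  -- variables must start in lower case
      | otherwise -> name
  where
    -- Anything that is not an ASCII letter or digit becomes an underscore.
    sanitise ch
      | isAscii ch && isAlphaNum ch = ch
      | otherwise = '_'

main :: IO ()
main = putStrLn (embeddedName "font/LinLibertine_R.woff")
-- prints: font_LinLibertine_R_woff
```

If a reference in a template fails to compile, checking the expected name against a rule like this is a quick way to spot a mistyped path.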
Statically linking against musl
Although the regular cabal build command produces a single binary for a Yesod site, when I tried running that binary on the server it complained about glibc on the machine being two versions off. This is apparently a common problem with Haskell binaries – or indeed anything that links against glibc.
Three alternatives immediately came to mind:
- Downgrade glibc on my machine to match that of the server,
- Upgrade glibc on the server to match that of my machine, or
- Compile the code on the server.
All of these are bad. The first two come with ridiculous maintenance demands – what, am I going to keep glibc versions in sync across my machines and servers forevermore? The third could sort of work, but the server is a tiny vps and compiling anything on it takes ages.
I looked into statically linking against glibc, but it turns out to be really hard to do that. (Static linking means that a library is bundled with the binary, rather than the binary depending on a library being installed on the machine it’s going to run on.) The difficulty is something about some parts of glibc necessarily remaining dynamic even under static linking. (From what I gathered, the same might be true of musl, only less apparently so.) I asked for suggestions in #haskell and got a few alternatives, but the one that seemed easiest was statically linking against a different libc: musl. There is a project that aims to help with that, but even easier for me was copying the relevant parts of their Earthfile to create a project-specific Dockerfile to build and statically link against musl.
In order for it to run properly under Fedora 40, podman needed to be instructed to work with SELinux rather than against it. The following line accomplished that:
podman run --security-opt label=disable -d -v "$PWD":/mnt:z muslghc
The container used to sit on an infinite sleep after it was done building, to give the user time to copy the binary off of it with a podman cp command (I even tried automating this by tailing container logs), but I received a comment suggesting a much simpler solution: since the working directory is mounted in the container anyway, we can just let the build script on the container copy the binary into the correct directory as the last build step. This makes things significantly less brittle. (The same comment suggested mounting a few more local folders to keep a cabal cache between container runs, meaning it does not have to rebuild the whole project from scratch each time it runs. If I performed this production build more often, I’d probably adopt that also.)
This binary, being statically linked against a libc designed for it, runs just fine on both my machine and the server.
Configure Warp to listen on localhost only
At this point, if we had copied the binary to the server and started it, the site would have been accessible on the port given. (Well, it would have been if there weren’t a firewall in the way.) But for security reasons, we want to limit the number of services that are directly exposed to the internet, so we want the Haskell application to only listen on localhost. Then we’ll use nginx to reverse proxy traffic to it.
Some context might help here. A Yesod application consists of three layers:
- At the edge against the outside world, there is a web server.
- In a middle layer is the Web Application Interface (wai).
- Inside that is the Yesod site.
In other words, the web server does not directly serve a Yesod site. Rather, the Yesod site is turned into a wai application, and that is what the web server serves. (This is a recurring trend. Django started out as an interface to abstract over http and then grew into a web framework. Yesod started out as a web framework and then grew an offshoot called wai which is an interface to abstract over http.)
The web server often used with wai applications is the rather robust and fast Warp server. All Yesod code examples end with a call to the warp function which will bundle a Yesod site into a wai application and then serve it with Warp.
The warp function does not allow selecting which ip address the server should listen on. In order to configure that, we need to use wai functions to configure and start the application. That means we need to first convert the Yesod site to a wai application, and then configure the wai application to listen to the correct ip address and port. This happens thusly:
main = do
  app <- toWaiApp App
  flip W.runSettings app
    . W.setHost "127.0.0.1"
    . W.setPort 8840
    $ W.defaultSettings
With that change, we can still curl the site from the server itself, but no longer access it from the internet.
To get access from the internet, we configure nginx to reverse proxy to it.
location /fountain/ {
    proxy_pass http://127.0.0.1:8840/;
}
Set site root URL differently in production
There’s one more problem, though: in nginx, we placed the site under a subdirectory /fountain, but the Yesod site believes it sits on the top level of the server. (It tries to be clever about this, but since it indeed does sit at the top level of its own server, it cannot tell how people reached it through nginx.) We need to instruct Yesod that it has a different root url when it is running in production.
To do this, we first need to add a field to the foundation type that holds the root url for the site.
data App = App { _approot :: Text }
Then we tell Yesod that we have configured a static root url and how it can be extracted from the foundation type.
instance Yesod App where
  approot = ApprootMaster _approot
When the site starts up, this field can be populated from the environment, with a fallback if it is not specified. (Note that the fallback here is meant to coincide with the specific listening ip and port configured, but they are not guaranteed to be the same. However, they are very close together in the code so I don’t think it will be a great burden to update in both locations at once. It would be trivial to dry this code but at the time I wasn’t sure if it was the right approach at all so I didn’t do it then. The next time I’m making changes to that part of the code I might.)
main = do
  approot_env <- maybe "" T.pack <$> lookupEnv "YESOD_APPROOT"
  app <- toWaiApp App
    { _approot = if T.null approot_env
        then "http://127.0.0.1:8840"
        else approot_env
    }
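The duplication mentioned in the note on the fallback, where the default root url repeats the listening address and port, could be removed by deriving both from the same constants. Below is a minimal sketch of that idea using only base. The names listenHost, listenPort, defaultApproot and resolveApproot are hypothetical, and it uses String rather than Text to stay self-contained.

```haskell
import System.Environment (lookupEnv)

-- Single source of truth for where the server listens (hypothetical names).
listenHost :: String
listenHost = "127.0.0.1"

listenPort :: Int
listenPort = 8840

-- | Fallback root url, derived from the listening address so the two
-- cannot drift apart.
defaultApproot :: String
defaultApproot = "http://" ++ listenHost ++ ":" ++ show listenPort

-- | Use the configured root url when present and non-empty,
-- otherwise fall back to the local address.
resolveApproot :: Maybe String -> String
resolveApproot (Just root) | not (null root) = root
resolveApproot _ = defaultApproot

main :: IO ()
main = do
  root <- resolveApproot <$> lookupEnv "YESOD_APPROOT"
  putStrLn ("approot: " ++ root)
```

In the real site the same two constants would presumably also feed W.setHost and W.setPort, with the host converted through fromString, so that changing the port only requires one edit.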
Now we have the ability to use a different root url in production. To do this, we set the YESOD_APPROOT environment variable. Doing so is easy to forget, so we’ll take this opportunity to create a systemd service unit. We create /etc/systemd/system/fishers-fountain.service with these contents:
[Unit]
Description=Fisher's Fountain Warp server

[Service]
ExecStart=/home/kqr/fisherfountain
Environment="YESOD_APPROOT=https://xkqr.org/fountain"

[Install]
WantedBy=multi-user.target
Now we can systemctl restart fishers-fountain when we have copied over a new version of the binary, and it will stop the application and set the right environment before starting it again.
Trailing slash shenanigans when reverse proxying
The root url specified in the environment variable does not end with a slash. The location block in nginx does. The proxy_pass directive also does. This specific combination of trailing slashes is important – get it wrong, and links will break. I understand trailing slashes are meaningful in these contexts, but I have to admit I don’t fully understand why this specific combination is correct for this situation. Fortunately, there are only 2³ combinations, so it’s not too difficult to try them all and hope that one works.
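One possible way to reason about the combination: nginx documents that when proxy_pass includes a URI part (here, the trailing slash after the port), the portion of the request path that matched the location is replaced by that URI, whereas without a URI part the path is forwarded unchanged. Yesod, for its part, renders links by appending slash-prefixed route paths to the approot, which is why its documentation asks for an approot without a trailing slash. The following is a deliberately simplified model of those two rules in plain Haskell. It is not nginx or Yesod code, upstreamPath and renderLink are hypothetical names, and it ignores regex locations, path normalisation and query strings.

```haskell
import Data.List (intercalate, stripPrefix)

-- | Model of what path the backend sees for a prefix location.
-- 'Nothing' as the proxy URI models "proxy_pass http://host:port;"
-- 'Just uri' models "proxy_pass http://host:port<uri>;".
upstreamPath :: String -> Maybe String -> String -> Maybe String
upstreamPath location proxyUri request =
  case stripPrefix location request of
    Nothing -> Nothing                 -- the location does not match
    Just rest -> Just $ case proxyUri of
      Nothing -> request               -- no URI part: forwarded unchanged
      Just uri -> uri ++ rest          -- URI part replaces the matched prefix

-- | Model of link rendering: root url, then "/"-separated path pieces.
renderLink :: String -> [String] -> String
renderLink approot pieces = approot ++ "/" ++ intercalate "/" pieces

main :: IO ()
main = do
  -- The combination used in this article: the prefix is stripped cleanly.
  print (upstreamPath "/fountain/" (Just "/") "/fountain/unif")  -- Just "/unif"
  -- proxy_pass without a trailing slash: the prefix reaches the backend.
  print (upstreamPath "/fountain/" Nothing "/fountain/unif")     -- Just "/fountain/unif"
  -- location without a trailing slash: a doubled slash reaches the backend.
  print (upstreamPath "/fountain" (Just "/") "/fountain/unif")   -- Just "//unif"
  -- approot without and with a trailing slash.
  putStrLn (renderLink "https://xkqr.org/fountain" ["unif"])     -- https://xkqr.org/fountain/unif
  putStrLn (renderLink "https://xkqr.org/fountain/" ["unif"])    -- https://xkqr.org/fountain//unif
```

Under this model, the combination described above is the only one where the backend receives a path matching one of its routes and the generated links point back through the /fountain/ location without doubled slashes.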