Static site generator

I built a static site generator. It generates all the content on this site. Let me explain the environment that led me to build it, then I'll discuss some details of how it works.

My environment

I like building things. Software things in particular. I like the malleability of software. I like that new ideas can significantly improve expressiveness, performance, features and developer productivity. I love the feeling of a clean and cohesive code base. I spend hours on a good refactor like I'm engrossed in a pressure washing video. I love the quick feedback loops that few other disciplines can provide.

I'm really into this software stuff.

I'm also very particular about the software I use. I've built a number of web sites with some of the currently (they seem to change so often!) popular static site generators. They make simple things complex or just don't work the way I want. It's hard to build good abstractions that work for everyone.

I understand HTML and CSS. Most static site generators are built to translate other markup languages (like Markdown) into web site assets. Other markup languages are great for some people, but I don't need that abstraction when I can speak the destination markup language.

I need a different type of tool. A tool that makes it easy to manage the complexity of code re-use. A tool that gives me access to the full expressiveness of the destination data format. A tool that can be composed into a larger system. A tool that is simple to understand.

Simple (even leaky) abstraction over HTML/CSS

Instead of choosing a friendlier markup language, let's talk about a structured data representation of HTML and CSS. Once we have structured data, we can simply translate it into HTML and CSS. To get started, let's focus simply on HTML with inline CSS.

The Clojure programming language is my weapon of choice. Its data manipulation primitives make building domain-specific languages relatively easy. Clojure has a popular domain-specific language for representing HTML and CSS. Hiccup is a simple translation of HTML elements and CSS properties into collections/arrays and maps/objects. Here's what Hiccup looks like in a Clojure REPL:

user=> (require '[hiccup.core :refer [html]])
nil
user=> (html [:span {:class "foo"} "bar"])
"<span class=\"foo\">bar</span>"
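To make the translation less mysterious, here's a minimal sketch of a Hiccup-style renderer written from scratch. This is not Hiccup's implementation; render and render-attrs are names I made up for illustration, and the sketch ignores escaping, void elements like br, and Hiccup's :div#id.class shorthand.

```clojure
(require '[clojure.string :as str])

(defn render-attrs
  "Turns an attribute map like {:class \"foo\"} into ` class=\"foo\"`."
  [attrs]
  (str/join (for [[k v] attrs]
              (str " " (name k) "=\"" v "\""))))

(defn render
  "Renders a Hiccup-style vector [tag attrs? & children] to an HTML string."
  [node]
  (cond
    (vector? node)
    (let [[tag & more] node
          ;; the attribute map is optional and always comes right after the tag
          attrs    (when (map? (first more)) (first more))
          children (if attrs (rest more) more)]
      (str "<" (name tag) (render-attrs attrs) ">"
           (str/join (map render children))
           "</" (name tag) ">"))

    (nil? node) ""

    ;; strings, numbers and anything else are emitted as text
    :else (str node)))

(render [:span {:class "foo"} "bar"])
;; => "<span class=\"foo\">bar</span>"
```

The whole idea fits in one recursive function: vectors become elements, maps become attributes, and everything else becomes text.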

Composition

My theory on building static sites is that we can build the most complex static web assets with simple composition. Composition should give us the option to abstract away HTML and CSS (if we want) and build re-usable components (like layouts or common heading styles). I'm confident that this theory will pan out, as function composition is my primary tool for building any software in any language.

Clojure has built-in tools for reading EDN data, which employs the same syntax as Clojure data structures. Let's add the Aero library, which adds a set of tag literals to our EDN content. Aero also makes it easy to implement our own tag literals, which lets us add composition to our static site language. We could build everything discussed here using only clojure.edn, but we'd end up re-implementing some of the code that Aero includes.
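If tag literals are new to you, here's a rough sketch of a custom one using only clojure.edn. Aero has its own extension point for this, which the sketch deliberately does not use. The css-str and read-content functions are hypothetical names, and this #css handler is only my guess at what such a tag might do: turn a map of CSS properties into an inline style string.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.string :as str])

(defn css-str
  "Hypothetical handler for a #css tag: turns a map of CSS properties
  into an inline style string."
  [props]
  (str/join " " (for [[k v] props]
                  (str (name k) ": " (if (keyword? v) (name v) v) ";"))))

(defn read-content
  "Reads an EDN string, handing the value after each #css tag to css-str."
  [s]
  (edn/read-string {:readers {'css css-str}} s))

(read-content "[:div {:style #css {:display :flex}} \"hi\"]")
;; => [:div {:style "display: flex;"} "hi"]
```

A tag literal is just a function applied to the value that follows it while the data is being read, which is what makes it a good hook for composition.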

Process

We're building static content for our website, so let's keep our structured data in static files. We can version static files with git and our tool can guarantee idempotency (the same data will always produce the same output).

Our tool should accept configuration for each of the site's assets, evaluate Aero's tag literals, apply Hiccup rendering, then write the web site assets back to the filesystem.
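The last two steps of that process might look something like the sketch below. It assumes the asset map has already been read (tag literals evaluated) and that its :slug is an output path with a leading slash. build-asset! and its render-fn argument are hypothetical; render-fn stands in for the Hiccup rendering step, and :type is ignored here.

```clojure
(require '[clojure.java.io :as io]
         '[clojure.string :as str])

(defn build-asset!
  "Renders one asset's :content with render-fn and writes the result
  under out-dir at the asset's :slug. Returns the written File."
  [out-dir render-fn {:keys [slug content]}]
  (let [relative (str/replace slug #"^/+" "") ; io/file rejects absolute child paths
        out-file (io/file out-dir relative)]
    (io/make-parents out-file)                ; create intermediate directories
    (spit out-file (render-fn content))       ; overwrite, so reruns give the same file
    out-file))

(comment
  ;; writes public/index.html; pr-str stands in for a real renderer
  (build-asset! "public" pr-str
                {:type :html :slug "/index.html" :content [:h1 "hi"]}))
```

Because the file is overwritten from the same data on every run, building twice leaves the output unchanged.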

Configuration and Content

Let's get into what structured data for an Aero/Hiccup HTML+CSS document might look like. Here's the asset configuration for the index page of this website.

{
    :type :html
    :slug "/index.html"
    :content #template ["pages/index.edn" [:content] {:path "/"}]
}

Let's talk about the #template ["pages/index.edn" [:content] {:path "/"}] section.

#template is a custom tag literal. Think of it like a function being called. It's very similar to Aero's built-in #include tag literal except that it adds additional data into its render context. #template will be the basis of composition from which we build our static content.

["pages/index.edn" [:content] {:path "/"}] are the three arguments to our function.

First is the path to the template's definition. We'll keep the structured data for the :slug "/index.html" asset in another file at "pages/index.edn". Side note: "slug" might have been better named "output-path".

Second is a pull selection that filters the output. In a finished version of this tool, you might consider implementing the EDN Query Language, but let's stick to a simple vector of keys that can be applied to Clojure's get-in. I would set a sane default (like [:content]) to provide some consistent structure to our template files.

Third is a map of input variables. The input is a map so that each template can apply a similar pull syntax for extracting the data it expects. You can include any data structure as input, so it's extremely flexible.
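Here's a small sketch of the "function call" idea behind those three arguments. It is not the real #template implementation: the templates map is a hypothetical in-memory stand-in for the EDN files on disk, each template is modelled as a plain function of its input map, and apply-template is a made-up name.

```clojure
(def templates
  "Hypothetical stand-in for template files, keyed by path.
  Each template is a function of its input map."
  {"pages/index.edn"
   (fn [{:keys [path]}]
     {:content [:a {:href path} "home"]
      :meta    {:title "Home"}})})

(defn apply-template
  "Mimics #template [path selection input]: evaluate the template with
  its input, then pull the selected part out with get-in."
  ([path] (apply-template path [:content] {}))   ; sane defaults
  ([path selection input]
   (let [template (get templates path)]
     (get-in (template input) selection))))

(apply-template "pages/index.edn" [:content] {:path "/"})
;; => [:a {:href "/"} "home"]

(apply-template "pages/index.edn" [:meta :title] {:path "/"})
;; => "Home"
```

The selection is just a vector of keys, so a template can expose several outputs and the caller picks the one it wants.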

Let's take a look at a template definition.

{
  :color #include "../styles/color.edn"
  :content
    #template ["../components/layout.edn"
               [:content]
               {:title "home"
                :body #ref [:body]}]
  :body
    [:div
      {:style
        #css {:display :flex
              :flex-direction :column
              :justify-content :flex-start
              :align-items :flex-start}}
      [:div
        {:style
          #css {:font-size "50px"
                :font-weight 700
                :color #ref [:color :yellow]}}
        "Hi. I'm Adam Tait."]]
}

This is part of the template definition for this site's (adamtait.com) home page. Hopefully the first thing you notice is the resemblance to Hiccup or HTML. The :body is a div with a flexbox column layout that contains a single text element.

In the site configuration, we said we would be pulling [:content] from the evaluated data of this template, so the :content section is the output. :content renders a layout template which uses :body as input. :body is the heart of our template definition.
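To show how :content can point at :body (and :body at :color), here's a sketch of reference resolution built on clojure.edn and clojure.walk. This is not how Aero implements #ref. The Ref record and resolve-refs function are hypothetical, the colour value is made up, and the sketch assumes there are no circular references.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.walk :as walk])

;; Marker produced by the reader for each #ref tag
(defrecord Ref [path])

(defn resolve-refs
  "Replaces every Ref in config with the value found at its path,
  resolving refs nested inside that value too. Assumes no cycles."
  [config]
  (letfn [(expand [x]
            (walk/prewalk
              (fn [node]
                (if (instance? Ref node)
                  (expand (get-in config (:path node)))
                  node))
              x))]
    (expand config)))

(def page
  (edn/read-string
    {:readers {'ref ->Ref}}
    "{:color   {:yellow \"#fc0\"}
      :content [:main #ref [:body]]
      :body    [:div {:style {:color #ref [:color :yellow]}} \"Hi.\"]}"))

(get-in (resolve-refs page) [:content])
;; => [:main [:div {:style {:color "#fc0"}} "Hi."]]
```

Pulling [:content] out of the resolved data gives the finished tree, with :body and the colour already substituted in.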

Are we there yet?

Given what you've seen of our tool so far, you might extrapolate what a larger site would look like. You'll find ways of reducing complexity by adding sane defaults and by refactoring out shared references and templates.

Rich Hickey has a talk titled Are We There Yet? where he talks about incidental complexity. Incidental complexity is hidden; it wasn't requested (or expected), it just comes along for the ride.

Seek simplicity, and distrust it (Alfred North Whitehead)

Our site and template definitions don't add incidental complexity, but they don't hide complexity either. The complexity is (mostly) laid bare. There's not much "magic" to this tool. You have full access to the base languages of the system (HTML and CSS), or you can abstract the Hiccup/HTML/CSS away in templates. You have the power to build your own tool. The tool you build is one that you deeply understand (you created most of it, after all) and that is well adapted to your specific use case.

I built this static site generator because I wanted an honest tool. I wanted to easily understand what data was available at each point. I wanted to build up my abstractions and organize my content in the most intuitive structure for me. Most people would consider this a poor abstraction because it's too raw. What we have built is a tool to build static site generators.

What's next?

As I grow this site, this as-yet-unnamed tool is also maturing. I may eventually open source it (and the code for this site) in its entirety. I'll post an update if it becomes available.

Published: 2020-11-29

Tagged: dsl clojure blog