Migrating a 160k Word Jekyll Blog to Hugo

The site you’re reading this on had been rendered by Jekyll since September 2012. That’s 8 years of content, totalling 160k words. I publish about one post a month, so this is not a small blog anymore. Recently I’ve noticed a slowdown in how prolific I am, and while it would be foolish to put the blame squarely on Jekyll, it certainly deserves some of it.

For me, not only is Jekyll slow, taking over 5 seconds for an incremental build, but I also find it unstable. Jekyll is written in Ruby and I am not a Ruby developer. For these past 8 years, I’d only install Ruby on a machine for Jekyll. Oftentimes upgrading Ruby, Jekyll, or a Jekyll plugin would result in a night of frustration – searching baroque stack traces hoping to find a solution. I’d always stumble upon one eventually, but this time I didn’t.

Not to delve too deep into the issue, as it’s now moot since the migration, but Jekyll was complaining about a 0xFF byte in one of my posts, which most definitely did not contain said byte. I spent a night crawling the internet and couldn’t find a solution. This was the straw that broke the camel’s back, so to speak. I could have worked around this by pinning everything to aid reproducibility, but all those workarounds seemed like they’d only be a band-aid and further obstruct my enthusiasm for writing.

I needed a better way to write.

Enter Hugo

I decided to migrate 106 articles consisting of over 160k words written these past 8 years to Hugo. Why Hugo?

This is huge for me: I can be anywhere on any machine and I’ll have a fully built website faster than if I started a clean Jekyll build.

The actual migration took place over several days and I estimate I sank about 14 hours into it. I have a new appreciation for editors as I had to take breaks to rest my fingers and eyes – it’s hard work! I did it without really any automation, as I made sure I at least skimmed (if not reread) each post. As well as triggering nostalgia, this surfaced typos and missing links, so now things should be better than they were before.

Jekyll Scholar

In the early days of blogging, I found great satisfaction in quoting design pattern books or standards. jekyll-scholar is a Jekyll extension that provides the tools to properly cite references. It’s a good extension, as I always forget the accepted way to format APA citations. While I would have preferred if Hugo had equivalent functionality built in, it was not a tremendous loss to spell out citations in prose.

For instance, I took the following:

{% quote black-book %}

All too many programmers think that assembly language, or the right compiler, or a particular high-level language, or a certain design approach is the answer to creating high-performance code. (pp. 6)

Know when optimization matters—and then optimize when it does! (pp. 15)

{% endquote %}

And turned it into:

All too many programmers think that assembly language, or the right compiler, or a particular high-level language, or a certain design approach is the answer to creating high-performance code. (pp. 6)

Know when optimization matters—and then optimize when it does! (pp. 15)

Abrash, M. (1997). Michael Abrash’s Graphics Programming Black Book. Coriolis Group.

So sometimes I copied and pasted what jekyll-scholar had previously rendered, and other times I swapped in a hyperlink. Since I only had about 2 dozen citations, it was relatively painless to massage each one on a case-by-case basis.

Jekyll Assets

Jekyll assets was a necessary evil for me. I’d routinely run into issues with which dependencies, configuration, or syntax were needed so I could manipulate images. Thankfully, after a few days of cobbling together the requirements, I could back away slowly and it would work. Obviously a brittle system isn’t ideal, and it contributed to my desire to seek better solutions.

Hugo has its own built in image processing pipeline and it’s truly been a non-issue. I can plop down an image either via native markdown syntax or a shortcode that lazy loads the image and serves the image optimized for the client’s screen size.

Code Highlighting

Being a developer naturally predisposes me to writing about code, and code should be highlighted. In the migration, there was a lot of code that needed to be translated from Jekyll’s Liquid syntax:

{% highlight rust %}
let data = utils::request("eng.txt.eu4.zip");
{% endhighlight %}

to the commonly accepted GitHub flavor of Markdown:

```rust
let data = utils::request("eng.txt.eu4.zip");
```

While I’m sure someone has an automated way to change code snippets, I preferred to execute a few sed commands as I reviewed a post:

s/{% highlight rust %}/```rust/g
s/{% endhighlight %}/```/g
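For anyone who would rather script the conversion than run sed per post, the same substitution can be sketched in JavaScript. This is only an illustration, not what I ran: `convertHighlightTags` is a hypothetical helper, and it assumes the language name is the first word after `highlight` and that any extra options (such as `linenos`) can be dropped.

```javascript
// Hypothetical helper (not part of Jekyll or Hugo): rewrites Liquid
// highlight tags into fenced code blocks.
const FENCE = '`'.repeat(3);

function convertHighlightTags(markdown) {
  return markdown
    // {% highlight rust %} becomes an opening fence followed by "rust".
    // Anything after the language name (e.g. linenos) is discarded.
    .replace(/\{%-?\s*highlight\s+([\w+-]+)[^%]*%\}/g, (_, lang) => FENCE + lang)
    // {% endhighlight %} becomes a closing fence.
    .replace(/\{%-?\s*endhighlight\s*-?%\}/g, FENCE);
}

// Usage: convert a small sample and print the result.
const sample = [
  '{% highlight rust %}',
  'let x = 1;',
  '{% endhighlight %}',
].join('\n');
console.log(convertHighlightTags(sample));
```

Unlike the sed commands, the pattern captures the language, so one pass covers every language used on the blog.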

Other Dependencies (CSS, JS)

There were other moving parts in the blog that, while not contributing to the Jekyll slowdown specifically, added to the total build time.

Before migrating, I used PostCSS to transform next-gen CSS syntax with imports into a compressed output. With Hugo extended’s built-in SCSS support, I migrated all the CSS to SCSS.

uncss was removed, as this site uses so little CSS that it is not worth automating as part of the build.

I could ditch all the minifiers as Hugo has built in minifiers.

puppeteer was used to convert my rendered resume from HTML to PDF, but I don’t update my resume often enough to warrant that dependency.

Essentially I took the opportunity during the migration to KonMari a bit and removed any dependency I possibly could. Once the migration finished, I realized that there was nothing left but Hugo.

Hugo Downsides

Hugo has one unsavory aspect: pretty URLs end with a trailing /. So whereas previously I had page URLs looking like:

https://nickb.dev/blog/my-bet-on-rust-has-been-vindicated

Hugo generated URLs look like:

https://nickb.dev/blog/my-bet-on-rust-has-been-vindicated/

Unfortunately, the Hugo maintainer is not receptive to changing this behavior or making it configurable. Maybe this isn’t an issue for most, but whenever a URL changes, the chance of link rot increases. I try my best to be a good netizen.

Trimming the Trailing Slash

I took a two-pronged approach to fixing this issue. First, I tried to fix the link at the source, so everywhere I had:

<a href='{{ .RelPermalink }}'>Read more...</a>

I updated to:

<a href='{{ .RelPermalink | strings.TrimRight "/" }}'>Read more...</a>

A decent alternative, but not foolproof. I don’t want to go back through all my posts and track down relref shortcode usage, so I employed a server side solution. Since this site runs on Cloudflare Workers, here is a snippet that will redirect requests that end in / to their canonical counterpart.

addEventListener('fetch', event => {
  event.respondWith(handleEvent(event));
})

async function handleEvent(event) {
  const parsedUrl = new URL(event.request.url)
  const pathname = parsedUrl.pathname;
  
  // Redirect requests that end with `/` to the same url without `/` so that
  // there's essentially one canonical link
  if (pathname.endsWith('/') && pathname !== '/') {
    parsedUrl.pathname = pathname.slice(0, -1);
    return Response.redirect(parsedUrl, 302);
  }

  // ...
}

I used a 302 temporary redirect as I’m very cautious, though according to Google’s “Consolidate duplicate URLs” documentation, a 301 permanent redirect is more appropriate.
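To make the redirect rule easy to test outside of a Worker, the path logic can be pulled into a pure function. The following is a minimal sketch under my own assumptions: `stripTrailingSlash` is a hypothetical name, and unlike the snippet above it collapses repeated trailing slashes in one hop.

```javascript
// Hypothetical helper, not part of the Workers API: returns the URL to
// redirect to, or null when the request is already canonical.
function stripTrailingSlash(requestUrl) {
  const url = new URL(requestUrl);

  // The site root has no slash-less form, and paths that don't end in `/`
  // need no redirect.
  if (url.pathname === '/' || !url.pathname.endsWith('/')) {
    return null;
  }

  // Drop every trailing slash; the query string is kept as-is.
  url.pathname = url.pathname.replace(/\/+$/, '');
  return url.toString();
}

// Usage inside the fetch handler would look roughly like:
//   const target = stripTrailingSlash(event.request.url);
//   if (target) return Response.redirect(target, 301);
console.log(stripTrailingSlash('https://nickb.dev/blog/my-bet-on-rust-has-been-vindicated/'));
```

Keeping the status code at the call site means moving from 302 to 301 is a one-character change once the behavior has proven itself.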

Disabling Comments

Comments are disabled! After writing a commenting system and then self hosting an isso instance for several years, I’ve decided I don’t want to maintain a way to submit and retrieve comments on this blog. Instead I’ve decided there are two ways to add to the discussion of an article:

Other sites are better equipped to deal with discussions. I’m not.

On the positive side, disabling comments now makes this site JavaScript free (not that isso was heavyweight or all that intrusive).

Comments

If you'd like to leave a comment, please email [email protected]