To the Un-Known!

How I moved from Known to Hugo

April 1, 2020 8 minutes to read

Visitors to this blog might have noticed I’ve recently moved it from Known to Hugo. Doing it without losing IndieWeb features was quite a hassle, I admit, so I felt the need to document the process. Hopefully, my experience will be of use to someone, and even if not, bragging is half the fun of blogging anyway, isn’t it?

Why the move?

I’ve been using Known for almost five years. It was actually Known that introduced me to IndieWeb, and I probably wouldn’t even have considered joining IndieWeb in 2015 if it weren’t for Known. Back then, with a hosted solution available, Known didn’t demand any more technical knowledge than I actually had.

Fast forward to 2020: Known hasn’t lived up to the promise of “Blogger with built-in seamless IndieWeb integration”. I’m not saying a single bad word about Marcus Povey and the other developers who have kept working on Known through all these years, and I’m really thankful for the system they’re developing (in their spare time, mind you). But without the hosted solution onboarding is hard, and hosting a whole CMS for just a blog is a little overkill.

I would have continued with Known, though. The proverbial straw that broke my camel’s back was a little glitch, or rather the fact that I couldn’t fix it myself. You see, I hardly speak any PHP, although I’ve tried teaching myself several times. I understood that the fix would be simple, but I just couldn’t do it myself and had to wait for Marcus to find spare time to look into it. I couldn’t even figure out how to roll back to a working version! The problem here is, of course, me rather than PHP and Composer, which I can’t wrap my head around, but let’s face the fact: PHP and I are incompatible.

I’ve heard good things about Hugo lately, and what persuaded me to give it a try was probably a post by Chris Ferdinandi who moved his site to Hugo from WordPress. I think I bumped into his post last month, around the same time I read a nice article about IPFS and was playing with it, so I decided to make a simple little site with IPFS and Hugo, just to see what it was like.

I fell in love with Hugo immediately. I could figure it out! I felt immense power at my fingertips. I was able to do things with it! I could finally make my own site and really own it. Now it was only a matter of finding enough time for the move, wrote I two weeks ago (in Russian).

Several days later Russia started a full-scale war on the spread of COVID-19. I had all the spare time I needed.

Getting ready

I’ve been running Known on the cheapest DreamHost Shared Hosting account, along with my RSS reader and a couple of other things. Eventually, I would replace it with my new Hugo-generated site, but first I needed to get my data out of Known. Known has an export feature, so that shouldn’t be an issue, right?

Wrong. I’ve been running Known for too long. Gradual upgrades managed to introduce nasty discrepancies in the database, so the exported RSS feed would only go back to late 2016. It would also give me compressed images without the full-size originals. Some links to the original images would be garbled… Too much to fix by hand.

But first I put up a little Hugo-generated prototype on a subdomain (the hosting plan includes five subdomains and I was only using one, so I had spares), dropped in a couple of the latest posts manually by copying and pasting, and started figuring out how much work it would take to make things IndieWeb-compatible.

The IWC wiki was my starting point. I opted for the hyde-hyde theme, which offered some very basic IndieWeb features, and started building manually from there. A nice writeup by Amit Gawande proved very useful in sorting out most of the basics and figuring out what exactly I wanted to save from my Known site.

Eventually, I wrote my own scraper that simply goes through a Known site and saves all it can to a Hugo-compatible folder structure of Markdown files, with all the front-matter and stuff. It even saves webmentions for each page to a JSON file. Basically, all the precious content I’ve posted over the years, except for the 6 pages it couldn’t figure out because even the links to them were butchered by some update in 2016 beyond all repair. Those 6 pages I saved manually, not a big deal.

My tool also generated aliases front-matter fields so that the posts would be reachable at the old URLs. I didn’t bother with picture URLs, though.
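
Hugo’s aliases front-matter field makes it generate a redirect page at each listed path. Here is a minimal sketch of how a converter could write such a file. The write_post function, the file paths, and the YAML layout are my illustration here, not the actual scraper, and it assumes titles contain no double quotes.

```bash
#!/bin/bash
# Hypothetical helper: write one Hugo post whose front matter
# lists the post's old Known path under "aliases".
# Usage: write_post <output-file> <title> <date> <old-url-path> <body-file>
write_post() {
    local out=$1 title=$2 date=$3 old_path=$4 body=$5
    mkdir -p "$(dirname "$out")"
    {
        printf -- '---\n'
        printf 'title: "%s"\n' "$title"
        printf 'date: %s\n' "$date"
        printf 'aliases:\n'
        printf '  - %s\n' "$old_path"
        printf -- '---\n\n'
        cat "$body"          # the post text, already converted to Markdown
    } > "$out"
}

# Example call (all values are placeholders):
# write_post content/posts/hello.md "Hello" 2019-05-01 /2019/hello /tmp/hello-body.md
```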

Then I wrote one last post with Known explaining how this old RSS feed would not be updated anymore and linking to the new one. I then saved the RSS feed to a file and made a redirect to it from Known’s RSS URL in an .htaccess file.

All set. Time to swap symlinks between Known and Hugo’s public.

What about webmentions?

Webmentions are the biggest IndieWeb hiccup with a static site. How do you receive them? How do you send them?

I decided I wouldn’t store incoming webmentions on my site for now. With all the latest insane rules like GDPR, it’s simply too much hassle. Webmention.io suits me just fine, and with its API I can make regular backups in case I change my mind later. As for showing the mentions on the pages of my site, it’s webmention.js for now, and I can change things later if I don’t like the way they are.
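
A backup like that can be a single HTTP request. The sketch below assumes the Webmention.io mentions.jf2 endpoint with its domain, token, per-page, and page parameters; check the service’s documentation before relying on them. The function names, the output file name, and the CURL override are made up for illustration, and only the first page of results is fetched.

```bash
#!/bin/bash
# Sketch: save received webmentions from the Webmention.io API to a dated file.
# Setting CURL to another command allows a dry run without network access.
CURL=${CURL:-curl}

mentions_url() {
    # $1 = domain, $2 = API token, $3 = page number (0-based)
    printf 'https://webmention.io/api/mentions.jf2?domain=%s&token=%s&per-page=100&page=%s\n' \
        "$1" "$2" "$3"
}

backup_mentions() {
    # $1 = domain, $2 = API token, $3 = output directory
    mkdir -p "$3"
    "$CURL" -fsS "$(mentions_url "$1" "$2" 0)" > "$3/mentions-$(date +%Y-%m-%d).json"
}

# Example call (domain and token are placeholders):
# backup_mentions example.com MY_SECRET_TOKEN "$HOME/backups/webmentions"
```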

Sending webmentions is another matter: it can’t be outsourced as easily as receiving them. Somehow, some software must become aware that something has changed on my site. I spent quite some time searching for a suitable tool. It looks like most people running static IndieWeb sites either send webmentions manually (not my thing at all) or use some active component like lazymention or stapibas, while others rely on Netlify’s special magic. I wasn’t moving over to Netlify anyway, my hosting doesn’t allow running a Node.js app, and I didn’t leave Known to end up in a PHP situation again, so I needed another solution.

How hard could it really be anyway? Let’s say I copy all my Hugo-generated public out to some temporary directory, and generate the updated version. All I need is a tool to scan both directories, figure out which pages are new, modified, or gone, scan these pages for links and send the webmentions. And sending webmentions is not that hard anyway: there’s a neat Go library to do just that with a couple of function calls.
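
The comparison step can be sketched with standard Unix tools. This is only an illustration of the idea, not the tool I ended up with: compare_sites is a hypothetical name, it looks at .html files only, and it treats any byte difference as a modification.

```bash
#!/bin/bash
# Sketch: classify pages as gone, new, or modified between two site builds.
# Usage: compare_sites <old-dir> <new-dir>
# Prints one line per page, e.g. "new posts/hello/index.html".
compare_sites() {
    local old=$1 new=$2 old_list new_list page
    old_list=$(mktemp)
    new_list=$(mktemp)
    # Sorted lists of page paths, relative to each build's root.
    (cd "$old" && find . -type f -name '*.html' | LC_ALL=C sort) > "$old_list"
    (cd "$new" && find . -type f -name '*.html' | LC_ALL=C sort) > "$new_list"

    # Present only in the old build: the page is gone.
    LC_ALL=C comm -23 "$old_list" "$new_list" | sed 's|^\./|gone |'
    # Present only in the new build: the page is new.
    LC_ALL=C comm -13 "$old_list" "$new_list" | sed 's|^\./|new |'
    # Present in both builds: modified if the contents differ.
    LC_ALL=C comm -12 "$old_list" "$new_list" | while IFS= read -r page; do
        cmp -s "$old/$page" "$new/$page" || printf 'modified %s\n' "${page#./}"
    done
    rm -f "$old_list" "$new_list"
}

# Example call, matching the directory layout used later in this post:
# compare_sites old/public public
```

Each reported page would then be scanned for outgoing links, which is the part where a proper HTML parser beats shell tools.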

Of course I ended up writing my own tool. It does just that: compares the two directories, figures out what’s new, changed, or gone, and sends out all the webmentions to where they belong. And while it’s at it, it also scans the site for changed XML feeds and sends out the WebSub pings to the hub where they belong.
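
On the wire, both notifications are plain form-encoded POST requests. The sketch below shows them with curl, assuming the webmention endpoint has already been discovered from the target page and that the hub accepts the common hub.mode=publish convention. The function names and the CURL override are made up for illustration and are not taken from my tool.

```bash
#!/bin/bash
# Sketch: the two HTTP requests behind a webmention and a WebSub ping.
# Endpoint discovery is left out. Setting CURL to another command
# allows a dry run without network access.
CURL=${CURL:-curl}

send_webmention() {
    # $1 = webmention endpoint of the target, $2 = source URL, $3 = target URL
    "$CURL" -fsS "$1" \
        --data-urlencode "source=$2" \
        --data-urlencode "target=$3"
}

ping_websub_hub() {
    # $1 = hub URL, $2 = URL of the feed that changed
    "$CURL" -fsS "$1" \
        --data-urlencode "hub.mode=publish" \
        --data-urlencode "hub.url=$2"
}

# Example calls (all URLs are placeholders):
# send_webmention https://example.org/webmention https://me.example/post/ https://example.org/article
# ping_websub_hub https://hub.example/ https://me.example/index.xml
```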

The rest is fairly straightforward. Brid.gy has been my friend for ages; I only recently started trying their fediverse connector, but that’s easy to set up, too. I still have a lot of things to sort out and polish, and I’m hoping to sneak all my IndieWeb-related work into the hyde-hyde theme I’m using. In case it doesn’t get merged, maybe I’ll fork it, but that’s still in the future.

Publishing automation

One last thing to mention: the way I have set up my publishing process for now. Blogging needs to be fun, and doing all the little steps by hand over and over again is the opposite of fun in my book. Ideally, I need a simple process: write a new post or whatever I’m about to publish, hit a button and have all the rest done automagically. Well, what I’ve set up is fairly close.

Hugo is good when paired with Git, so no need to re-invent the bicycle. I write up a post in a Markdown editor, save it to where it belongs in the Hugo directory structure, then git commit and git push. The Git repository is hosted on a Gitolite-powered server that is a Raspberry Pi in my cupboard. There is a server-side Git hook that runs a script (via ssh) in my DreamHost account, and that script is something along the lines of:

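#!/bin/bash
# Stop at the first failing command.
set -e
cd /home/username/hugo/
# Fetch the latest content, including submodules.
git pull --recurse-submodules
# Build the new version of the site into a staging directory.
# -p keeps mkdir from failing if a previous run left the directory behind.
mkdir -p staging
hugo -d staging/public
# Keep a copy of the currently published site for comparison.
mkdir -p old
rsync -am --del public old
ln -s ../../repository/ staging/public/repo
# and other symlinks I need
# for various things I host on this site
# Publish the fresh build, then send webmentions and WebSub pings.
rsync -am --del staging/public ./
static-webmentions
# and clean up the now-unnecessary directories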
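
The server-side hook that triggers the publishing script could look roughly like the post-receive sketch below. Git feeds such a hook one "old new ref" line per updated ref on standard input. The branch name, the SSH host, and the script path are placeholders, and how the hook gets installed under Gitolite is left out.

```bash
#!/bin/bash
# Sketch of a post-receive hook that starts the publishing script over SSH.
# DEPLOY_BRANCH, the SSH host, and the script path are placeholders.
DEPLOY_BRANCH=${DEPLOY_BRANCH:-refs/heads/master}
SSH=${SSH:-ssh}

run_hook() {
    local oldrev newrev refname
    while read -r oldrev newrev refname; do
        if [ "$refname" = "$DEPLOY_BRANCH" ]; then
            # Redirect stdin so ssh does not swallow the remaining ref lines.
            "$SSH" username@example.com /home/username/bin/publish.sh < /dev/null
            break
        fi
    done
}

# In the actual hook file the last line would simply be:
# run_hook
```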

And, of course, the same publishing script is run by cron, too, so I could have had this post published after midnight and not on April 1. But hey, one can argue this whole blog is a fool’s errand, so why not.