A few months ago I relaunched my website, building on Eleventy to replace my use of AnchorCMS, a lightweight, open source, PHP-based CMS that has been out of active development for about three years. I enjoyed Anchor for its simplicity, and I learned a lot from it. I started using it as an alternative to WordPress in 2013. I was in college and had started a blog with WordPress.com the year before, but found it overwhelming for my simple little blog. Anchor afforded me an opportunity to work with something less complicated, and I learned how to set up my own LAMP stack on a VPS to use it. It served me well through 2020 when, thanks to the pandemic, I fell out of what little blogging habit I still had.

When I finally got around to rebuilding my website last year, I knew I wanted to go static for the speed benefits, and with lots of great free hosting options for static sites, saving $5/mo. on a VPS was a nice bonus incentive. Eleventy had been on my radar for a while, and felt like the most approachable choice of static site generator. I launched the site without including my old blog content, and now I've finally gotten around to moving it. Here's how that process went.

Exporting Content from AnchorCMS

First things first, I needed the old post content. Helpfully, Anchor supported writing in markdown, just like Eleventy. Therefore, I thought my best option would simply be to get a database dump of the post content, and then write a script to read through the export and put each post into its own markdown file, with some of the post metadata sprinkled in as front matter. This felt like a great excuse to dust off my Python skills, which turned out to be a great choice as you'll soon see.

It took a few attempts to land on the right export format, but ultimately JSON was the one that did the trick. Anchor uses a MySQL database, so I wrote a quick query and exported the results with MySQL Workbench. Anchor's database is also quite simple, so grabbing the post title, slug, category, description, and created and updated dates was pretty straightforward. If you're using Anchor and want to grab the same data for your posts, the query below should work:

SELECT
	posts.title, posts.slug, posts.description, posts.created, posts.updated, posts.markdown, categories.title AS 'category'
FROM `posts`
INNER JOIN `categories`
ON (posts.category = categories.id);

With this data selected, a JSON export prepared it for easy processing in Python.
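
Before running a conversion over the whole export, it can help to confirm that every record carries the fields the query selected. The sketch below is a minimal check under assumptions: the field names come from the query above, while the sample record, the `REQUIRED_FIELDS` name, and the `missing_fields` helper are hypothetical and only there for illustration. It also assumes the export is a JSON array of objects keyed by column name.

```python
import json

# Field names taken from the SELECT query; everything else here is illustrative.
REQUIRED_FIELDS = {"title", "slug", "description", "created", "updated", "markdown", "category"}

def missing_fields(post):
    """Return the sorted list of expected keys that a post record lacks."""
    return sorted(REQUIRED_FIELDS - set(post))

# Hypothetical sample of what one exported row might look like (made-up data).
sample_export = """
[
  {
    "title": "Hello World",
    "slug": "hello-world",
    "description": "A first post",
    "created": "2013-05-01 09:30:00",
    "updated": "2013-05-02 10:00:00",
    "markdown": "<p>Hello!</p>",
    "category": "General"
  }
]
"""

posts = json.loads(sample_export)
for post in posts:
    problems = missing_fields(post)
    if problems:
        # Report incomplete records instead of failing halfway through a run.
        print(post.get("slug", "<no slug>") + " is missing: " + ", ".join(problems))
```

To check a real export, swap `json.loads(sample_export)` for `json.load()` on the exported file.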

Converting the Data to Markdown Files

The process of outputting each entry to its own markdown file is pretty straightforward:

  • Read in the JSON
  • For each entry:
    • Read all metadata
    • Create a markdown file with the post slug
    • Output metadata as YAML-formatted front matter
    • Write post markdown to the file

Python makes each of these steps pretty easy. First, to read a JSON file, we need to import the json module into our script and open the file:

import json

# open the file
with open('blog_posts.json', 'r') as file:
  post_data = json.load(file)

With this, we now have a list of posts we can loop through and easily read our metadata:

for post in post_data:
  filename = post['slug']
  postTitle = post['title']
  postDescription = post['description']
  postTag = post['category']
  postCreated = post['created']
  postUpdated = post['updated']
  postContent = post['markdown']

Since I was importing nearly a decade's worth of posts, I wanted to organize posts within my Eleventy project instead of dumping a ton of files into the same folder. I decided subfolders for each year would be just enough organization for me. To do this, I needed to parse the post created date for the year and use it to create my folder structure:

import os # import the os package so we can create our output directories
from datetime import datetime # import datetime for date processing

...

# determine date-based output path
# convert created date to datetime object, so we can get the year
created = datetime.strptime(postCreated, '%Y-%m-%d %H:%M:%S')
# convert the updated date to a datetime object
updated = datetime.strptime(postUpdated, '%Y-%m-%d %H:%M:%S')

# some updated timestamps are before created timestamps (possibly a bug from the CMS)
# set updated to the same time if that's the case, since it doesn't make sense
if updated < created:
  updated = created

outputDirBase = 'posts/'
outputDirFinal = outputDirBase + str(created.year) + '/'

# if output directory doesn't exist, create it
# on macOS, this requires sudo permissions
if not os.path.exists(outputDirFinal):
  os.makedirs(outputDirFinal)

One thing I noticed in my database is that a bunch of my posts had updated dates that were earlier than their created dates. I'm unsure how that could have happened, though I speculate that at some point in its history Anchor had a bug that caused this. I added a check for that to set the updated date to the same as the created date, since I wanted to include both of these in the front matter later.
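
The same two steps can be written a little more compactly. This is only a sketch of an alternative, not what the script above does: `max()` picks the later of the two timestamps, and `os.makedirs(..., exist_ok=True)` skips the separate existence check. The helper names `normalize_dates` and `year_directory` are made up for this example, and it writes into a temporary directory so it can run anywhere.

```python
import os
import tempfile
from datetime import datetime

TIMESTAMP_FORMAT = '%Y-%m-%d %H:%M:%S'

def normalize_dates(created_raw, updated_raw):
    """Parse both timestamps and never let 'updated' precede 'created'."""
    created = datetime.strptime(created_raw, TIMESTAMP_FORMAT)
    updated = datetime.strptime(updated_raw, TIMESTAMP_FORMAT)
    return created, max(created, updated)

def year_directory(base_dir, created):
    """Create (if needed) and return the per-year output directory."""
    path = os.path.join(base_dir, str(created.year))
    # exist_ok=True makes this safe to call once per post
    os.makedirs(path, exist_ok=True)
    return path

# An 'updated' value that is earlier than 'created', like the ones in my database
created, updated = normalize_dates('2014-03-02 08:00:00', '2014-03-01 12:00:00')

with tempfile.TemporaryDirectory() as base:
    out_dir = year_directory(base, created)
    year_directory(base, created)  # calling it again does not raise
    dir_created = os.path.isdir(out_dir)
```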

To output all of this to markdown, all we need to do is write the strings to a file. Python also makes this pretty straightforward:

# we read the data, now write to a md file, formatting with front matter for Eleventy
newFileName = outputDirFinal + filename + '.md'
newFile = open(newFileName, 'w')
# start front matter
newFile.write('---\n')
newFile.write('title: "' + postTitle.replace('"', '\\"') + '"\n')
newFile.write('description: "' + postDescription.replace('"', '\\"') + '"\n')
newFile.write('date: \'' + created.strftime('%Y-%m-%d') + '\'\n')
newFile.write('updated: \'' + updated.strftime('%Y-%m-%d') + '\'\n')
newFile.write('tags: [\'' + postTag + '\']\n')
newFile.write('---\n\n')

When outputting the front matter I did hit a few hiccups, which I'll discuss later, but the actual process wasn't too difficult. With front matter written, the last remaining item is the post content itself.

While Anchor supported markdown, it actually stored compiled HTML in its markdown field. Interestingly, it also had an HTML field which... stored the compiled HTML. I vaguely recalled a discussion on GitHub about this very issue and skimmed through it to try to remember why it works this way, but I digress. Since I author all of my posts for Eleventy in markdown, I wanted to convert the HTML back to markdown for consistency. I should note that this isn't required - I could have dumped the HTML into the markdown file and Eleventy would be perfectly fine with it. I, however, would know.

Thankfully, there's the very nice package Markdownify for Python that made this simple:

from markdownify import markdownify # import markdownify

...

# inject blog archive notice box
newFile.write('{% include "partials/blog-archive-notice.html" %}\n\n')
# write the post content, converting from HTML to markdown
newFile.write(markdownify(postContent))
# close the file so everything is flushed to disk
newFile.close()

Just before the post content, I added a Nunjucks include for a little notice I wanted to display on the archive content. Since I'm mass-importing, there's likely something that is a little bit broken, and I want to set reader expectations accordingly.

And that's it! This little script processed all my old blog content into an Eleventy-ready collection of markdown files with the following folder structure:

posts/
  |- 2013/
    |- the-post-slug.md
  |- 2014/
    |- the-other-post-slug.md
  etc...

Building the Site

Post directory in hand, it was time to feed it to Eleventy. Getting this right took a few attempts, and required multiple edits to the processing script before I successfully built the site again. I ran into the following little hiccups along the way:

Permissions Problems

In order for the script to create directories on macOS, I needed to run it with root permissions. This required me to reset the file permissions on all the generated files in order for Eleventy to be able to read them. I reset them with the following:

cd /path/to/posts/
sudo chown -R $(whoami) .

Markdownify Escaping

Markdownify escaped some of the markdown syntax. For example, to make a bulleted list in markdown, you simply use * characters on new lines. Markdownify escaped all of these with a backslash for some reason, which caused an error when Eleventy was trying to compile it. It also escaped asterisks used for bolding, and underscores used for underlining. I solved this with a big find and replace on the directory to replace all instances of \* with * and \_ with _. Markdownify probably has some options I could've used to solve this, but I didn't explore them.
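
For anyone who would rather do that cleanup in Python than in an editor, here is a minimal sketch of the same blanket replacement using only the standard library. The `unescape_markers` name and the sample text are made up for illustration. Note that it also unescapes asterisks and underscores that were meant to be literal, which is the same trade-off as the find and replace.

```python
import re

# Matches a backslash followed by an asterisk or an underscore
ESCAPED_MARKER = re.compile(r'\\([*_])')

def unescape_markers(markdown_text):
    """Drop the backslash in front of every escaped * or _ character."""
    return ESCAPED_MARKER.sub(r'\1', markdown_text)

# Made-up sample resembling the escaped output
sample = '\\* first item\n\\* second item with \\*\\*bold\\*\\* and snake\\_case\n'
cleaned = unescape_markers(sample)
print(cleaned)
```

Markdownify's README also appears to document `escape_asterisks` and `escape_underscores` options that may avoid the escaping at the source; those are untested here.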

Stray HTML Entities

I also had a few instances of encoded HTML entities being output on the site when they're in front matter fields. For example, &#039; appearing in a title instead of a single quote '. This is because Anchor stored them encoded, and they're read as plaintext when Eleventy outputs them into my templates. I also solved this with a find and replace, though I probably could have also added a filter to the template.
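
A third option would be to decode the entities in the script itself, before the front matter is written. Python's standard library has `html.unescape` for this. The sketch below is an alternative to the find and replace, not what I actually ran; the `decode_entities` wrapper and the sample title are hypothetical.

```python
import html

def decode_entities(text):
    """Turn HTML entities such as &#039; or &amp; back into plain characters."""
    return html.unescape(text)

# Made-up title containing an encoded single quote
title = decode_entities('It&#039;s a new site')
print(title)
```

In the script, this would wrap the title and description values before they are written out.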

YAML Formatting Issues

Getting the correct YAML formatting in some of the front matter fields was a little tricky as well. For example, any description text that contained a single quote for a word like "it's" broke Eleventy's compilation due to invalid YAML syntax. As it turns out, YAML has a few options for strings: unquoted, single quoted, and double quoted, each with its own little quirks. Ultimately, I landed on using double quotes since they allow for the use of escape sequences. This required me to be sure to escape any double quotes used in a field, which are less common in these fields (post title and description) than single quotes. You can see this happening in the following lines with my use of .replace():

newFile.write('title: "' + postTitle.replace('"', '\\"') + '"\n')
newFile.write('description: "' + postDescription.replace('"', '\\"') + '"\n')
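
One gap in the .replace() approach is that it leaves backslashes alone, and a backslash inside a double-quoted YAML string starts an escape sequence. A possible alternative, sketched below, is to let `json.dumps` do the quoting: it escapes both double quotes and backslashes, and YAML 1.2 is designed to accept JSON-style strings. Whether the YAML parser behind Eleventy handles every edge case identically is an assumption. The helper names `yaml_double_quoted` and `front_matter_line` are made up for this example.

```python
import json

def yaml_double_quoted(value):
    """Return value as a double-quoted string with quotes and backslashes escaped."""
    # ensure_ascii=False keeps accented and other non-ASCII characters readable
    return json.dumps(value, ensure_ascii=False)

def front_matter_line(key, value):
    """Build one 'key: "value"' front matter line, newline included."""
    return key + ': ' + yaml_double_quoted(value) + '\n'

# Made-up title with a single quote, double quotes, and a backslash
line = front_matter_line('title', 'It\'s a "quoted" title with a \\ backslash')
print(line, end='')
```

Single quotes pass through untouched, so a word like "it's" needs no special handling.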

With each of these issues resolved, I had all my old blog post content in my Eleventy site!

Last Steps

Before publishing all the archive content, I had one last required step: redirects. I've posted links to my blog posts in various places over the years (like on Twitter), and I planned to shut down my old site (which lived at blog.joshvickerson.com), but I didn't want the old links to die. Since my old blog and new blog both follow the same URL structure, /posts/the-post-slug, the only thing I needed to do was set up a redirect from blog.joshvickerson.com to joshvickerson.com with the path preserved, and it should all just work. To do this, I added a URL redirect record in Namecheap, where I manage my domains.
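
To make the intended mapping concrete, here is a small sketch of what a path-preserving redirect should do to an old link. It only computes the expected target; it does not make any network requests. The `expected_redirect` helper is hypothetical, and the https scheme in the example URL is an assumption.

```python
from urllib.parse import urlsplit, urlunsplit

OLD_HOST = 'blog.joshvickerson.com'
NEW_HOST = 'joshvickerson.com'

def expected_redirect(old_url):
    """Swap the old blog host for the new one, keeping path and query intact."""
    parts = urlsplit(old_url)
    if parts.netloc != OLD_HOST:
        # Not an old blog link, so leave it alone
        return old_url
    return urlunsplit(parts._replace(netloc=NEW_HOST))

target = expected_redirect('https://blog.joshvickerson.com/posts/the-post-slug')
print(target)
```

A list of old post URLs run through a function like this gives a checklist of targets to spot-check once the redirect record is live.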

I do, however, have one task remaining at the time of writing: migrating images. I didn't have many images on my old blog, but there were a few. I haven't yet decided how I want to handle images for my new blog, so I've held off on migrating the images until I figure that out (send me a toot if you have suggestions on this topic!).

Closing Thoughts

Overall, I found this process pretty easy. The little bumps along the way were just that: little. It felt like a daunting task when I first set out, which is part of why I put off the migration for so long. In reality, I spent 4 or 5 hours on the whole process, not including writing this post. I decided to write this post in the hopes that this helps anyone else who may wish to migrate away from AnchorCMS, as well as showcase the ease of adopting Eleventy. If you can get your content into JSON, reformatting to markdown isn't too bad.

I've debated turning the script into a more configurable JSON-to-markdown CLI tool, but haven't started on that (again, send me a toot if this is something you'd be interested in seeing or contributing to). My script in full is below, for anyone who wishes to use or adapt it for their own content migration needs. The script is also available as a GitHub Gist.

import json
import os
from datetime import datetime
from markdownify import markdownify

outputDirBase = 'posts/'

# open the file
with open('blog_posts.json', 'r') as file:
  post_data = json.load(file)

  for post in post_data:
    filename = post['slug']
    postTitle = post['title']
    postDescription = post['description']
    postTag = post['category']
    postCreated = post['created']
    postUpdated = post['updated']
    postContent = post['markdown']

    # determine date-based output path
    # convert created date to datetime object, so we can get the year
    created = datetime.strptime(postCreated, '%Y-%m-%d %H:%M:%S')
    # convert the updated date to a datetime object
    updated = datetime.strptime(postUpdated, '%Y-%m-%d %H:%M:%S')

    # some updated timestamps are before created timestamps (possibly a bug from the CMS)
    # set updated to the same time if that's the case, since it doesn't make sense
    if updated < created:
      updated = created

    outputDirFinal = outputDirBase + str(created.year) + '/'

    # if output directory doesn't exist, create it
    # on macOS, this requires sudo permissions
    if not os.path.exists(outputDirFinal):
      os.makedirs(outputDirFinal)
    
    # we read the data, now write to a md file, formatting with front matter for Eleventy
    newFileName = outputDirFinal + filename + '.md'
    newFile = open(newFileName, 'w')
    # start front matter
    newFile.write('---\n')
    newFile.write('title: "' + postTitle.replace('"', '\\"') + '"\n')
    newFile.write('description: "' + postDescription.replace('"', '\\"') + '"\n')
    newFile.write('date: \'' + created.strftime('%Y-%m-%d') + '\'\n')
    newFile.write('updated: \'' + updated.strftime('%Y-%m-%d') + '\'\n')
    newFile.write('tags: [\'' + postTag + '\']\n')
    newFile.write('---\n\n')
    # end front matter
    
    # uncomment the next line to inject blog archive notice box
    # newFile.write('{% include "partials/blog-archive-notice.html" %}\n\n')
    
    # write the post content, converting from HTML to markdown
    newFile.write(markdownify(postContent))
    # close the file so everything is flushed to disk before the next post
    newFile.close()
    # log output
    print("Wrote: " + newFileName)