I've run Funkatron.com on ExpressionEngine for a long time. It's a strong, flexible, powerful CMS that I recommend to many folks who need to build content-driven sites, especially those where the site will be administered by non-programmers. It's a very solid choice for consultants and freelancers, who can build the relatively small license cost into their fees -- and probably save the client in the long run.
However, over the years, I've found ExpressionEngine to be overkill for a blog site. If you're only maintaining one or two types of content (like blog posts and links), the overhead involved in maintaining the system becomes unwieldy. I've long put off upgrading to EE2 because I dreaded having to convert my templates, find replacements for incompatible addons, and finding all the little bugs that seem to pop up in a major version upgrade.
Jekyll is quite the opposite of a system like ExpressionEngine. It's entirely command-line driven, so you have to be comfortable with the shell and some scripting. It's really aimed at web programmers who want a simple tool to publish a blog, without worrying about all the overhead of databases, web-based administration, and a monster code base. You have very fine-grained control over how everything works.
The simplicity of Jekyll has been appealing to me since I encountered it when messing around with the GitHub Pages feature. I never ended up doing a lot with GitHub Pages, but the idea of a simple, script-driven blog generator has continued to be very appealing to me.
A few weeks ago, I decided I either needed to bite the bullet and either upgrade to EE2, or switch to another system. I decided to do the latter, and give Jekyll a shot.
This post won't go into every detail about running a Jekyll site. You should definitely read the docs and get familiar with how it works. I'll just be covering issues of note particular to my move from EE to Jekyll. I also encourage you to check out my GitHub repo for funkatron.com, which contains almost everything I use for my Jekyll site.
The biggest challenge was sorting out how to get the data out of ExpressionEngine. Each entry in my blog "weblog" (an EE term for a collection of a content type) consisted of these fields:
From that, I needed to get my posts into flat files, consisting of some Yaml at the top, and the post content underneath.
The typical approach for exporting data from an existing system is to directly query the DB, but this is a bit more complex with EE. EE has a very, very complex schema that's basically a case study in why RDBMSes are not suitable for flexible content storage. So yeah, I could dive into that monster, but I was not excited about doing it.
Thankfully, a little Googlin' turned up a much more sensible solution: Just use EE itself to generate a dump file. Because you can generate just about any kind of content format with EE's templates, you can let the CMS itself do the dirty work of pulling the post content together, and format it as needed.
I basically followed the approach Robin Fay suggests in her blog. She's dumping data out to import into Movable Type, though, so I had to do things a little differently.
exp
exp
group called
comments
.Yes, this will output the comments as HTML. More on that later. 3. I
created a template in the exp
group called
posts_jekyll
.
Here I am generating the Yaml "front matter" to include some required
stuff (like layout
and title
) and also include
some custom properties, like author
and
old_entry_id
. I'm also spitting out category
in the Yaml-prescribed way to do an array of items. Some of these custom
properties I didn't end up needing, but I think it's much better to have
more data than less, especially when I wasn't 100% sure of what I'd
need.
body
and
extended
after the Yaml, without any wrapping HTML – these
properties will be output post-formatting filters, so they'll already be
HTML. Finally, if the post has comments, I embed that
comments
template we created to spit them out as HTML. 4.
Finally, I pulled the data down with curl and into a local file
<pre><code>curl http://funkatron.com/exp/posts_jekyll > ee_posts.txt</code></pre>
I iterated and refactored this export dump a lot, because I had to be sure it contained all the data I needed. I was dealing with nearly 1000 posts, so fixing them afterwards if I had missed something would have been a huge pain. Depending on your own ExpressionEngine setup, you may have more or less to worry about. Be prepared to experiment.
Note that in my dump, I just have the comments embedded at the end of the posts as HTML. I did this because:
So basically, I wanted to keep what was there for posterity, but I wasn't going to include commenting ability in the new setup. That made things easier for me. If you want to keep comments, you might look into dumping your comment data into Disqus.
Looking back, I probably could have delineated the comments section better. I'm not sure I'm 100% happy with how I did it, but it's acceptable.
So I've generated one big text file with all my post data. Jekyll wants each post in its own file, so I needed to sort out how to split the dump file. Anticipating this, I included a bit of HTML at the end of each post to mark the end:
<!-- end of post -->
Jekyll is also rather picky about how the files are names, so I needed to parse the Yaml to generate the filename.
I looked into several bash scripting approaches, but I had trouble adapting them to what I needed. In the end I went with what I know, and wrote a small PHP shell script to split the files as I needed:
This script pulls in the contents of the file, splits it based on my
post delimiter, and then iterates over each post in the
$split
array, pulling out the Yaml data and using
preg_match()
to grab the date
and
url_title
properties to build the filename. It then writes
out the post file with that filename.
Now I had a bunch of Jekyll-format post files. I put these in the
_posts
directory.
Probably the hardest bit of this process was figuring out the
templating system. There aren't any examples or best practices described
in the Jekyll docs, so I ended up futzing around a lot. What helped the
most was finding a couple folks who put their entire blog up on
GitHub, so I could see exactly how they did it and rip them
off borrow their approaches.
Things made a lot more sense when I realized that the templates
themselves can have Yaml at the top, and they can use other templates as
a "parent." So in my setup, I have a base layout called default.html
,
and a post.html
which uses default.html
as its parent. When
post.html
is parsed, its content is injected into the
default.html
template in the {{content}} placeholder
.
I also have a page.html
layout template for non-post,
informational pages (like "About"). So in my _posts
files,
I put layout: post
in the Yaml, and it will use the
post.html
layout. In my other pages, I use
layout: page
.
Getting my site looking the way I wanted was a bit of a challenge, because Jekyll will regenerate the entire site every time you republish. It didn't help that I am using Less, which also requires a conversion step. If I make one change to a template, it will have to process nearly 1000 posts. Making changes was a bit of a chore because of this. Two things helped here:
_posts/
directory while
refactoring page design. This makes the generation time much more
tolerable.Depending on your previous setup, you may need to redirect old URLs
to the new ones on your Jekyll site. I do this with an
.htaccess
file on my server (not maintained in Jekyll,
although it could be). It handles more than just stuff for Jekyll, but
here's the pertinent bits for redirecting:
In my Jekyll directory I have a _bin/
folder that
contains a few scripts. The most important one is
rebuild.sh
, which is what I run when I have finished a post
and I'm ready to make it live. The file contains some private info so
it's not in my
GitHub repo, but I included it here with the juicy bits
redacted:
In order, this script:
cd
s to my local Jekyll site directoryjekyll
, which regenerates the site in the
_site
directoryjekyll
finishes, rsync
s the contents
of _site
up to the server hosting funkatron.com.So when I'm happy with this post, I'll just run
_bin/rebuild.sh
, and everything is generated and deployed
for me. It's very clean, very reliable, and I know everything that
happens. That's nice.
One warning: DO NOT use rsync
's --delete
option if you're deploying into a directory that has other,
non-Jekyll-maintained content. I blew away a bunch of files this way,
but thankfully had backups.
Another minor issue is that the RSS feed that is used to generate items.html
updates way more often than I rebuild the blog, so it's woefully out of
date after a day or so. It would probably be better to generate this
page in particular with its own cron job, but I haven't been bothered to
address this yet. Folks can just subscribe to the RSS
feed if they're really compelled.
I consider the move from ExpressionEngine to Jekyll for my blog a pretty big success. It removed a lot of unnecessary overhead and maintenance tasks. It also kinda reminded me that publishing content doesn't have to be a complex process. You don't have to make an application just to publish stuff online. You just need some HTML pages, and maybe a few scripts to make the tedious stuff a bit easier.