Date published: 14 Dec 2011
I recently converted my blog hosted on wordpress.com
to use the Jekyll static blog generator.
I'm pretty happy with the results, and it was a fun process getting here, so I
thought I would share my lessons learned.
Lesson #1 - The Docs Aren't Great
First, let me say that Jekyll was not intended to be a commercial product.
Hell, it wasn't even designed to be a terribly popular project. It was a
side-project of one of the GitHub creators, Tom Preston-Werner. So it's easy
to understand why there isn't a ton of good documentation.
So what do we have? Well, we have a lot of great tutorials from bloggers
who have learned to use Jekyll. After reading a few of the better ones, you
should be on your way to rolling your own awesome blog. Here are some of the
ones that I really liked:
Lesson #2 - There's No Official Jekyll Skeleton
So what questions are left unanswered? Well, for starters,
how do you create a site?!? The
Jekyll usage guide does a
decent job showing you which files are necessary, but it doesn't actually
tell you what those files need to contain. Instead, you're supposed
to clone the source for someone else's site on GitHub (or
Bitbucket or whatever) and then change it to fit your
needs.
So here's basically how I created my "base" Jekyll site, which didn't include
any blog content:
- I cloned Tom Preston-Werner's site from GitHub.
- I deleted his CSS files (because I wanted a site that looked very different).
- I deleted his blog entries.
I also had to install the following gems:
That's it! Of course, I had a very ugly and empty blog at this point, but I
had the bare essentials that I needed to start using Jekyll.
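If you would rather not start from someone else's site, here is a minimal sketch of what those "bare essentials" can look like. This is not how I built my site. The function name, the file contents, and the sample post are all made up for illustration; the only things Jekyll itself cares about are the `_config.yml`, `_layouts` and `_posts` names, the YAML front matter, and the Liquid tags.

```shell
#!/bin/sh
# Hypothetical helper: create a bare-bones Jekyll site in the given directory.
make_jekyll_skeleton() {
    site=$1
    mkdir -p "$site/_layouts" "$site/_posts"

    # Site-wide settings (illustrative; an empty file should work too).
    cat > "$site/_config.yml" <<'EOF'
permalink: pretty
EOF

    # The layout wraps every page; {{ content }} is where the page body goes.
    cat > "$site/_layouts/default.html" <<'EOF'
<!DOCTYPE html>
<html>
<head><title>{{ page.title }}</title></head>
<body>
{{ content }}
</body>
</html>
EOF

    # The home page lists every post that Jekyll finds in _posts.
    cat > "$site/index.html" <<'EOF'
---
layout: default
title: Home
---
<ul>
{% for post in site.posts %}
  <li><a href="{{ post.url }}">{{ post.title }}</a></li>
{% endfor %}
</ul>
EOF

    # Post file names must start with a YYYY-MM-DD date.
    cat > "$site/_posts/2011-12-14-hello-world.md" <<'EOF'
---
layout: default
title: Hello World
---
This is my first post.
EOF
}

# Example: make_jekyll_skeleton ~/myblog
```

Running Jekyll inside the new directory should then give you the same kind of ugly, empty, but working blog.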
Lesson #3 - Converting HTML to Markdown Is Tricky
Remember, my blog was previously hosted on Wordpress.com, which means that I
had to save my posts as HTML. I'm not a huge fan of writing my blog posts
using HTML, so I decided to switch to one of the other markup languages that
Jekyll supports, Markdown.
This was great for all new posts, but did I really need to convert my old
posts? Doesn't Jekyll support HTML too? Well yes, it does, but the HTML that I
was able to extract using the converter that came with the Jekyll gem was
pretty messy.
So I decided to make every blog post use Markdown, hell or high water. Not
only would it make all of my posts compatible with Jekyll, but it would make
it easier to edit or convert my old posts in the future.
The Script
I therefore wrote the following script to help:
This script basically does the following:
- It writes the YAML front matter to a .md file.
- It then tries to convert the HTML content to Markdown using
html2text.py. This script does a
very good job of converting HTML to Markdown, but it failed for me about 40%
of the time.
- If html2text.py does fail, it converts the HTML using
pandoc, which is much more reliable but
worse at generating perfect output.
- It writes the Markdown output to the same .md file that contains your YAML front
matter.
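The steps above can be sketched roughly like this. Please treat it as an illustration, not as my actual script. It assumes that each exported .html file starts with YAML front matter between two `---` lines, that html2text.py sits at the path in `HTML2TEXT`, and that html2text.py exits with a non-zero status when it fails. The function names are made up.

```shell
#!/bin/sh
# Hypothetical sketch: convert "YAML front matter + HTML" files to Markdown.
# Assumed location of html2text.py; override it in the environment if needed.
HTML2TEXT=${HTML2TEXT:-./html2text.py}

# Print the front matter block, including both "---" delimiter lines.
front_matter() {
    awk '/^---[[:space:]]*$/ { n++; print; if (n == 2) exit; next }
         n == 1 { print }' "$1"
}

# Print everything after the closing "---" line (the HTML body).
html_body() {
    awk 'n >= 2 { print } /^---[[:space:]]*$/ { n++ }' "$1"
}

convert_post() {
    src=$1
    out="${src%.html}.md"
    body=$(mktemp)
    html_body "$src" > "$body"

    # Step 1: start the .md file with the YAML front matter.
    front_matter "$src" > "$out"

    # Steps 2-4: try html2text.py first, fall back to pandoc on failure,
    # and append whichever result we got to the same .md file.
    if markdown=$(python "$HTML2TEXT" "$body" 2>/dev/null); then
        printf '%s\n' "$markdown" >> "$out"
    else
        pandoc -f html -t markdown "$body" >> "$out"
    fi
    rm -f "$body"
}

# Example: convert_post 2011-01-01-some-post.html
```

Capturing the html2text.py output in a variable before writing it means that a half-finished conversion never ends up in the .md file.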
I stored all of my exported HTML files in a folder called _archivedposts.
Here's how I generated my blog's content:
$ cd _archivedposts
$ for f in *.html; do ../htmlplusyml2mkd.sh "$f"; done
$ mv *.md ../_posts
Cleanup
Of course, neither html2text.py nor pandoc is perfect, so a lot of
my blog posts were a little mixed up. Manually cleaning up every single one of
my blog posts would have been a major waste of time and effort for me, so I did
the following:
- I checked my Wordpress stats to see what my most "popular" posts were.
- I made a list of every post that had more than 50 page views (which ain't bad for
my site).
- I manually made the final touches on those files and ignored the rest.
If you do find a post on my blog that looks a bit jumbled, then I apologize,
but it just wasn't worth my time to fix it manually.
Lesson #4 - Creating Your Own Website From Scratch Is Fun
I used to spend hours every week in college manually tweaking the HTML and
JavaScript in my web sites in the Sun lab. It was lots of fun creating
something that I could share with the entire world using these new and exciting (at
the time) technologies.
Then I got a little older and busier, and while I still loved to write on my
web site, I didn't want to have to take care of every single aspect of it any
more. So I started using tools like Plone and
Wordpress to author content. They had nice little
WYSIWYG editors, and someone else worried about things like style, usability,
and performance.
Making my new blog from (near) scratch forced me to think outside of my
usual box about those icky things, and I'm really glad it did. For starters, it
gave me an opportunity to use the underused, more artistic part of my brain.
Also, it gave me an opportunity to learn about new web standards and tools
for managing and creating a modern web site, such as:
- Chrome's "Developer Tools": Chrome has a built-in module that helps you
do things like design and profile a web site. This tool was especially useful
to me when I was tweaking my CSS.
- Google Web Fonts: Did you know that there were web apps that did nothing
but serve up pretty fonts that could be used by other web sites? Me neither,
until I started looking into Typekit and
Google Web Fonts.
- Google Analytics: One of my favorite things about Wordpress.com is that
they have a great statistics page that you can use to gauge the
popularity of your blog. However, you can also use
Google Analytics to gather the same basic
statistics (and more) for your static blog. And you can do this all for
free.
Lesson #5 - Converting Comments Is Hard
Simply put, Disqus choked every time I tried to convert my
Wordpress.com comments over, so I just skipped this step. I hope that I don't
offend anyone who's left a comment on my blog in the past, but I only had a
handful in the first place.
Lesson #6 - There's A Jekyll Fork That Makes Some Of These Hard Things Easier
There are lots of Jekyll forks out there that take care of a lot of the
gripes that you see above, but I'm sticking with the canonical copy for now to
make things a little simpler.
Awestruct does look especially compelling to me, and it seems
to be pretty well supported. Once I'm a little more comfortable with my new
site and Jekyll in general, I'll give it a second look.
My Repo
I published the source for my blog here:
It's a little rough, but hopefully it's a good jumping off point.
Good luck!
Date published: 17 Sep 2011
This tutorial shows you how to set up a light-weight mail server on your Ubuntu
system that can send mail to host-only (e.g. tom) and remote
(e.g. tom@tompurl.com) addresses using Gmail as your
SMTP server.
So what the heck does that mean? We’re making it possible for you (and various
programs on your computer) to do the following:
$ echo "Hello!" | mail -s "This is cool" tom # Sent to /var/mail/tom spool
$ echo "Hello!" | mail -s "This is cool" tom@tompurl.com # Sent to my Gmail account
So now you may be asking yourself, “Why would anyone want to do something like
this on a desktop machine that isn’t a mail server? Can’t you just send email
using programs like the Gmail web client and Thunderbird?” You certainly can,
but it’s not always the best choice.
For example, if you wanted to send an email message from a shell script, the
easiest way to do that is to use the mail command above. Also, your
system may want to send you a message if something weird happens, like a
failed cronjob. Without a working mail server like Exim installed and
configured, those messages are going to end up in /dev/null. So let’s get
started :)
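To make the shell script case a bit more concrete, here is a minimal sketch of a wrapper that runs a command and mails you its output only if it fails. The function name and the `MAIL_CMD` variable are made up for this example; the only real thing here is the `mail -s` command shown above.

```shell
#!/bin/sh
# Hypothetical helper: run a command and mail its output if it fails.
# MAIL_CMD is only a hook that lets you swap in another mailer.
MAIL_CMD=${MAIL_CMD:-mail}

run_and_notify() {
    recipient=$1
    shift
    if output=$("$@" 2>&1); then
        return 0
    else
        status=$?
        # Same pattern as above: message body on stdin, subject after -s.
        printf '%s\n' "$output" |
            "$MAIL_CMD" -s "Command failed ($status): $*" "$recipient"
        return "$status"
    fi
}

# Example: run_and_notify tom /usr/local/bin/nightly-backup.sh
```

The recipient can be a local account like tom or a full address like tom@tompurl.com, just like in the two commands above.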
Prerequisites
This tutorial is designed to work with Ubuntu Linux 11.04, but
it may work with other versions of Ubuntu and Debian Linux.
Here are all of the pertinent software versions that I’m using:
exim4-base 4.74-1ubuntu1.2
exim4-config 4.74-1ubuntu1.2
exim4-daemon-light 4.74-1ubuntu1.2
libmailutils2 1:2.1+dfsg1-7build1
mailutils 1:2.1+dfsg1-7build1
mutt 1.5.21-2ubuntu3
I used
this tutorial on using Exim with Gmail
to set up outgoing mail. If my instructions below don’t work for you, then that
tutorial may be able to help.
Software Installation
This part is super easy:
$ sudo apt-get install exim4-base mailutils mutt
Note: We’re using exim (the Debian default) as our mail server
instead of postfix, which is the default mail server in the Ubuntu world.
You probably don’t care, and for 99% of you it shouldn’t matter. I’m just
pointing it out because this is an Ubuntu-centric tutorial.
The mailutils package gives you a lightweight version of the exim daemon along
with the mail and mailx programs, which are pretty important if you ever want to
be notified by your system when something strange happens.
Finally, we’re installing mutt, which is a mail reader that you can use in a
console. Please note that you will need to install this program (or something
similar) if you want to read mail that is sent to you by your system. Showing
you how to use mutt is beyond the scope of this tutorial, but if you need some
basic guidance, then I recommend My First Mutt.
Configuration
First, let’s configure exim with debconf using the following
command:
$ sudo dpkg-reconfigure exim4-config
You will now be presented with a configuration wizard. Here’s what
I chose:
- Server Type
- System mail name
- listening ip address
- Other destinations
- machines to relay for
- smarthost ip address
- Hide local mail name
- DNS Queries
- Delivery method
- Split config?
Next, execute the following command:
$ sudo chown root:Debian-exim /etc/exim4/passwd.client
The only step left is to specify your Gmail password.
Open /etc/exim4/passwd.client and add something like this at the bottom of the
file:
*.google.com:tom@tompurl.com:somethingClever
Of course, you’ll want to replace the email address and password :) Please note
that this config works with normal Gmail accounts and accounts that use Google
Apps For Your Domain (like mine).
Testing
Now let’s run a couple of simple tests:
# Please replace "me" with your user account name and verify in mutt
$ echo test | mail -s "test" me # Sends mail to /var/mail/me spool
# Please replace "me@gmail.com" with your actual Gmail address
$ echo test | mail -s "test" me@gmail.com # verify using Gmail
Conclusion
That’s it! I hope that I’ve been able to help a few other people
Date published: 12 Aug 2011
10/27/11 Update - The instructions below work with version 0.9.8 of
Graphite. A new version (0.9.9) has been released that requires a few more
steps. I haven't had time to test out the new version myself yet, but I've been
told that
the following tutorial
does a good job of showing you how to install the latest
version.
This tutorial shows you how I installed Graphite, a fantastic tool for
visualizing time-series data, on an Ubuntu 10.04 LTS system. The process is
split up into 4 steps:
- Installing and testing Graphite and Carbon in “dev” mode
- Integrating Graphite with Apache
- Making Carbon a managed service
- Password-protecting your Graphite site
Installing In Dev Mode
By “dev” mode, I mean that we’re going to install, run and test Graphite and
Carbon in a “quick-and-dirty” way. You will run all services using your
personal account and you won’t integrate it with a web server (yet). So why am
I doing this? Well, usually it takes less time to set up an app this way, which
saves me time when evaluating new software. Also, I find that you learn a
little more about the “guts” of a new application when you start this way. Of
course, once you have evaluated Graphite and decided to install it on a
separate system, you should skip the “Dev Mode” step and just install it as
a managed service (which I explain later).
Installation
First, let’s install everything that we can using apt-get:
$ sudo apt-get install bzr python-cairo python-django
The bzr program will be used to download the Graphite source files.
The other packages will support Graphite at runtime. Next,
download the source and compile Graphite:
$ cd ~/src
$ bzr branch lp:graphite
$ cd graphite
$ python ./setup.py build
$ sudo python ./setup.py install
Note: The last step will install the executables under /opt/graphite.
Next, we’ll install Whisper, the custom database that Graphite
uses:
$ cd ~/src/graphite/whisper
$ python ./setup.py build
$ sudo python ./setup.py install
Finally, let’s install Carbon. Carbon is an agent that listens for readings
and writes them to the Whisper databases:
$ cd ~/src/graphite/carbon
$ python ./setup.py build
$ sudo python ./setup.py install
Now let’s configure Carbon:
$ cd /opt/graphite/conf
$ sudo cp carbon.conf.example carbon.conf
$ sudo cp storage-schemas.conf.example storage-schemas.conf
Please note that you will probably want to reconfigure the storage-schemas.conf
file soon. We are using the defaults now because we just want to get a base
system up-and-running.
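When you do get around to reconfiguring it, the main decision is how many data points to keep at which resolution. Here is a small sketch of that arithmetic. It assumes that your version writes retentions as secondsPerPoint:numberOfPoints, so please check that against the storage-schemas.conf.example file that you just copied. The function name is made up.

```shell
#!/bin/sh
# Hypothetical helper: print "secondsPerPoint:points" for readings stored
# at the given resolution (in seconds) and kept for the given number of days.
# The retention format itself is an assumption; see the note above.
retention() {
    seconds_per_point=$1
    days=$2
    echo "$seconds_per_point:$(( days * 86400 / seconds_per_point ))"
}

retention 60 30    # one reading per minute, kept for 30 days: 60:43200
```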
Now, since we’re still in “dev” mode, let’s make our experience a little bit
nicer by making your regular user account the owner of the /opt/graphite folder.
This will make it easier for you to do things like change config options and
restart services. Don’t worry – eventually we’re going to fix this:
$ cd /opt
$ sudo chown -R myid:myid graphite
Of course, you would replace the myid value with your login name. Now we
are ready to initialize the Whisper database. Execute the following command:
$ cd /opt/graphite
$ PYTHONPATH=`pwd`/webapp:`pwd`/whisper python ./webapp/graphite/manage.py syncdb
That last command will generate your initial databases and prompt you to
create a Django user. This user account will allow you to log into Graphite,
and it is a web application user that is managed by the Django library. I
recommend creating the user, especially if you are not very familiar with
Django as a framework.
Note: Like most Django apps, you can manage this user and add others later by
visiting
http://your-graphite-url:8080/admin
OK, There’s one more configuration step that you need to run. Execute the
following:
$ echo DEBUG = True > /opt/graphite/webapp/graphite/local_settings.py
Testing
Now for the fun part. Let’s fire up the web UI:
$ cd /opt/graphite
$ PYTHONPATH=`pwd`/whisper ./bin/run-graphite-devel-server.py --libs=`pwd`/webapp/ /opt/graphite/
You should now be able to visit http://localhost:8080 and see a very nice web
application. If you’re hosting this application on a VM or separate machine,
then simply replace “localhost” with the IP address of that machine. The web
app should now be running, but there’s not really any data yet. To get some, we
need to do the following:
- Start carbon, which listens for data and writes it to the
whisper databases
- Start feeding it some data using a test client.
Number 1 is pretty easy:
$ cd /opt/graphite
$ PYTHONPATH=`pwd`/whisper ./carbon/bin/carbon-cache.py --debug start
Now that your web app and data collection daemon are running, let’s start
feeding it some data:
$ ~/src/graphite/examples/example-client.py
This script will create the following monitors in Graphite:
- Graphite -> system -> loadavg_15min
- Graphite -> system -> loadavg_1min
- Graphite -> system -> loadavg_5min
Clicking on a monitor shows its values in the graph. Clicking on
the same monitor again deselects it.
Note: If you’re not seeing any data immediately, don’t worry. Check
it again in 5 minutes.
The example client writes data to Graphite once per minute, so you should start
seeing results soon.
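The example client isn't the only way to feed Carbon. Carbon also accepts plain lines of text in the form "metric.path value timestamp". Here is a minimal sketch that sends one reading by hand. It assumes that Carbon's line receiver is listening on port 2003 (see LINE_RECEIVER_PORT in carbon.conf) and that you have nc installed. The function names, the `CARBON_HOST` and `CARBON_PORT` variables, and the metric name are made up.

```shell
#!/bin/sh
# Format one reading as a plaintext line: "<metric path> <value> <unix time>".
# The timestamp defaults to the current time.
carbon_line() {
    printf '%s %s %s\n' "$1" "$2" "${3:-$(date +%s)}"
}

# Hypothetical helper: send one reading to a Carbon listener using nc.
# Host and port are assumptions; adjust them to match your carbon.conf.
send_metric() {
    carbon_line "$1" "$2" | nc -w 1 "${CARBON_HOST:-localhost}" "${CARBON_PORT:-2003}"
}

# Example: send_metric test.my_first_metric 42
```

Just like the example client's monitors, a metric sent this way should show up in the tree once Carbon has written it to disk.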
Integrating With Apache
Now that you know that Graphite and Carbon work, let’s make them both managed
services. By that, I mean that I don’t want to have to start any daemons
manually when I restart my system. Carbon and Graphite should just work. Also,
Graphite will perform much better once it is hosted on an Apache instance.
Configuration
First, let’s install the dependencies:
$ sudo apt-get install apache2 libapache2-mod-wsgi
We’re going to run our Graphite instance as a virtual host. The preferred way
of doing this on Debian-based Linux distributions (like Ubuntu) is to create a
vhost file and then enable it using the Debian Apache helpers. Lucky for us,
there’s an example vhost file called
~/src/graphite/examples/example-graphite-vhost.conf.
Execute the following commands:
$ cd ~/src/graphite/examples
$ cp example-graphite-vhost.conf graphite-vhost.conf
Now make the following changes:
- Comment out the WSGISocketPrefix line. This value will be
set in a different config file.
- Change the @DJANGO_ROOT@ value to
/usr/lib/pymodules/python2.6/django.
- If you don’t know what value to use with your ServerName
property, then just leave it as graphite.
Save your graphite-vhost.conf file and then deploy it using the
following commands:
$ sudo cp graphite-vhost.conf /etc/apache2/sites-available
$ sudo a2ensite graphite-vhost.conf
That last command creates a symlink to your graphite-vhost.conf file in
/etc/apache2/sites-enabled and then tells you if you need to restart Apache or
simply reload it. Now let’s take care of setting the WSGISocketPrefix value:
- Open the
/etc/apache2/mods-available/wsgi.conf file using your
favorite text editor.
- Uncomment the WSGISocketPrefix line and leave the default
value.
One last thing before we reload Apache. The /opt/graphite directory is still
owned by your id. You need to change everything so that it is owned by the
www-data user, which is the Apache user on Debian-based systems:
$ cd /opt
$ sudo chown -R www-data:www-data graphite
Now you can finally reload Apache using the following command:
$ sudo /etc/init.d/apache2 reload
Testing (And A Short ServerName Tutorial)
Now you should be able to visit your Graphite site using a URL that
looks something like this:
If you know how the ServerName property in an Apache virtual host file
works, then you will have no problem visiting the site, and you can jump to the
next section. The rest of this section is for everyone else :)
If you don’t know how this property works, then you may try to test the
Graphite site by visiting one of the following URL’s:
So why can’t you see your Graphite site? Apache cares about lots of things in
your request header, but the following 3 are especially important:
- The desired server IP address
- The desired port
- The ServerName value
It uses these three values to determine which vhost it will invoke for a
request. Your request has parts one and two, but part three is simply
graphite.ip.address. Your request will therefore be handled by the default
vhost in Ubuntu, which displays the “it works” page. So we need to find a way
to add the string graphite to our request header. The easiest way to do
this is actually to make the URL http://graphite point at our
Graphite server. Here’s how you can do that:
- Open up your hosts file
on your client running the web browser
- Add the word “graphite” as an alias for the machine hosting
Graphite
So, for example, let’s assume that you’re hosting Graphite on a machine that
has IP address of 10.0.0.100, and let’s assume that this machine already has an
alias of “web”. Here’s what your hosts file looks like before the change:
10.0.0.100 web
And here’s what it should look like after the change:
10.0.0.100 web graphite
Note: Remember, we’re making these hosts file changes on the client,
NOT the server.
Now, when you visit http://graphite, you should see the proper web site.
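If you need to make this change on more than one client, then the edit is easy to script. Here is a minimal sketch that prints a hosts file with the alias added. The function name is made up, and it deliberately writes to standard output instead of touching /etc/hosts, so you can look over the result before you copy it into place.

```shell
#!/bin/sh
# Hypothetical helper: print a hosts-format file with NAME added as an alias
# for IP. If no line for IP exists yet, a new one is appended at the end.
add_host_alias() {
    file=$1
    ip=$2
    name=$3
    awk -v ip="$ip" -v name="$name" '
        $1 == ip {
            found = 1
            present = 0
            for (i = 2; i <= NF; i++) if ($i == name) present = 1
            if (!present) $0 = $0 " " name
        }
        { print }
        END { if (!found) print ip " " name }
    ' "$file"
}

# Example: add_host_alias /etc/hosts 10.0.0.100 graphite
```

Running it a second time on its own output changes nothing, because the alias is only added when it is missing.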
Making Carbon A Managed Service
Now that the web app is running so well, let’s “fix” carbon so that
we don’t have to manually start it each time we reboot the server.
Carbon doesn’t come with an init script, so I’ve been using the
following crude version:
#! /bin/sh
# /etc/init.d/carbon
# Some things that run always
touch /var/lock/carbon
GRAPHITE_HOME=/opt/graphite
CARBON_USER=www-data
# Carry out specific functions when asked to by the system
case "$1" in
  start)
    echo "Starting script carbon"
    # Run cd and the daemon in the same su shell so that the cd takes effect
    su $CARBON_USER -c "cd $GRAPHITE_HOME && $GRAPHITE_HOME/bin/carbon-cache.py start"
    ;;
  stop)
    echo "Stopping script carbon"
    su $CARBON_USER -c "cd $GRAPHITE_HOME && $GRAPHITE_HOME/bin/carbon-cache.py stop"
    ;;
  *)
    echo "Usage: /etc/init.d/carbon {start|stop}"
    exit 1
    ;;
esac
exit 0
Save this file as /etc/init.d/carbon, and then update rc.d using this command:
$ sudo update-rc.d carbon defaults
That’s it! You can now manage your carbon process using this script, and it
will be automatically restarted when you reboot your machine.
Password-Protecting Your Graphite Site
Let’s take stock of where we are:
- You installed Graphite and Carbon
- You integrated Graphite with Apache
- You made Carbon a managed service
You now have everything necessary to run a “real” Graphite instance. If you
don’t need anything else, then feel free to skip the rest of this tutorial. For
my needs, however, I needed one more thing. I needed to host my Graphite site
on the world wide web, and I didn’t want just anyone poking around in my system
metrics. However, while Graphite may offer a Login link, it doesn’t give
you the option of setting up a login page that can block non-authenticated
access to the site.
Thankfully, there’s an easy way around this limitation. Apache gives you the
ability to block non-authenticated access to a web site using the built-in
security options. We’re going to manage security on our site using Basic
authentication.
To do this, you first need to change your graphite-vhost.conf file. Add the
following lines to the bottom of your file, just above the </VirtualHost>
line:
# Set up .htaccess security so that I can protect the site online.
<Location "/">
AuthType Basic
AuthName "Under Construction"
AuthUserFile /opt/graphite/sec/.mypasswds
AuthGroupFile /opt/graphite/sec/.mygroups
Require group managers
</Location>
Next, let’s create your AuthUserFile and your
AuthGroupFile:
$ cd /opt/graphite
$ sudo mkdir sec
$ sudo htpasswd -c ./sec/.mypasswds some_user_name
(enter a strong password)
$ echo 'managers: some_user_name' | sudo tee -a ./sec/.mygroups
$ sudo chown -R www-data:www-data ./sec
$ sudo chmod 700 ./sec
$ sudo chmod 600 ./sec/.mypasswds ./sec/.mygroups
$ sudo /etc/init.d/apache2 reload
That’s it! Now restart your browser, and you should see a dialog
box asking you to log in when you visit your Graphite site.
Note: This configuration is only good enough to keep out the
riff-raff. Basic authentication sends your user name and password
unencrypted, so if you have more robust security needs, then you will want to
look into using SSL.
Conclusion
I hope that some people find this tutorial to be helpful. If you
find any errors or you have any suggestions, then please feel free
to point them out in the comments.