As part of my conversion to Jekyll, I lost all of my old URLs. Some of my posts are at least somewhat popular, so I wanted the URLs to continue to work, even though they are probably all long dead in Google’s search index. In any event, I’m a firm believer that no URL should ever die, least of all my own. Please ignore any dead images you find on this site.
The problem is, it’s not a straightforward nginx rewrite to go from the old url to the new one. The new URL contains the category, which the old one didn’t. So, I decided to hardcore a generated list of WordPress urls to rewrite to Jekyll URLs by getting their equivalents from MySQL, where I still have a dump of the old WordPress database.
To get the WordPress URLs:SELECT DATE_FORMAT(post_date, '%Y'), DATE_FORMAT(post_date, '%m'), post_name FROM justin_posts WHERE post_type='post' AND post_status='publish' ORDER BY justin_posts.id INTO OUTFILE '/tmp/wp.urls' FIELDS TERMINATED BY '/' LINES STARTING BY 'index.php/' TERMINATED BY '/\n';
Getting the Jekyll URLs was slightly more difficult as I had to figure out where the category lived.
This query got me the taxonomy id along with the names of the catgories:SELECT justin_term_taxonomy.term_taxonomy_id, justin_terms.name FROM justin_term_taxonomy, justin_terms WHERE taxonomy='category' AND justin_term_taxonomy.term_id=justin_terms.term_id;
But I couldn’t figure out how the taxonomy terms related to the posts. This diagram was super helpful for that.
Time to start with the joins!SELECT terms.name, DATE_FORMAT(post_date, '%Y'), DATE_FORMAT(post_date, '%m'), DATE_FORMAT(post_date, '%d'), post_name FROM justin_posts AS posts JOIN ( justin_term_relationships AS rel, justin_term_taxonomy AS tax, justin_terms AS terms) ON (posts.id = rel.object_id AND rel.term_taxonomy_id = tax.term_taxonomy_id AND tax.term_id = terms.term_id) WHERE post_type='post' AND post_status='publish' AND tax.taxonomy = 'category' GROUP BY posts.post_name ORDER BY posts.id INTO OUTFILE '/tmp/jekyll.urls' FIELDS TERMINATED BY '/' LINES TERMINATED BY '.html\n';
Boom. Jekyll urls.
Now I have two files in the same order, one with the old WordPress urls and the other with the new Jekyll urls. All I need to do now is make a nginx rewrite config. A simple python script should get us started.
#!/usr/bin/env python with open('wp.urls', 'r') as fd: wp_urls = fd.readlines() with open('jekyll.urls', 'r') as fd: jekyll_urls = fd.readlines() for redirect in zip(wp_urls, jekyll_urls): print 'rewrite "/%s" "/%s" permanent;' % (redirect.strip(), redirect.strip())
The output from that command got copied to justin.harmonize.fm and included in the nginx config. Now old links redirect to new links!
NOTE: If you have multiple categories or no category for a post, you’re hosed for those URLs, so this is far from perfect.