Caffeinated Simpleton

Fourth: Regular Expressions in Clojure

Things are cruising right along now in creating my awesome Twitter portal in Clojure. So far we have gotten set up with Compojure, started using the Twitter API to grab data, and built some forms to make sure the data is relevant to the logged-in user. The next little chore is to find URLs in tweets and make them into actual, clickable links. I want to keep this simple for now, so we’ll just look for anything that starts with http:// or https:// and link it.

The Code

It turns out that the code to do this is really simple. Clojure just uses Java’s regular expression engine, but integrates it into the language a bit more cleanly than Java does. A big thanks to Fatvat for basically walking me through it.
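The listing itself is not reproduced here, so what follows is a minimal sketch rather than the original code. It assumes a function named urlize (the name used later in this post) that takes the tweet text. The exact pattern and the anchor markup are my assumptions.

```clojure
(defn urlize
  "Wraps every http:// or https:// URL found in text in an HTML anchor tag."
  [text]
  ;; #"..." is a compiled java.util.regex.Pattern, and re-matcher turns
  ;; it into a java.util.regex.Matcher over text.
  ;; In the replacement string, $0 stands for the whole match.
  (.replaceAll (re-matcher #"https?://\S+" text)
               "<a href=\"$0\">$0</a>"))

(urlize "see http://example.com now")
;; => "see <a href=\"http://example.com\">http://example.com</a> now"
```

The pattern is deliberately naive: it treats everything up to the next whitespace as part of the URL, so trailing punctuation would end up inside the link.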

Nothing too complicated here, but there is an interesting new concept. For the first time in this series, Clojure doesn’t do everything we want, so we talk to Java directly. This is one of the most powerful attributes of Clojure. Even though it’s a young language, it’s built on a mature platform that does basically everything you need. In this case, we wanted a version of the “text” string with the URLs replaced. Reaching for a Java method may not feel entirely kosher in a functional language, but nothing is actually mutated: Java strings are immutable, so the replacement comes back as a new string. I didn’t want to slice and dice the text by hand when there was a perfectly usable Java method that would do the replacement for me.

Anyway, how does this work? “.replaceAll” is a method of java.util.regex.Matcher. What we’re trying to express is the Java call chain Pattern.compile(regex).matcher(text).replaceAll(replacement).

In Clojure, re-matcher returns a java.util.regex.Matcher built by applying a Pattern instance to a string (“text”). So, we’re calling the .replaceAll method on the object returned by re-matcher, which is a Matcher instance created out of a Pattern (written as a regex literal with the “#” reader macro). This is exactly what we want, expressed in a nice, functional style. In a method call, the instance that we’re operating on comes first, and any additional arguments to the method follow it. In this case we pass the replacement string.
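To make those pieces concrete, here is a small step-by-step sketch of the same call, the kind of thing you might type at a REPL. The names pattern and m, the sample text, and the “[link]” replacement are all just for illustration.

```clojure
;; The #"..." literal is read as a compiled java.util.regex.Pattern.
(def pattern #"https?://\S+")
(class pattern)
;; => java.util.regex.Pattern

;; re-matcher applies the Pattern to a string and returns a Matcher.
(def m (re-matcher pattern "go to http://example.com"))
(class m)
;; => java.util.regex.Matcher

;; (.method instance args...) calls a Java method: the instance comes
;; first, then the remaining arguments.
(.replaceAll m "[link]")
;; => "go to [link]"

;; The longer (. instance method args...) form means the same thing.
(. m replaceAll "[link]")
;; => "go to [link]"
```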

Another thing you might notice is the string in the urlize function definition. Clojure has extensive support for metadata, which is something that I’ve largely ignored. In its simplest form, you can pass a string to defn right after the function name, as I have done, and it will be stored as the function’s docstring. The language also includes introspection features to pull these things out, but I have yet to investigate them in depth.
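As a rough illustration of that introspection, the sketch below defines a throwaway function (greet is a made-up name, not part of the portal code) and reads its docstring back out of the var’s metadata.

```clojure
(defn greet
  "Returns a greeting for who."
  [who]
  (str "Hello, " who))

;; defn stores the docstring under :doc in the metadata of the var.
(:doc (meta #'greet))
;; => "Returns a greeting for who."

;; The argument list is kept in the same metadata map.
(:arglists (meta #'greet))
;; => ([who])
```

At a REPL, (doc greet) from clojure.repl prints the same information in a friendlier format.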

Again, pretty straightforward, and now we’re starting to do some real damage. I think I’m going to dive into JavaScript and CSS for a while, but I’ll be back soon with static storage. It should be fun! As always, all the code is on GitHub.
