This issue begins a series on the Clojure language by Michael Bevilacqua-Linn.
Welcome to the first in a series of Pragmatic articles on Clojure, a dynamically typed, practical programming language that targets the JVM and other modern runtimes. Clojure is a language in the Lisp tradition, and in this article we’ll examine one of the things that makes Clojure, and other Lisps, special.
Lisp is the second oldest high-level programming language. It was originally created in 1958 by John McCarthy, and has gone through over 50 years of evolution. One of the most recent branches of this evolution is Clojure, a fairly new language targeted at working programmers.
Newcomers to Lisp, Clojure newbies included, are often put off by what seems like a strange syntax. The parentheses are in different places! Oh my!
Given that this syntax is an obvious barrier to widespread adoption, why would anyone decide to create a new Lisp in this day and age?
It turns out that the choice of syntax isn’t arbitrary. It enables one of the most powerful metaprogramming systems yet created, powerful enough that the majority of the language is implemented using it. Put another way, a Clojure developer has the power of a compiler writer at their fingertips.
In this article, we’ll introduce this system and see how it’s related to Clojure’s interactive programming environment, the Read Eval Print Loop or REPL.
You know what a REPL is: A user of the REPL types in some Clojure code. The REPL then reads it in, turning it from a string into another data structure. That data structure is then evaluated to produce a value, which is printed. Finally, the REPL will loop back to the beginning, waiting for new input.
The REPL is a good place to get a feel for Clojure. Let’s start off in the classic style by running hello, world in the REPL.
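The listing for this step is not reproduced here, so what follows is a minimal reconstruction. Output is shown in comments rather than with a REPL prompt.

```clojure
;; Call println with a single string argument.
(println "hello, world")
;; prints: hello, world
;; => nil
```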
If we’d like to add two numbers together, the syntax looks the same. Here, we add 21 and 21.
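A minimal sketch of that addition, with the function first and its arguments after it, all inside one pair of parentheses:

```clojure
;; + comes first, followed by the two numbers to add.
(+ 21 21)
;; => 42
```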
Even creating a function follows the same syntax. Here, we create a say-hello which just prints "hello, pragmatic programmers".
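A sketch of what that definition might look like. The trailing call to say-hello is an addition for illustration, and the namespace in the printed var (matters, in this article) depends on the namespace the REPL happens to be in.

```clojure
;; Define a zero-argument function named say-hello.
(defn say-hello []
  (println "hello, pragmatic programmers"))
;; => #'matters/say-hello   (namespace part varies with the current namespace)

;; Calling it prints the greeting and returns nil.
(say-hello)
;; prints: hello, pragmatic programmers
```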
There’s one interesting difference between these examples. In the first example, we saw "hello, world" printed, followed by nil. In the other two, there was no nil; we only saw the results 42 and #'matters/say-hello, respectively.
The Eval in REPL takes our code and executes it. Evaluating a bit of code will always produce a value. A call to println has no interesting value, since it’s executed only to print something, so nil is returned. Our other two examples do have interesting values: the sum of two numbers and the var that holds the function we just defined.
Let’s dig into the notion of evaluation in a bit more detail. We’ll build up a simple model of how it works. Most things in Clojure evaluate to themselves. For instance, here we evaluate the integer 1 and string "foo" in the REPL.
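As a quick sketch, typing these two literals at the REPL hands each one straight back:

```clojure
;; An integer literal evaluates to itself.
1
;; => 1

;; So does a string literal.
"foo"
;; => "foo"
```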
Some things don’t evaluate to themselves, like the calls to println and + we saw earlier. With those, the arguments were evaluated first and then passed to the println or + function.
This is a bit more clear if we nest some calls, as we do below. First (* 10 2) is evaluated to get 20, then (+ 22 20) is evaluated to get the final value of 42.
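A reconstruction of the nested call described above:

```clojure
;; The inner form (* 10 2) is evaluated first, giving 20,
;; and then (+ 22 20) is evaluated.
(+ 22 (* 10 2))
;; => 42
```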
We can nest these calls arbitrarily deep. The following snippet adds one more layer.
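The exact expression from the original listing is not available. One possible example, with a division chosen here purely for illustration, is:

```clojure
;; Evaluation works from the inside out:
;; (/ 4 2) => 2, then (* 10 2) => 20, then (+ 22 20) => 42.
(+ 22 (* 10 (/ 4 2)))
;; => 42
```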
Occasionally, it may be handy to turn off evaluation. We can do so by prepending our snippet of code with a single quote, as we demonstrate here.
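A small sketch of quoting, reusing the nested addition from earlier as an assumed example. The quoted form comes back as a list instead of being evaluated to 42.

```clojure
;; The leading single quote suppresses evaluation.
'(+ 22 (* 10 2))
;; => (+ 22 (* 10 2))

;; The single quote is shorthand for the quote special form.
(quote (+ 22 (* 10 2)))
;; => (+ 22 (* 10 2))
```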
Now that we’ve got a better idea of what evaluation is, let’s take a closer look at what’s getting evaluated. When we type something into the REPL, we’re typing in a series of characters, a string. This isn’t what ultimately gets evaluated by Clojure. Instead, these characters are first passed into the R in REPL, the reader.
The reader takes a series of characters and turns them into some other data structure. To understand this a bit better, let’s take a quick detour into a couple of Clojure’s built-in data structures: vectors and keywords.
Keywords are used much as we would use a symbol in Ruby or an enum in Java, and are prefixed with a colon.
Vectors give us fast positional access to their elements. They can be created by placing the elements of the vector inside square brackets. We create a vector and name it some-keywords in the following snippet.
We can use first to get the first element of a vector.
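The steps above can be sketched as follows. The name some-keywords comes from the text, while the particular keywords :foo, :bar, and :baz are assumptions made for illustration.

```clojure
;; A keyword evaluates to itself.
:foo
;; => :foo

;; A vector literal of three keywords, bound to the name some-keywords.
(def some-keywords [:foo :bar :baz])

;; first returns the element at the front of the vector.
(first some-keywords)
;; => :foo
```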
In the preceding example, the actions of the reader take place behind the scenes, as part of the REPL. Let’s make things a bit more explicit by using read-string. This takes a string directly and reads it. Here, we’re using it to read in a new vector and name it some-more-keywords.
We can treat it just as we did our original vector.
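A sketch of reading a vector explicitly. The name some-more-keywords is from the text, and the contents of the string are an assumption.

```clojure
;; read-string turns a string of characters into a data structure,
;; here a vector of keywords.
(def some-more-keywords (read-string "[:foo :bar :baz]"))

some-more-keywords
;; => [:foo :bar :baz]

;; The result behaves like a vector written directly in code.
(first some-more-keywords)
;; => :foo
```

One caution if you try this beyond the REPL: clojure.core/read-string is not meant for untrusted input, and clojure.edn/read-string is the usual choice for reading plain data from outside sources.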
So far, the reader might remind you of something like JSON or YAML. It takes a string and turns it into some more complicated, probably nested, data structure. That’s not far off, but something about it might strike you as odd. Here I am claiming that the Read in REPL reads in data that we can manipulate in our code, much like a JSON or YAML parser would.
But aren’t we typing code into the REPL? How does that work?
To find out, let’s take a look at another Clojure data structure, the list. In Clojure, as in other Lisps, a list is a singly linked list. One way to create a list is to use list, as we do in the following code snippet.
Another way is to simply enclose the list elements in parentheses. This time we do that using read-string, just as we did with our earlier vector.
These two lists are equivalent.
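Both ways of building a list can be sketched side by side. The names a-list and another-list, and the keyword contents, are hypothetical and chosen only for this illustration.

```clojure
;; Build a list with the list function.
(def a-list (list :foo :bar :baz))

;; Build a list by reading parenthesized text.
(def another-list (read-string "(:foo :bar :baz)"))

;; Both are lists with the same elements in the same order.
(= a-list another-list)
;; => true
```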
Let’s take a look at another list. Here, we create a list with three elements, the symbol + and the integers 21 and 21.
And here, we use the first function to get the first element.
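A reconstruction of these two steps. The name funky-looking-list is used later in the text; building the list with read-string is an assumption about how the original listing did it.

```clojure
;; Read a three-element list: the symbol + and the integers 21 and 21.
(def funky-looking-list (read-string "(+ 21 21)"))

funky-looking-list
;; => (+ 21 21)

;; The first element is the symbol +, not the addition function itself.
(first funky-looking-list)
;; => +
```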
Our first two list examples just contain keywords; our final one obviously contains code! Clojure code is just Clojure data, a property known as homoiconicity. The evaluation rule that we hinted at earlier for function calls is actually the evaluation rule for lists. We can see this by evaluating funky-looking-list manually, as we do in the following snippet.
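A minimal sketch of evaluating the list by hand with eval. The definition of funky-looking-list is repeated so the snippet stands alone.

```clojure
;; The same three-element list as before: the symbol + and two integers.
(def funky-looking-list (read-string "(+ 21 21)"))

;; eval applies the list evaluation rule: evaluate the arguments,
;; then call the function named by the first element.
(eval funky-looking-list)
;; => 42
```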
Because Clojure code is just Clojure data, we can manipulate it just as we would any other data. This gives us, the humble application- or framework-programmer, an incredible amount of power.
To see how, we’ll need to understand Clojure’s macro system. A macro is a special kind of function. It’s intended to take a piece of data which represents code, also known as a form. A macro transforms one form into another before Clojure’s compiler compiles it. Finally, the evaluation rule for a macro is special in that a macro does not evaluate its arguments.
Let’s take a look at a simple macro. This macro takes two arguments, a name and a string to print. It then creates a function that prints the passed-in string.
Here we’ll use it to create a function named foo.
If we’d like to see what this macro expands out to, we can use macroexpand-1 on a call to it, as we do in the following code.
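The macro, its use, and its expansion can be sketched together. The name make-printer appears in the text. The parameter names and the string "this is a foo" are assumptions, and the body is one plausible way to build the form with list and quote.

```clojure
;; A macro receives its arguments unevaluated and returns a new form.
;; Here it builds the list (defn <name> [] (println <to-print>)).
(defmacro make-printer [name to-print]
  (list 'defn name [] (list 'println to-print)))

;; Use the macro to define a function named foo.
(make-printer foo "this is a foo")

(foo)
;; prints: this is a foo

;; macroexpand-1 shows the form the macro produces, without evaluating it.
(macroexpand-1 '(make-printer foo "this is a foo"))
;; => (defn foo [] (println "this is a foo"))
```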
In make-printer, we built the list that makes up our function definition using list and ' (quote). Clojure has a feature that makes this easier: syntax quote, represented by a single backtick.
Syntax quote is much like regular quote. The main difference is that it allows us to turn evaluation back on inside of it using unquote, represented by a tilde. In addition, syntax quote will fully qualify any symbols it comes across, which helps avoid a common pitfall in macro writing known as unintentional name capture.
Here, we’ve got a simple use of syntax quote. As we can see, it evaluates the inner forms (+ 1 2) and (+ 3 4) as we’ve applied unquote to them, but leaves the outer form unevaluated.
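A reconstruction of that example. The outer form here is assumed to be a call to +, which also shows syntax quote fully qualifying the symbol.

```clojure
;; The backtick quotes the outer form; each tilde turns evaluation
;; back on for the form that follows it.
`(+ ~(+ 1 2) ~(+ 3 4))
;; => (clojure.core/+ 3 7)
```

The outer + comes back as clojure.core/+ because syntax quote resolves symbols to their namespace, while the two unquoted forms are replaced by their values.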
Syntax quote is useful because it allows us to write macros that look like templates for the code that they’ll generate. For instance, here’s our make-printer rewritten to use syntax quote.
And here’s what it expands out to.
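A sketch of the rewritten macro and its expansion, under the same assumptions as before about parameter names and the printed string:

```clojure
;; The body now reads like a template for the generated code.
;; ~name and ~to-print splice in the macro's arguments.
(defmacro make-printer [name to-print]
  `(defn ~name [] (println ~to-print)))

(macroexpand-1 '(make-printer foo "this is a foo"))
;; => (clojure.core/defn foo [] (clojure.core/println "this is a foo"))
```

Compared with the version built from list and quote, defn and println appear fully qualified in the expansion, which is the symbol resolution that syntax quote performs.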
Much of Clojure’s core functionality is built using macros. For instance defn expands to def and fn, as we show below.
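One way to see this is to expand a defn form ourselves. The function being defined is the say-hello from earlier, and the output shown is the general shape of the expansion, which could differ in detail between Clojure versions.

```clojure
;; Quote the defn form so it is passed to macroexpand-1 as data.
(macroexpand-1
  '(defn say-hello [] (println "hello, pragmatic programmers")))
;; => (def say-hello (clojure.core/fn ([] (println "hello, pragmatic programmers"))))
```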
In summary: Clojure code is just Clojure data. We can use the macro system and syntax quote to write code templates that look like the code they generate. This makes metaprogramming, an inherently difficult activity, about as easy as it’ll ever get. In fact, the metaprogramming so enabled is powerful enough that much of Clojure’s functionality is implemented using it.
Next month, we’ll examine another thing that makes Clojure special. Clojure has a unique, intuitive view on state and identity that makes it ideal for concurrent programming. Thanks for reading. I’m looking forward to next month!
Michael Bevilacqua-Linn has been programming computers ever since he dragged an Apple IIGS that his parents got for opening a bank account into his fifth grade class to explain loops and variables to a bunch of pre-teenagers. He currently works for Comcast, where he builds distributed systems that power infrastructure for their next generation services, and wrote Functional Programming Patterns for the Pragmatic Bookshelf. He tweets occasionally at @NovusTiro.