small medium large xlarge

Grokking Pattern Matching and List Comprehensions

Two Language Features that Rock

by Bruce Tate

Generic image illustrating the article
  Like a painter, a programmer learns gestures and techniques that make the canvas more beautiful and the work more confident.  

My mother was an artist, and a great one. She had this incredible ability to capture the essence of an idea with a few brush gestures on a canvas. She knew that she could suggest the presence of people in a crowd or flowers in a field with splatters from a toothbrush, or tall grass by scraping with a credit card. She learned these techniques from years of study of the masters, and lots of trial and error.

As programmers, we are the same. We, too, come to understand abstractions and techniques that we can use to solve similar problems. Any developer worth his salt can conjure up a set of language features, idioms, libraries, and even languages that work best to solve the problems they find day to day in the workplace. If you want to become great, though, you need to find your own particular set of gestures and techniques that will make your canvas more beautiful, and make you more efficient.

In this article, I’m going to show you two language features that I found as I was writing Seven Languages in Seven Weeks. You may find that these features are new to you, too. Whether or not you do, I hope you, as I did, find joy in the journey.

Pattern Matching

C, Pascal, Java, and C# developers will all be familiar with the case or switch statement. This feature allows a user to direct control to a code block based on the contents of a variable or expression. For example, in Java, the statement looks something like this:

 switch (message) {
 case conditionOne: codeBlockOne;
 case conditionTwo: codeBlockTwo;
 case conditionThree: codeBlockThree;
 case conditionN: codeBlockN;
 default: code_block_default;

So to print the day of the week corresponding to a variable day having an integer value between 1 and 7, you might construct a statement that looks like this:

 switch (day) {
  case 1:
  case 2:
  case 3:
  case 4:
  # ... and so on

The case statement allows more complex decisions within methods, allowing your idea to be expressed more densely than you would in, say, an if-then-else statement. Some languages allow these kinds of decisions to be made before a method or function is even invoked at all. Most functional programming languages lean heavily on recursion, so the functions must also behave differently based on the contents of the arguments. (A functional programming language is composed entirely of mathematical functions. The strictest functional languages allow no side effects, so recursion is how you do iteration.)

Let’s take a quick dive into Haskell, a strict functional language. Let’s say we wanted to take the size of a list. We first need to know a little bit about list processing in Haskell. Here are a few statements in a Haskell interpreter:

 *Main> head [1, 2, 3]
 *Main> tail [1, 2, 3]

So [1, 2, 3] is a list, the head of the list is the first element, and the tail of a list is the list of all of the elements except the head. Building a function to recursively take the size of a list is pretty easy, then. The size of an empty list is zero, and the size of any other list is one plus the size of its tail. So let’s build a function to compute the size of a list:

 module Main where
  size list = if list == []
  then 0
  else 1 + size (tail list)

That’s our program. Ignore the first line. The functional definition is in the next three.

The first three words, size list = , mean we’re defining a function called size with one argument called list. The if-then-else statement is pretty simple. If list is an empty list ([]), we return a size of zero. Otherwise, we return 1 plus the size of the tail of list.

Running the program gives you the results you’d expect:

 Bruce-Tates-MacBook-Pro:~ batate$ ghci lists.hs
 GHCi, version 6.12.1: :? for help
 Loading package ghc-prim ... linking ... done.
 Loading package integer-gmp ... linking ... done.
 Loading package base ... linking ... done.
 Loading package ffi-1.0 ... linking ... done.
 [1 of 1] Compiling Main ( lists.hs, interpreted )
 Ok, modules loaded: Main.
 *Main> size []
 *Main> size [1]
 *Main> size [1, 2]

While the program works, it is wholly unsatisfying. You have to read too deeply into the function to understand what’s happening. Fortunately, Haskell’s pattern matching lets us conditionally invoke a version of a function based solely on the function’s parameters. Here’s an alternative implementation of size that is a little easier on the eyes:

 module Main where
  size [] = 0
  size (h:t) = 1 + size t

That’s more like it. Once again, ignore the first line. The next two lines define the size function. size [] = 0 means that we’re defining a function called size. If the first parameter matches the empty set, then invoke this function, returning zero. Said another way, the size of an empty list is zero.

The second line also defines the function size. It matches the parameter (h:t), which takes a nonempty list and breaks it into a head and tail. In English, size (h:t) = 1 + size t means the size of a nonempty list with head h and tail t is one plus the size of the tail. Pattern matching accomplished a few things nicely:

  1. It simplified our functions, with each conditional execution expressed clearly. Breaking one complicated element into two simpler ones is one of the cornerstones of good software design.

  2. It allowed us to deconstruct parameters into more basic terms. In this case, we mapped a list onto elements of head and tail.

  3. It allowed us to put important complexities front and center. In this case, instead of burying an important decision in an if-then-else function, we pushed that decision up to the forefront in the form of a pattern match.

This pattern matching abstraction is becoming increasingly important, in languages such as Haskell, Scala, and Erlang. Though not all languages support pattern matching, you can still make a conscious effort to accomplish the three goals above. I will often simulate pattern matching in my Ruby programs by using return statements and one-line ifs for error conditions, like this:

 def size(list)
  raise "error" if some_precondition_is_not_met
  return( 0 ) if list.empty?

Though my language of choice does not support pattern matching, I have another idiom to apply to my canvas when I encounter methods that have grossly different behaviors based on the function parameters. Our next technique will build on the pattern matching motif.

The List Comprehension

As I wrote the Erlang chapter of Seven Languages, I wanted to see how language constructs in my favorite languages might be implemented. Many languages have a function called map, which builds a list from each of the return values of a function called on all of the elements of a list. For example, in Ruby, you could double all of the members of a list like this:

 [1, 2, 3].map {|x| x * 2}

Ruby returns [2, 4, 6].

That code takes an array, ([1, 2, 3]), and calls the map function on it. The map function takes another function ({|x| x * 2}) as an argument. The function is anonymous, meaning it has no name. The function takes a single argument, x, and returns x * 2. That’s a closure.

I could write the same expression in Erlang like this:

 3> lists:map(fun(X) -> X * 2 end, [1, 2, 3]).
 Erlang returns <ic>[2, 4, 6]</ic>.

In this case, I’m calling a function called lists:map. The first argument is a function (fun(X) -> X * 2 end), and the second is a list ([1, 2, 3]).

Map is an extremely useful tool, and one of the interesting mental exercises when working in a new language is answering the question “How, exactly, would you implement map?”

My first attempt looked like this:

 map(F, [H|T]) -> [F(H) | map(F, T)];
 map(F, []) -> [].

[H|T] means that the list is broken into the head element H the tail of the list T.

Let’s break the rest of this program down. This definition of map is recursive. The clause map(F, []) -> []. says that the map of a function F and some empty list is an empty list. This clause gives you exactly the behavior you’d expect for any function and any empty list. The second clause, map(F, [H|T]) -> [F(H) | map(F, T)];, says that the map of F over a list with a head H and tail T is F(H) plus map(F, T).

That’s a short two-line implementation of map. In the text, I asked the rhetorical question “What would you say if I told you I could write map in a short two-line program?”

The initial technical review of the book came back, and the creator of Erlang answered the question. Joe’s answer was, “I’d say Bruce Tate writes very verbose programs.” Then, he introduced me to a concept that has become one of my favorites. A list comprehension is a mathematical expression of a list. For example, if you told a mathematician to describe the function that we’ve been building with map in both Ruby and Erlang, he’d show you something like this:

  • L = { 1, 2, 3 }

  • S = { 2 · X | X is in L }

Said another way, L is the set of numbers { 1, 2, 3 }, and S is the set of elements built by the function 2 · X where X is taken from the set L. If you wanted only the numbers between, say, 1 and 3, the list comprehension can accommodate that idea, too:

  • L = { 1, 2, 3, 4, 5 }

  • S = { 2 · X | X is in L, 1 < X < 5 }

In this case, L would be { 1, 2, 3, 4, 5 }, X would come from the numbers { 2, 3, 4 }, and S would be { 4, 6, 8 }.

The list comprehension has three elements:

  • a function where the variable, or variables, represent the members of one or more input lists

  • one or more input lists

  • one or more optional predicates that filter the incoming lists

List comprehensions in programming languages are fantastic abstractions that allow you to define and transform all kinds of functions concisely and simply. In Erlang, here are a couple comprehensions:

 L = [ X * 2 || X <- [1, 2, 3] ].

In English, for X taken from the list [1, 2, 3], build a list of X * 2.

 L = [ X * 2 || X <- List, X > 1, X < 5].

For each X taken from List that is greater than 1 and less than 5, build a list of X * 2.

You can manipulate just about any list in this way. For a shopping cart tuple with (item, price, quantity), you could add a tax item to the tuple. Or, you could transform the points of a polygon to reflect them vertically or horizontally. By pulling numbers from a list, you can compute all combinations effortlessly, and reduce those combinations with filters.

You’re not limited to dealing with expressions. You can deal with functions, too. After all, what is a map but a list comprehension? You can define map like this:

 map(F, L) -> [ F(X) || X <- L ].

That’s it. map(F, L) means “For every X taken from list L, build a list of F(X).”

List comprehensions show up in many different languages, including Erlang, Python, Scala, Clojure, and Haskell. But here’s the point. After seeing list comprehensions several times in Seven Languages, my Ruby programming has changed. Now I find myself often chaining selects and maps together to give me a similar effect. For example, to double all of the even elements in a list, I might do something like this:

 [1, 2, 3, 4, 5].select(&:even?).map {|x| x * 2}

It’s not a full list comprehension, but it’s close. We capture the spirit of the language feature, and we have another gesture for plopping just the right picture onto our canvas.

Wrapping Up

Part of the struggle of writing Seven Languages was my ignorance. I had no idea about the breadth and depth of languages as vastly different as Clojure and Io. I gladly confess that I made some pretty spectacular messes along the way, especially in Prolog and Haskell. But there’s a certain catharsis that comes with failing so spectacularly. I can often understand that my own programming paradigms are inadequate for some of the challenges I might face.

If you, too, like the idea of exploring language features like these, let me know. If there’s sufficient interest, I’d like to see your suggestions for new features that we might cover in future articles.

My book Seven Languages in Seven Weeks formed the foundation for this article. Joe Armstrong’s Programming Erlang has some excellent coverage on list comprehensions. And the Haskell Wiki has some excellent discussion of Haskell’s type system and pattern matching, including a Cookbook for Pattern Matching.

Bruce Tate is a kayaker, mountain biker, and father of two from Austin, Texas. An international speaker and prolific author, Bruce’s primary focus through the years has remained steady: rapid development of web applications with small, effective teams. He is the chief architect behind several commercial websites including,, and most recently, His books through the years have included the Jolt-winning Better, Faster, Lighter Java and the controversial Beyond Java. He is also the author of From Java to Ruby and Seven Languages in Seven Weeks.

Send the author your feedback or discuss the article in the magazine forum.