Pretty image
Elixir is a modern, functional programming language designed for high availability and concurrency. It has Ruby-like syntax married to the power and reliability of the Erlang VM. If you wanted to get into functional programming but were put off by the academic feel, now’s the time to jump in.

Last month, we looked at the basics of pattern matching, and saw how it is universal in Elixir—it is the only way to bind a value to a variable or parameter. But pattern matching really shines when we apply it to functions.

Anonymous Functions

Elixir has a nice set of built-in modules. One of these, Enum, lets you work on enumerable collections. One of its most commonly used functions is map, which applies a transformation function to a collection, producing a new collection. Let’s fire up the Elixir interactive shell, iex, and try it.

 iex> Enum.map [2,4,6], fn val -> val * val end
 [4,16,36]

The first argument we pass to map is the collection; in this case a list of three integers. The second argument is an anonymous function.

Anonymous functions (I’m going to call them fns from now on) are defined between the keywords fn and end. A right-facing arrow, -> separates a list of zero or more parameters on the left from the body of the function on the right.

We passed map the function fn val -> val*val end. This fn takes a single parameter, val, and the body of the function multiplies that value by itself, implicitly returning the result.

A fn is just another Elixir value, so we also have written this code as:

 iex> square = fn val -> val * val end
 #Function<erl_eval.6.17052888>
 iex> Enum.map [2,4,6], square
 [4,16,36]

You can call fns using something like a regular function call:

 iex> square.(5)
 25

The period and the parentheses are required.

Boy, That’s Way Too Much Typing!

No, it’s not, and we don’t reward whining around these parts anyway.

But, that said, Elixir does have a shortcut. (What follows is for Elixir 0.9. The syntax is likely to change in the near future.) Let’s jump straight to the code.

 iex> Enum.map [2,4,6], &1*&1
 [4,16,36]

When Elixir sees &[1-9], it knows that it needs to generate an anonymous function. The function will have from 1 to n parameters, where n is the highest numbered &n value. The body of the function is basically the expression containing the ampersands, with each ampersand value being mapped to the corresponding parameter. So, &1*&1 is logically the same as fn p1 -> p1*p1 end, and rem(&1,&2) becomes fn p1,p2 -> rem(p1,p2) end.

Because fns are just values, you can even write the code as:

 iex> square = &1 * &1
 #Function<erl_eval.6.17052888>
 iex> Enum.map [2,4,6], square
 [4,16,36]

This is a fantastically useful shortcut, but there is a gotcha. When deciding what code to make the body of the fn, Elixir starts at the ampersand term and looks up the parse tree for the immediately enclosing expression. With &1*&1, it’s the multiplication operator. With rem(&1,&2), it’s the function call to rem.

But things go a little haywire if you write &1+&2+&3. That’s because, internally, this is represented as (&1+&2)+&3. Elixir translates the expression containing &1 and &2, so now we have (fn p1,p2 -> p1+p2)+&3. Elixir then complains that &3 can’t exist without a preceding &1 and &2. (And this is why the syntax is likely to be overhauled).

Anyway, the moral is: use the ampersand syntax for simple expressions, and regular fn…end for the rest.

Named Functions

Anonymous functions tend be be used for callback work—they are typically short and localized in their use. Named functions, however, are where the real work gets done.

Named functions can only exist inside Elixir modules. Here’s an example:

 defmodule AsciiDigit do
  def valid?(character) do
  character in ?0..?9
  end
 end
 
 IO.inspect AsciiDigit.valid? ?4 # => true
 IO.inspect AsciiDigit.valid? ?a # => false

To follow this code, you first have to know that the syntax ?x returns the integer character code for x (so ?0 is 48).

Our example defines a module called AsciiDigit containing a single function, valid?. This takes a character code and returns true if it is a digit from 0 to 9. We use the range operator .. to define the first and last valid character, and the in operator to check for inclusion.

Elixir supports pattern matching when determining which function to run. You can use def multiple times for the same function, each with a different pattern of parameters. Elixir will dynamically choose the one where the parameters match the arguments passed.

An oldie-but-goodie shows this nicely. The specification for the Fibonacci sequence (Brian Hogan groans) is

 fib(0) = 1
 fib(1) = 1
 fib(n) = fib(n-2) + fib(n-1)

This isn’t code—it’s a mathematical definition. However, we can turn it into code with almost no effort:

 defmodule Fibonacci do
 
  def fib(0), do: 1
  def fib(1), do: 1
  def fib(n), do: fib(n-2)+fib(n-1)
 
 end
 
 Enum.map 0..10, Fibonacci.fib(&1) #=> [1,1,2,3,5,8,13,21,34,55,89]

(I know, I know—this is not an efficient way of calculating the series. I want to show Elixir, and not worry about faster algorithms.)

Despite appearances, there’s just one definition of the fib function in there. It just has three heads—three patterns of arguments that select different bodies.

The first two heads select the cases where the argument is 0 or 1. They use the abbreviated form of the body, do: expr to return 1. The third form is the recursive step. If neither of the first two match, the third one executes.

What happens if we pass our function a negative argument. Right now, it will loop until we run out of stack or patience—subtracting 1 or 2 from a negative number will never reach 0. Fortunately, Elixir has guard clauses, which allow us to put additional constraints on pattern matching.

 defmodule Fibonacci do
 
  def fib(0), do: 1
  def fib(1), do: 1
  def fib(n) when n > 1, do: fib(n-2)+fib(n-1)
 
 end
 
 Fibonacci.fib(10) #=> 89
 Fibonacci.fib(-10)
 # => ** (FunctionClauseError) no function clause matching in Fibonacci.fib/1

Now, when we call fib with a negative number, Elixir can’t find a function clause that matches, and so it raises an exception. If you really wanted to, you could handle this case in code, giving a more application-specific error:

 defmodule Fibonacci do
 
  def fib(0), do: 1
  def fib(1), do: 1
  def fib(n) when is_integer(n) and n > 1, do: fib(n-2)+fib(n-1)
  def fib(x), do: raise "Can't find fib(#{x})"
 
 end
 
 Fibonacci.fib(10) #=> 89
 Fibonacci.fib(-10) #=> ** (RuntimeError) Can't find fib(-10)
 Fibonacci.fib("cat") #=> ** (RuntimeError) Can't find fib(cat)

We extended our guard clause to check that the parameter is an integer, and then added a 4th function head that accepts any parameter and reports an appropriate error.

Verifing Credit Card Numbers

Most of the long number strings we deal with every day (credit card numbers, IMEI numbers in your phone, and so on) have a check digit. This is normally the final digit of the number, and it is calculated using some algorithm that combines all the previous digits. So, when you enter your credit card number, the web page can recalculate the check digit, and verify that it is the same as the last digit in number you gave. It isn’t a check against fraud; it’s simply a quick way of picking up typos.

Probably the most widely used technique is the Luhn Algorithm. It reverses the digits in the number, and splits them into two sets: digits at odd positions in the string, and digits at even positions. It sums the odd digits. For the even digits, if multiplies each by two. If the result is 10 or more, it subtracts 9. It then sums all the results. Adding the sum of odd and even positions will yield a result that’s divisible by 10 for valid numbers.

When I first started with Elixir, my head was still full of conventional ways of doing things. As a result, I’d write something like the following:

 defmodule CheckDigit do
 
  import Enum
 
  def valid?(numbers) do
  numbers = reverse(numbers)
  numbers = map(numbers, fn char -> char - ?0 end)
  numbers = map(numbers, fn digit, index -> {digit, index} end)
  { odds, evens } =
  partition(numbers, fn {_digit, index} -> rem(index, 2) == 0 end)
  sum_odd = reduce odds, 0, fn {number, _index}, sum -> sum + number end
  sum_even = reduce evens, 0, fn {number, _index}, sum ->
  result = number * 2
  if result >= 10 do
  result = result - 9
  end
  result + sum
  end
  rem(sum_odd + sum_even, 10) == 0
  end
 
 end

Ugh! Let’s step through it (hopefully you’re wearing boots).

The Enum module has lots of functions for dealing with collections. We’ll be using many of them in this code, so we import the module. This means we can write map instead of Enum.map.

Our valid? function is passed a list of UTF-8 digits. By coincidence, that’s exactly what the single quoted string literal generates.

Using the description of the Luhn algorithm, we reverse the digits, and then convert the UTF representation to the actual integer value (so ?1, which is 41, gets mapped to 1). At this point, given the argument '123', we’d have a list of integers [3, 2, 1].

Now it gets messy. We need to partition the digits into those on an even position and those at an odd position. To prepare to do that, we use map, passing it the function fn number, index -> {number, index} end. This function takes the actual digit value, along with its index in the list, and maps it to a tuple containing each.

At this point, alarm bells should be ringing. This is just too damn hard. But we plow on, because that’s what programmers do.

The partition function takes a collection and a function. It returns a tuple where the first element is a list of values for which the function returned true, and the second element is the rest of the values.

Now we have to sum the odd values. Whenever you need to reduce a collection to a single value, you’ll probably want to use the reduce function. It takes the collection, an initial value, and a function. This function receives each element of the collection in turn, along with the current value. Whatever the function returns becomes the next current value. So, summing a list of numbers can be done with

 Enum.reduce list, fn val, sum => val + sum end
 # or
 Enum.reduce list, &1 + &2

But what we have is a list of {value, index} tuples. This means we need to use pattern matching on the first parameter of the function to extract just the value. (The underscore in front of _index means we’re ignoring this field.

Summing the even numbers is similar, but we have to do the doubling, and the conversion of numbers ten or above.

At the end of all this, we can test this in iex. I’m using a standard Visa test credit card number here, so don’t go booking a trip to Tahiti using it.

 $ iex validate_cc.exs
 iex> CheckDigit.valid? '4012888888881881'
 true
 iex> CheckDigit.valid? '0412888888881881'
 false

Tidying Up

Our solution works, but the style isn’t very functional. (That’s a polite way of saying it’s butt ugly). To tidy it up, I look for places where there’s clearly something wrong, and see if I can fix them.

The first problem I see is the first three lines. I’m transforming the given number into a reversed set of digits, each with an associated index.

The word transform is the clue. Functional programming is all about transforming data. It’s so important that Elixir has a special operator, |>. This lets us build pipelines of functions, where each transforms the results of the previous. It lets us compose functionality.

Using the pipeline operator, we can rewrite the first three lines as

 numbers
  |> reverse
  |> map(fn char -> char - ?0 end)
  |> map(fn digit, index -> {digit, index} end)

We take the original list, transform it by reversing it, then by converting character codes to integers, and then by adding the index.

The pipeline operator looks like magic, but it’s actually quite simple. It takes the value of the expression on its left, and inserts it as the first argument of the function call on its right, shifting all the other arguments down.

So the second ugliness is all this partitioning and summing. Our problem is that we’re thinking imperatively, not functionally. We are telling Elixir each step of what to do, when instead we should be thinking of the specification of what we want, and let it work out the details.

Think back to our Fibonacci example. There we implemented our specification as three function heads, which matched the two special cases and the one general case. Can we do the same here?

Rather than processing our list of digits one element at a time, what if we process it in twos? This means we’re working with a pair of digits—the first will be the one at an odd position, the second at the even position. We know how to calculate the Luhn number for these two digits, and then we can add the result for them into the result for the remainder of the list. That’s our recursive step.

When we finally empty the list, we will have calculated the required sum, so we can simply return it.

There’s one other case to consider. If the list has an odd number of digits, then when we get to the end, we’ll only have a single element. But we know that element is at an odd position, so we can simply add it to the accumulated sum and return the result.

So, here’s the new version of our code:

 defmodule CheckDigit do
 
  import Enum
 
  @doc """
  Determine if a sequence of digits is valid, assuming the last digit is
  a Luhn checksum. (http://en.wikipedia.org/wiki/Luhn_algorithm)
  """
 
  def valid?(numbers) when is_list(numbers) do
  numbers
  |> reverse
  |> map(&1 - ?0)
  |> sum
  |> rem(10) == 0
  end
 
  defp sum(digits), do: _sum(digits, 0)
 
  defp _sum([], sum), do: sum
  defp _sum([odd], sum), do: sum + odd
  defp _sum([odd, even | tail], sum) when even < 5 do
  _sum(tail, sum + odd + even*2)
  end
  defp _sum([odd, even | tail], sum) do
  _sum(tail, sum + odd + even*2 - 9)
  end
 
 end

The pipeline at the top is now a lot simpler—there’s no messing with indexes, and no temporary variables. It reads like a code version of the spec.

The sum function is an example of a common pattern. We need to set the initial value of the value we’re summing, but we don’t want the code that calls us to know about that detail, so we write a version of sum that just takes the numbers, and then calls the actual implementation, passing in the list and a zero for the initial value. We could give the helper functions the same name, but I prefer using _sum to differentiate them. (Many Elixir programmers would have called them do_sum, but that always strikes me as being too imperative.)

The _sum function has 4 heads:

  • If the list is empty, we return the sum that we’ve been accumulating in the second parameter.

  • If the list has one element, add its value to the sum so far and return it. This is the terminating condition for a list with an odd number of elements.

  • Otherwise, we extract the first two elements from the list. This uses the pattern [odd,even|tail]. The first element is bound to odd, the second to even, and the remainder of the list is bound to tail.

    Looking back at the Luhn algorithm, we have two cases to consider. If the result of multiplying the even number by two is less than ten, then that’s the number we add into the sum. We use a guard class to check for this.

  • Otherwise we have to subtract nine from the product. That’s what the 4th function body does.

Notice how we’re passing the updated sum around as the second parameter to the function—this is a universal pattern when you want to accumulate a value or values across a set of recursive function calls.

What’s Different About this Code

When you write in a language such as Java, C#, or Ruby, you’re working at a number of levels simultaneously. Part of your brain is thinking about the specification—what has to get done. The other part is thinking about the implementation—the nuts and bolts of how to do it. And that’s where things often get bogged down.

Look at the last example. We’re iterating over a set of digits. We’re selecting those with odd or even positions. We’re performing conditional calculations. We’re summing the result. And there isn’t a single control structure in the program. No ifs, no loops. The code pretty much reflects the specification of what we want to happen.

And that’s one of the reasons I’m a fan of functional programming in general, and Elixir in particular.

Next Month

Let’s start getting parallel. We’ll see how we can use Elixir to run hundreds of thousands of processes, and how to coordinate their work.

While you’re waiting, check out the screencasts available on my Elixir book’s home page.

And if you want the real scoop, you can always buy the book. :)

(Fibonacci image from http://www.maths-rometus.org/)

Dave Thomas is a programmer who likes to evangelize cool stuff. He cowrote The Pragmatic Programmer, and was one of the creators of the Agile Manifesto. His book Programming Ruby introduced the Ruby language to the world, and Agile Web Development with Rails helped kickstart the Rails revolution.

Send the author your feedback or discuss the article in the magazine forum.