small medium large xlarge

Thinking Functionally with Haskell

Is There Anything Left to Say about Monads?

by Paul Callaghan

Generic image illustrating the article
  In which Paul explores some powerful ideas about plumbing.  

Apparently, there’s a view that Haskell is 99% monads (maybe more), and that monads are some arcane mystical concept that only a few can master. Wrong and wrong.

My aim this month is to talk a bit about why patterns like monads are useful, particularly explaining what such patterns means for programming and how to use them to keep your code under control—but not to over-use them.

I Remember the Time before Monads

Haskell and similar languages did exist before these ideas were applied to programming, so there really was a time before monads. It was not a barren wasteland, where all we could do was write programs to manipulate trees (or burn electricity), and had no interaction with the outside world at all.

We really could do real-world stuff, including file operations, console IO, and IPC, though it was a bit clumsy in places. Around that time, I was doing my PhD in the context of a large Natural Language Processing system, around 60K lines of Haskell—so one of the largest functional programs of its time. The program could process and analyze a Wall Street Journal article in a few seconds, build a complex semantic representation of the story, output various summaries of it, and allow interactive Q&A on the contents. It didn’t use a single monad.

It was, however, a time of exploration, when researchers explored various ideas to find a good way of both having our cake and eating it. Monads are one of the solutions they found, and essentially gave us a small but flexible API for working with computations (like IO operations or state modifications, or various combinations thereof) as opposed to simple data values, and did so elegantly within the standard langage. It got even better when syntactic sugar was added. This point of operating within the language is important: avoiding ad hoc extensions does help to keep the language simple.

This simple idea provided an excellent structuring pattern to tame a lot of clumsy code, and even more useful, gave us a solid framework for exploring more powerful ideas.

So monads are highly useful for some aspects of programming work, but they are certainly not an essential or core part. You will probably find that most large, well-written Haskell programs contain about 50-80% code that does not involve monads at all—the bulk is just pure data manipulation. Of the remainder, the monad use is mostly straightforward and follows certain common idioms. Real scary stuff is pretty rare.

Monads Are One Kind of Plumbing

Remember what we’re trying to do as (functional) programmers: use the language in full to let us program in a clear and direct way.

What kinds of things get in the way? Put another way, have you written code and wished that you could abstract out certain “noise” elements to leave a clearer “signal” for the key operations of your code? Here are a few of the noisy elements:

  • verbose syntax

  • (frequent) type annotations

  • error handling

  • passing common arguments

  • threading mutable state

  • handling multiple results

Haskell & Co. already score well on the first two, as we explored in past months. In a functional context, dealing with the other items boils down to needing more flexible ways to join operations together, or to compose them. For example, we can write and understand (ruby-style) code like this:

 if (res_1 = do_step_1).nil?
 elsif (res_2 = do_step_2(res_1)).nil?
  use_vals(res_1, res_2)

But wouldn’t it be preferable to write something like the following, which makes it quite clear that the operation is a particular sequence of steps that uses the intermediate values in a certain way? And let all of the error checking be handled behind the scenes?

 res_1 = do_step_1
 res_2 = do_step_2(res_1)
 use_vals(res_1, res_2)

Of course, we can hide some of the error checking with conventional exceptions, but exceptions can be a bit of overkill sometimes, i.e. a sledgehammer. Plus, they only deal with error handling and none of the other phenomena we might want to hide away, like state handling. Is there a more general mechanism we can use?

Basically, yes, and monads are one of several ideas that can be used. Let’s start by revisiting the idea of a pipeline of transformations first.

The Simplest Form of Pipeline

A few of my examples have already used pipeline-style code, like

 take 3 $ map reverse $ words "one two three four"

using $ to chain several transformations together. Recall that you can read such pipelines in either direction, e.g. take 3 of the reversed words in the string... The $ operator is also a close relative of . (function composition), though it’s easier initially to explain things in terms of $. It is defined as follows, including the precedence and associativity declaration and the type signature for reference.

 infixr 0 $
 ($) :: (a -> b) -> a -> b
 f $ x = f x

As mentioned earlier, the definition is boring—applying its function argument to its other argument, and the main trick of $ comes from its declaration as a right-associative operator, which means we can write f $ g $ h x instead of f (g (h x)). The precedence level of 0 means it has the lowest priority level, so reverse $ "a" ++ "b" means reverse ("a" ++ "b") and not (reverse $ "a") ++ "b".

Sometimes it is handy to write the pipeline the other way round, so let’s define as an alternative to $—exactly the same as above but the parameters are in a different order.

 infixl 0 €
 (€) :: a -> (a -> b) -> b
 x € f = f x

The initial example can now be written in a more OO-chaining style as

 "one two three four" € words € map reverse € take 3

This operator has been declared as left-associative and lowest priority, hence the above is parsed as a left-nested

 ((("one two three four" € words) € map reverse) € take 3)

which is the grouping that makes sense.

Passing Values along the Pipeline

Can the conditionals example be written in this style too? Basically yes, though the is too simple and we’ll need something else—let’s call it £ for now. This £ needs some way to grab the intermediate result values and pass them along for use later. Conveniently, lambdas (anonymous functions) are an excellent fit for this! So how about this:

 do_step_1 £ \res_1
  -> do_step_2 res_1 £ \res_2
  -> use_vals (res_1,res_2)

To understand how the lambdas help here, we need to understand a key detail of lambda syntax: the important rule is that in \var -> ..., the expression after the arrow extends as far to the right as possible, and the parameter name is available throughout the body of the function, even inside nested functions (unless the name gets shadowed). Technically, it’s not an operator like + or $, but you can think of it as acting like a right-associative operator with a low precedence of (-1). So the above parses as a right-nested tree:

 do_step_1 £ (\res_1
  -> (do_step_2 res_1 £ (\res_2
  -> (use_vals (res_1,res_2))))

Or in tree form, the parse result looks like this:

 £ ---- do_step_1
  +-- \res_1 -> £ ---- do_step_2 res_1
  +-- \res_2 -> use_vals (res_1,res_2)

The top £ node has a big function as its second argument, and the function grabs the input (from the first argument) and uses it in various places in the body of the function: as input to do_step_2 and also again when computing the result value. The inner function gets the result from do_step_2 res_1 under the name of res_2 and then returns the combined result.

So, we’ve seen that we can express code in an imperative-ish style using a suitable operator, and that the anonymous function support in Haskell helps to tie inputs and outputs together in the pipeline. Next, we need to consider what we’re hiding and how we’re hiding it, which means, what is the definition of £ going to be?

A Pipeline for “Conditional” Code?

The glue we need depends on what the component operations actually do, which depends also on their types.

The original Ruby-style code suggested operations that return a value or nil. Thankfully, Haskell has no notion of nil—we only deal with values and it’s impossible to not have a value. (Franklin Chen has a great talk on “various aspects of nil - definitely worth a look. Do you know what the “billion-dollar mistake” is?) The Haskell approach is to use an “option type” to distinguish between a value being useful/present, or not. The standard version of this is Maybe. Some people (including me) see it as a kind of box: it is either empty (Nothing) or full (Just something), and the polymorphism allows it to contain values of any type. The full type also reflects what kind of thing is in the box, e.g. Just "foo" has type Maybe String—i.e., a String in the box.

 -- already defined in the Prelude
 data Maybe a = Nothing | Just a

This Maybe type is ideal for representing the return results from operations which may or may not return a useful value. For example, looking up a key in a table may or may not find a value. Haskell’s prelude contains a simple function called lookup for small tables, e.g. lookup 'a' [('a', 10), ('b', 20)] returns Just 10 but lookup 'c' [('a', 10), ('b', 20)] returns Nothing.

 lookup :: Eq a => a -> [(a, b)] -> Maybe b
 lookup k [] = Nothing
 lookup k ((x,y):xys) | k == x = Just y
  | otherwise = lookup k xys

Maybe has other uses too, like representing when some component of a data structure is optional, e.g. someone’s age can be Nothing or Just 42. When using such values, the extra detail ensures that you handle the 'nil' cases properly: you can’t treat a Maybe Int value as an Int—it has to be unpacked explicitly. There are no null pointer exceptions. Ever.

First Attempt

On the other hand, the extra wrapping does sometimes get in the way, but that’s what we’re looking to do here: find some definition of £ which hides the annoying details away. Let’s think types first. We have (€) :: a -> (a -> b) -> b but now want a version that weaves Maybe in there as well.

Intuitively, we want Maybe around both arguments and around the result. What happens if we try (£) :: Maybe a -> Maybe (a -> b) -> Maybe b? And when both arguments are Just, then go ahead and apply the function (line 1). Otherwise, for any other combination of inputs, return Nothing (line 2).

 (£) :: Maybe a -> Maybe (a -> b) -> Maybe b
 Just x £ Just f = Just (f x) -- line 1
 _ £ _ = Nothing -- line 2

We can test with a few expressions, e.g. Just "foo" £ Just reverse gives Just "oof", whereas Nothing £ Just reverse and Just "foo" £ Nothing give Nothing.

It’s encoding the idea that the calculation should only proceed when both sides are OK, but is it what we want? Unfortunately, almost but not quite, because it doesn’t fit the pattern we were aiming for from above (do_step_1 £ \res_1 -> do_step_2 res_1 ...) because the lambda is floating outside the second step and not wrapped inside it—which means it won’t be so easy to pass values down this pipeline.

So what have we invented? Briefly, it’s something between a functor (which represents mapping) and a monad, and is useful enough to have a name and a standard library: it’s an Applicative Functor.”

Recall that functors signify the general idea of mapping over a container data type (e.g. mapping on lists, mapping on trees). For Maybe it means the following:

 fmap_maybe :: (a -> b) -> Maybe a -> Maybe b
 fmap_maybe f Nothing = Nothing
 fmap_maybe f (Just x) = Just (f x)

So, fmap_maybe (\x -> x + 1) Nothing gives Nothing, and fmap_maybe (\x -> x + 1) (Just 10) gives Just 11. It’s a useful operation to do something to the value if one is there, else to leave the box empty.

Applicative functors support slightly more functionality by bridging or chaining between two values. Notice how the definition of £ differs from fmap_maybe—and in particular, notice that fmap_maybe f v is equal to v £ Just f, i.e. you can define mapping via the slightly stronger concept, but can’t go the other way. (This isn’t just theory—it’s an illustration of how one concept is stronger than the other and so can be used for more things.) A similar relationship exists between applicative functors and monads.

Functors, Overloading, and Constructor Classes

Some technical details of Haskell now, which make concepts like functors and monads more convenient in use. Let’s talk overloading. Haskell’s type class system allows overloading of names based on the types in play, and this is used to avoid an explosion of names like fmap_maybe, fmap_either, ... for things that are inherently the same concept and have the same pattern in their types.

Mapping is a key notion for all container-like types, where a function of type a -> b can be applied to some structure of type f a to yield f b, retaining the shape of the input data but applying the function to all values it contains. In concrete terms, fmap show (Just True) gives Just "True", or fmap show [1,2,3] gives ["1", "2", "3"]. The following code, already included in the Prelude, first introduces the functor type class as an interface with one member, then declares how mapping works on lists and Maybe values.

 class Functor f where
  fmap :: (a -> b) -> f a -> f b
 instance Functor [] where
  fmap = map
 instance Functor Maybe where
  fmap f Nothing = Nothing
  fmap f (Just x) = Just (f x)

Notice the switch from overloading on actual types (like Int) to overloading on type constructors like Maybe, which are not types themselves but can be used with actual types to construct new types, e.g. Maybe Int or even Maybe (Maybe Int). The general principles for these “constructor classes” are very similar to those for type classes, and allow fmap to be used as freely as other overloaded functions like show, and so on.

Behind the scenes, the overloading mechanism uses the types to select which of the various definitions of fmap to use. For example, fmap (fmap show) [Just 1, Nothing, Just 2] works on a list of Maybe Int values, and selects the list version for the outer map and the Maybe version for the inner map, thus gives [Just "1", Nothing, Just "2"].

Applicative functors are also captured as a type class. Our £ operator becomes the overloaded operator <*> (note the change of argument order), and each suitable container type can have its own custom implementation of it. The class interface also contains pure :: a -> f a, which represents the conversion of a simple value into the “wrapped” type, e.g. pure for Maybe means wrapping the value with Just, and pure is the way we get simple values into the pipeline.

 -- from the Control.Applicative standard library
 -- use "import Control.Applicative" to include it
 class Functor f => Applicative f where
  pure :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b
 instance Applicative Maybe where
  pure x = Just x
  Just f <*> mx = fmap f mx
  Nothing <*> _ = Nothing

A simple example now: a computation that requires two table lookups before it can combine the result. We want to add the values if both are OK, else return Nothing (because one part of the operation failed). Using pure to put the add function in the chain, and assuming letters = [('a', 10), ('b', 20)], we can write:

 pure (\x y -> x+y) <*> lookup 'a' letters <*> lookup 'b' letters

and get Just 30 as a result. Try changing one of the keys to something else and see what happens. To summarize, we wanted to hide the details of error handling, and the above is fairly successful at it—though the passing of values down the pipeline is still missing.

One technical point: the type checker will accept a definition for an instance as long as the types match, but it’s recommended to ensure that your definition has some other properties or “laws.” The “library docs” explain the laws expected. These ensure that <*> mirrors the associativity property of function composition and that pure works with <*> as a kind of identity element. Recall that associativity means f . (g . h) is the same as (f . g) . h, i.e. that it doesn’t matter which way you do the computation. Do these laws matter? Well, the monad police will not come knocking, not yet anyway, but checking the laws hold will give you confidence that the code will behave as you expect.

(Compare: would you freak out if (1 + 2) + 3 turned out to be different from 1 + (2 + 3)? Or if 1 == 2 differed from 2 == 1?)

A Monad for Maybe

We tried (£) :: Maybe a -> Maybe (a -> b) -> Maybe b, so let’s see what happens when the second Maybe gets pushed inside the function, to give this type:

 (££) :: Maybe a -> (a -> Maybe b) -> Maybe b

Guided by the type, we can start filling in the code. Line 1 is the case where the first argument is Nothing, and if we don’t have a value to use then we have Nothing to return. Line 2 is the other case, where we unpack an x and then feed it to the second argument—and this second argument produces the Maybe b we want.

 Nothing ££ _ = Nothing -- line 1
 Just x ££ k = k x -- line 2

Again, compare this definition against that for the applicative functor and functor. The change is small but what it adds is significant: the second argument can now use the result of the first Maybe to decide what to do. This is a new dependency of the second argument on the first that we’ve not seen so far, and it enables the passing of values down the pipeline.

Consider how it works in the context of our first example. I’ve changed the layout to emphasize the imperative aspects, but it’s still the same code. The first line can be read as “perform step 1 and get the result under name res_1.” If step 1 returns Nothing, then the computation stops and returns Nothing overall. Otherwise, the result is fed into the second argument function which makes the result available as res_1 and the same idea applies again: all stop if step 2 fails, else continue with the result as res_2. Finally, the last step can use res_1 and res_2 to compute the final result.

 do_step_1 ££ \res_1 ->
 do_step_2 res_1 ££ \res_2 ->
 use_vals (res_1,res_2)

Note that use_vals will return a Maybe value, but we don’t care here whether it returns Nothing or not—it’s not relevant to the code above. The caller of this code can decode it if it needs to.

So we’ve reached the goal and found an operator for glueing together two Maybe values so that the error handling is hidden, and we can pass values from one stage to successive stages. This plumbing, effectively, is a monad. Wasn’t so bad, was it?

You probably realize that this kind of plumbing can apply to other types too. Haskell overloads the plumbing via another constructor class. Here are the key details:

 -- from the Prelude (via library Control.Monad)
 class Monad m where
  return :: a -> m a
  (>>=) :: m a -> (a -> m b) -> m b
 instance Monad Maybe where
  return = Just
  Nothing >>= _ = Nothing
  Just x >>= k = k x

Like for pure, the overloaded return function just allows a value to be turned into a stage in the pipeline, particularly in a way that allows the following stage to retrieve it. There are also a few laws to check, basically corresponding to a notion of associativity and two checks that return behaves like an identity element. So, a monad is anything for which you can invent a suitable definition of >>= and return! However, some monads are more useful than others...

A More Convenient Notation

Haskell has some syntactic sugar for writing down pipelines based on monads, called “do notation.” Instead of combinations of explicit lambdas and >>= operators, we can write this:

 do res_1 <- do_step_1
  res_2 <- do_step_2 res_1
  use_vals (res_1,res_2)

do is a keyword, and must be followed by a block of statements. Each statement is either a call to some (monadic) operation (aka “action”) or has form pattern <- action, which represents running the action and capturing its output by matching against the pattern. The simplest pattern is a variable name, as used above to capture the first and second results.

This “do” notation is nothing magical—it is just a shorthand that is translated into some combination of >>= and lambdas early in compilation. But, I hope you agree, it does make the code easier to read and follow. It’s no accident that it looks like imperative code either—we are (for most monads) describing some sequential process anyway, e.g. do step one first, if all OK then do step two, etc.

Being able to use this notation also helps understand how to use monads. Firstly, monads are the things that make this notation work, and secondly, we are using monads in order to program in this style—so once you know what kind of operations you want to do (and so what kinds of type will be involved), then you can start writing more abstract code in do notation and rely on monads to handle the plumbing.

Programming with the Interfaces

One of the many advantages of programming to interfaces is that you can write code that works with any instance of the interface. One important operation for monads is sequence :: Monad m => [m a] -> m [a], which is used to convert a list of monadic values to a list of values, by chaining them together (with >>=) and collecting a list of all of the results generated.

 sequence [] = return []
 sequence (m:ms) = do x <- m
  xs <- sequence ms
  return (x:xs)

You could write it as a fold (try it?) but the above version is also a good example of the programming style with monads. The first line says, if no actions to perform then return [] (i.e. so that the caller can retrieve the [] value). Otherwise, perform the first action m to get a value x, then call sequence recursively to process the remaining actions to give the values xs, and finally we return the combined list of values.

Here’s an example:

 sequence $ map (\k -&gt; lookup k letters) ['a','a','b','a']

will try to look up all of the characters in the list, and return a list of results Just [10,10,20,10] because all of the lookups succeeded. However,

 sequence $ map (\k -&gt; lookup k letters) ['a','c','b','a']

will return Nothing because the monad for Maybe bombs out at the first fail. Compare the result to map (\k -> lookup k letters) ['a','c','b','a']: the key difference is that sequence is computing the overall result from the lookups in sequence, and the monad definition dictates a fail when any of the stages fails.

The libraries contain other imperative-style control structures too, like while loops, and it’s easy enough to add our own if nothing fits. Note also how the monadic code fits alongside other bits of Haskell, i.e. we could use map with a monadic action over a list of values, to get a list of monadic values, then use sequence to compute the overall result. In other words, the monadic values are just like any other piece of data in Haskell, and no extension to the language is needed.

A Quick Look at State

Haskell has no notion of mutable state (i.e. variables that you can assign new values to), but some algorithms make essential use of updateable state. You probably realize that many algorithms that look like they need state can actually be rephrased as simpler data manipulations, such as how breadth-first search was done last month, but there are some algorithms and cases where this doesn’t work well, such as more complex search on graphs where you need an explicit representation of the graph explored so far in order not to duplicate work.

What to do?

The simplest approach mirrors how thread-safe libraries work: we carry a thread’s state value around explicitly and safe operations will work on that state value rather than some global state—hence the state must be one of the parameters. In FP terms, we can use the following function type for state-manipulating entities:

 type StateChg s a = s -> (s,a)

That is, a state-changing operation can be represented as a function that takes a state and returns a (possibly new) state plus some other “visible” non-state value. Here are some simple examples of state operations in this style.

 add_num :: Int -> StateChg Int ()
 add_num val = \total -> (total + val, ())
 fetch_state :: StateChg s s
 fetch_state = \s -> (s,s)
 zero_state :: StateChg Int ()
 zero_state = \total -> (0, ())

add_num adds a value to whatever state value is passed in, so it could be used to add a number to a running total. Alongside the new state value, the empty tuple or dummy value () is returned as the “visible” value—it’s a common Haskell idiom to indicate lack of an interesting value, kind of like void in C, and reinforces the idea that the operation is being done for its effect on state (i.e. side-effect) rather than to compute a value.

fetch_state provides a way to consult the current state, hence it returns the incoming state value as the “visible” value, as well as passing on the unchanged state. Notice that it’s totally polymorphic—it doesn’t depend on the type of the state.

zero_state provides a way to reset the (Int) state back to zero, and does so by ignoring the incoming state and passing on 0 instead. The visible value is () in keeping with the “just for side-effects” nature of this operation.

Now, it’s going to be boring unless we can chain a few state manipulations together in a pipeline. We’d like to write code like the following, on the left-hand side. The more familiar conventional mutable-variable version appears alongside as comments.

 do zero_state -- x = 0
  add_num 3 -- x += 3
  t <- fetch_state -- t = x
  add_num (t * 2) -- x = x + t * 2

Notice how the monadic do-notation is giving us a DSL for state-manipulating computations, and it’s not too awkward to use.

We can also define more flexible operators, e.g. change_state :: (s -> s) -> StateChg s () to apply an arbitrary change to the state value, hence replace add_num 3 with change_state (+3). Multiple “variables” are possible too, e.g. replacing the single state value with an “environment” (mapping names to values), and adjusting the state operations to fetch and change entries in the environment according to names.

In fact, we can introduce whatever abstractions we want, anything that will help to make the code clearer.

Let’s consider how the monadic >>= operator works here, i.e. the glue or plumbing that is used to combine two smaller stages together in the pipeline to create a larger stage. Each stage takes a state and produces a new state plus a value, so part of the work is to thread the state from the first stage to the second. We also have the second stage depending on the output of the first (which allows passing values down the pipeline). The following is a sketch of the code (a few technical details, like newtype and class instances have been left out).

 (>>=) :: StateChg s a -> (a -> StateChg s b) -> StateChg s b
 step1 >>= step2_fn = \s_in -> let (s_mid, res1) = step1 s_in
  step2 = step2_fn res1
  in step2 s_mid

The “monad” part in the type signature is StateChg s, i.e. a StateChg with the same (polymorphic) state type throughout. We don’t want the state type to change during the pipeline! Otherwise, the signature has the usual m a -> (a -> m b) -> m b pattern. The result of >>= must be another state transformation, so we use \s_in -> ... to capture the expected incoming state and then compute the result from this state. The let x = y in z notation allows us to give local names to intermediate results, particularly to show the plumbing in full detail. We get the intermediate state s_mid and first value res1 by passing the incoming state to the first step. We get step2 from the second argument step2_fn by passing it res1 (remember: the second argument isn’t a state change itself, but a function that determines the actual state change when applied to res1). Finally, the overall result is whatever comes out of step2 when it is run on the intermediate state s_mid.

I was careful to avoid presenting this in imperative terms, i.e. “we get X from Y and ...” rather than “do Y to get X then ...,” because Haskell doesn’t necessarily evaluate them in that sequence. I should have made this clearer earlier: Haskell does not prescribe a particular order, like Java does. (Incidentally, do you know what the evaluation order is for function arguments in C? If you don’t, you might be surprised.)

Instead, Haskell’s evaluation mechanism is guided by dependency: at each stage it only does enough work to decide which clause of a function to use, e.g. to select the empty list case vs the non-empty list case, and no more—so it need not touch arguments that are irrelevant to the immediate decision. Apart from that, the compiler is free to choose which order to do work. For example, in (a + b) + (c + d) which operand is computed first? The language doesn’t fix this, and the lack of side effects also means that it doesn’t actually matter (and the compiler doesn’t have to attempt to infer this). Indeed, the compiler can even do both in parallel!

With monads, however, we are allowing sequential behavior and side effects in a very controlled way, and the pattern of dependencies inside >>= gives the expected sequential progression. This gives us precise control over ordering of state changes so that they do happen in the right order, e.g. updating a variable before reading its value.

You might be concerned that this extra structure makes the code slow. This is partly true, since there is more work to do, but do you remember the silver rule from the first article? “Trust your compiler!” Haskell compilers (GHC especially) are able to inline and transform out some of the overheads of monadic code. So, don’t worry about it until it’s shown to be a bottleneck.

IO Is a Kind of State Manipulation

Space is getting short, so I need to be brief now. Plus, enough has been written elsewhere on the topic! I particularly recommend “Tackling the Awkward Squad” by Simon Peyton Jones as a starting point.

The key points are:

  • IO is a key case where we need precise control of the order of events.

  • The monadic concepts and framework provide exactly the kind of DSL we want for IO.

  • The IO type is usually defined as a compiler primitive, but internally it is basically a state manipulation, like type IO a = RealWorld -> (RealWorld, a).

  • IO actions are effectively changes in the outside world, and the monad structure ensures the changes to the RealWorld occur in the right order. Plus, the compiler can optimize this state transformation away to leave a sequence of calls to underlying OS libraries.

  • Haskell libraries provide various IO operations like putStr :: String -> IO () and getLine :: IO String that can be chained together using do notation etc. putStr prints a string, and getLine reads a line of input, e.g. do { i <- getLine; putStr ("Line is:" ++ show i) } to read a line and then say something about it.

  • The libraries also contain full support for file and network IO, exceptions, pointer-like variables....

  • It’s also a good basis for concurrency and threads.

One subtle point that catches some people out regarding how monadic values work: What is this line of code going to do?

 [ putStr "hello", putStr "world",
  do {system "echo bang"; return ()}

Answer: nothing! It is just a list of data, where each element is some IO action, i.e. with type [IO ()].

Such actions can be passed around like normal values, because that’s exactly what they are—just pieces of data. The actual effects won’t actually occur until such data is threaded into the sequence of actions being run by the top level of a program. Having such actions as data also leads to nice ways of using callbacks, listeners, etc. For example, GHCI’s REPL takes some input, parses it, type-checks it, then attempts to convert the value to a string and shows it on screen with putStr. However, when it detects a value of type IO a, it runs the action instead of printing something. Can we convert the list of actions to a single action? Remember sequence? The following causes the list of actions to be run, and to be run in the right order too.

 sequence [ putStr "hello", putStr "world",
  do {system "echo bang"; return ()}

More Monads, and Guidelines

We’ve seen some simple monads, but potentially any type constructor can be a monad if you can find definitions of return and (>>=) that meet the requirements. One interesting group comes from combinations of simpler monads, which results in monads that combine the behaviors of their components. Their plumbing is a mixture of the simpler bits of plumbing. The blogs are full of interesting (and some scary) combinations: see what you can find. Many of these have serious practical uses too; for example, several parser libraries use a combination of state, multiple values (“non-determinism”) to present a simpler algorithm.

However, don’t go mad.


A good guideline: only use monads where they are essential, such as the key parts of state-based algorithms or the outer IO-bound levels of a program.

You could write all of your programs in a monadic style, but this would get tedious quite quickly, obscure key details about data transformations, and force execution of your code in a particular order (and limit what the compiler can do with it). Haskellers often refer to monadic code as a “sin bin”—tricky code is put there so we can keep an eye on it, not to reward it.

With a bit of thought about the data involved, you can often re-cast some imperative-looking operation to a simpler functional transform, maybe with a bit of monad on the outside. For example, if you’re wanting to process some numbers in a file but wanting to catch invalid entries, you could write monadic code that reaches down through handling of lines and cells on lines to each individual number, but this could be overkill. It could be simpler to read the file as a string, split it into tokens, then map some operation that parses the numbers to return a Maybe Int result, etc., finally having some outer operation that uses sequence on the Maybe values to determine if the whole sequence is valid. (For detailed error handling, you could switch to the similar Either type.)

Also consider whether the intermediate option of applicative functors gives you enough functionality.

Dr Paul Callaghan has temporarily run out of motorbike metaphors. He hopes that there’s enough of the wider ideas in here to help you use monads appropriately in your code. Bits of his bio can be seen on earlier articles. Paul also flies big traction kites and can often be seen being dragged around inelegantly on the beaches of North-east England, much to the amusement of his kids. He blogs at and tweets as @paulcc_two.

Send the author your feedback or discuss the article in the magazine forum.