Paul continues his deep dive into the Haskell language and functional programming with a look at Web frameworks for Haskell.
Last issue, Paul began an exploration of Web programming in the Haskell language with Fay, a CoffeeScript analog for Haskell. This time out, he begins the journey into Haskell frameworks.
Rails Is Not Your Application
Before we jump into concrete details of existing Haskell web frameworks, I want to pause to consider the wider picture of how web apps can be structured and developed.
Many frameworks, like Rails, are some variation on MVC and share certain standard components. Are you happy using these? Or do you have some sneaking suspicion that it can still be easier or simpler? I’m sure we can do better.
There are several motivations for wanting to explore such issues. Firstly, I think it’s fair to say there is some dissatisfaction in various communities about the maintainability of code, as reflected by the number of talks in recent conferences about different architectural approaches like hexagons and DCI. Secondly, there is my own slight annoyance at needing to do things like check that my routes file matches my controllers, and at having too many framework details mixed up with business logic. Thirdly, I want to see where taking a top-down functional approach gets me. It may deliver me exactly where newer frameworks are heading, but it should still be an interesting exercise and promote deeper understanding of the issues.
An important slogan for me is “Rails is not your application.” This phrase comes from Nick Henry in the middle of 2011. It began as a tweet, as many great pieces of wisdom do, and subsequently has been expanded into a blog post—and taken up by other people.
There’s no strong consensus on answers yet—we’re still exploring—though it seems clear that more separation is needed between the facilities provided by Rails and the real business logic of the app. For example, domain models shouldn’t always be subclasses of Active Record and could instead be users of services provided by Rails components. Some interesting further discussion can be found on Nick Henry’s blog page.
From a strategic aspect, I’ll be asking whether we can streamline the building of web apps, leveraging the language facilities to reduce or remove certain of the weak points in current practices.
For example, I’ve never been keen on the separation of routes from controllers: it feels like having related information distributed over different files, with the ever-present risk of getting the two out of sync, or else having to ensure that your tests are strong enough to catch any mismatches—and I don’t accept that it has to be this way. Phenomena like this represent possible failure points, and it’s a good technique in programming to try to eliminate or control such potential failures. If you want a simpler example, consider looping through a list to compute some result—do you write an explicit for-loop or do you instead use a higher order function? The latter has fewer moving parts....
In general, we’ll often ask the question: what do we actually need, and can we express it in a straightforward and safe way? There are also some elements of BDD-ish thinking here: for example, we’ll be asking “What’s the simplest thing that would work?” without committing ourselves to an existing framework.
Finally, don’t expect too many answers in this article. My main goal is to ask the questions and see what your response is!
Towards a Functional Approach
We’re going to put data first, over process or tradition, and see where it leads us.
Somehow, the app on the server will decode the request into something more structured, run some code, and then return a piece of text. Typewise, we can start thinking of String -> String for the type of the transaction, or be a bit more precise and informative with Request -> Response, where Request represents a path and parameters with the relevant HTTP verb (e.g. GET, DELETE, etc.), and Response can contain various bits of meta-information like MIME type in addition to the raw response.
HTTP is (or is generally intended to be) stateless, and this is reflected well in the simple type above.
However, session information can sometimes muddy the water and make the requests context dependent. For example, it could depend on whether someone is logged in or not. There will also be real-world side effects: doing a DELETE of some resource will affect the outcome of any GET of the same later. A simple way to encode this context dependency is to refine our type to (Sessionable server, Monad server) => Request -> server Response, that is, explicitly requiring the transaction (well, the generation of some response given a request) to be a monadic action and following the usual monad rules and conventions, plus supporting various session operations.
(Note that I’m not committing to a particular monad instance yet—just in vague terms signalling that I intend to use the wider general monad api or framework.)
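To make the refined type a little more tangible, here is a minimal sketch. Everything in it is an assumption made for illustration: the fields of Request and Response, the three operations in Sessionable, the handler greet, and the toy Sess monad (a hand-rolled state monad that tracks who is logged in). The only thing taken from the discussion above is the overall shape (Sessionable server, Monad server) => Request -> server Response.

```haskell
-- All names here are illustrative; only the overall shape comes from the text.
data Verb = GET | POST | PUT | DELETE
  deriving (Show, Eq)

data Request = Request
  { verb   :: Verb
  , path   :: String
  , params :: [(String, String)]
  } deriving (Show, Eq)

data Response = Response
  { mimeType :: String
  , body     :: String
  } deriving (Show, Eq)

-- An assumed, minimal set of session operations.
class Sessionable m where
  currentUser :: m (Maybe String)
  logIn       :: String -> m ()
  logOut      :: m ()

-- A handler written against the general interface only.
greet :: (Sessionable server, Monad server) => Request -> server Response
greet _ = do
  who <- currentUser
  return (Response "text/plain" (maybe "Hello, stranger" ("Hello, " ++) who))

-- One possible instance: a hand-rolled state monad over the logged-in user.
newtype Sess a = Sess { runSess :: Maybe String -> (a, Maybe String) }

instance Functor Sess where
  fmap f (Sess g) = Sess $ \s -> let (a, s') = g s in (f a, s')

instance Applicative Sess where
  pure a = Sess $ \s -> (a, s)
  Sess f <*> Sess g = Sess $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad Sess where
  Sess g >>= k = Sess $ \s -> let (a, s') = g s in runSess (k a) s'

instance Sessionable Sess where
  currentUser = Sess $ \s -> (s, s)
  logIn user  = Sess $ \_ -> ((), Just user)
  logOut      = Sess $ \_ -> ((), Nothing)
```

Loaded into GHCi, an expression such as fst (runSess (logIn "paul" >> greet (Request GET "/" [])) Nothing) runs the handler with a user logged in. Because greet mentions only the class constraints, the same handler could later be run in a quite different monad without change.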
We should also consider the component that sits between the screen and the chair, the not always perfect user. What’s their type? Ideally they would be Response -> Request, turning server responses into new requests, by processing the information returned and then deciding what to do about it. We should allow some more leeway, though, and let them be monadic too, since side-effects may influence how they act. So let’s give them a potential type of Monad b => Response -> b Request. There may be some unpredictable elements of user behavior too, but we can hide those away in a monad as well(!).
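As a small thought experiment, the two types can be plugged together into a conversation. The sketch below assumes, purely for simplicity, that the server and the user live in the same monad; the names Server, User, converse, echoServer and clicker are invented for this illustration, and Request and Response are cut down to bare strings.

```haskell
import Data.Functor.Identity (Identity, runIdentity)

-- Cut-down stand-ins for the richer types discussed in the text.
newtype Request  = Request String  deriving (Show, Eq)
newtype Response = Response String deriving (Show, Eq)

type Server m = Request -> m Response
type User m   = Response -> m Request

-- Run n rounds of request and response, collecting the responses seen.
converse :: Monad m => Int -> Server m -> User m -> Request -> m [Response]
converse n server user req
  | n <= 0    = return []
  | otherwise = do
      resp <- server req
      next <- user resp
      rest <- converse (n - 1) server user next
      return (resp : rest)

-- A server that wraps the path in angle brackets.
echoServer :: Server Identity
echoServer (Request p) = return (Response ("<" ++ p ++ ">"))

-- A very predictable user: asks for whatever came back, plus a "+".
clicker :: User Identity
clicker (Response b) = return (Request (b ++ "+"))
```

For example, runIdentity (converse 2 echoServer clicker (Request "a")) evaluates to [Response "<a>", Response "<<a>+>"]. Swapping Identity for a monad with state or randomness would model a less predictable user without touching converse.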
The First Big Question
From the outside, we can conceive of the complete web app as having a type of (Sessionable a, Monad a) => Request -> a Response. What’s inside this monadic function? That is, how does the web-oriented machinery link up with the underlying business app?
Following the “Rails is not your application” idea, we can ask a very important question: can the business application be entirely separated from the web details? I’m going to assume for now that the answer is yes. Does anyone think this is controversial?
It may be that we find aspects of the business app that are hard to divorce from its web presentation, but this for me is a prompt to look for new ways to structure those aspects to reduce the coupling.
Anyway, I’m going to assume that the two are separable, and furthermore that we can structure the web side of the overall application as a wrapper around the core business app.
This raises the question: what does the underlying non-web app look like? I would hope to see data types corresponding to domain concepts, and code that allows business-oriented processing of elements of those types. We could use the code in a REPL to perform certain tasks, as for the running example, to create a list for a user and manipulate the entries in the list.
What’s missing? Clearly it’s not convenient to use, so we would like to add some structure to the interaction side, and eventually to make it look nice and be pleasant to use. The first aspect is fairly independent of whether we’re aiming eventually at a web app, in the sense of it being useful for a stand-alone desktop app as well. The second part is clearly more in the web domain.
Let’s get persistence out of the way early. It is a massive topic, where a lot of work has already been done, though one which we should try to disentangle from the business logic. Web apps typically need some kind of persistent storage, firstly to store data between processing of discrete requests, and secondly because the app will typically run as a distributed system on multiple physical servers. But what does the business app actually need? What’s the simplest thing that would work?
Mainly, what we want is a collection of ways to create new data, retrieve it, and modify or delete it. Most of the time, it doesn’t matter what is handling these operations, so that’s a dependency we should try to abstract out. We’ll also need a way to ensure that destructive updates are done in the right order, so monads might be a useful thing to use here too.
We might try something like this: a type class Storage that provides basic storage facilities for data that is serializable (or mappable to some database representation), using globally unique identifiers (GUIDs) to reference particular data. The “store” operation might also want to check the validity of the data before saving it, though an alternative is to design the business types so that invalid data cannot be represented at all (simple example: 123 would be rejected as a String). A subclass could provide more powerful search facilities, taking some collection of predicates on some type and returning a list of GUIDs of the objects found. (Note that the GUID type has one parameter: this helps to distinguish GUIDs for one type from those of another, and so helps to prevent a GUID for one record type being used to retrieve another. In contrast, most frameworks just use unprotected integers for their GUIDs.)
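Here is one way such a class might look, as a sketch under stated assumptions rather than a finished design. The class names Storage and Serializable and the one-parameter GUID come from the description above; the method names (store, fetch, update, delete), the string-based serialization, and the in-memory Mem monad are all invented here so that the example is self-contained.

```haskell
-- A typed identifier: the phantom parameter records what the GUID refers to.
newtype GUID a = GUID Int
  deriving (Show, Eq)

-- Assumed interface for anything that can be written out and read back.
class Serializable a where
  serialize   :: a -> String
  deserialize :: String -> Maybe a

-- Assumed method names; requiring a monad keeps destructive updates ordered.
class Monad m => Storage m where
  store  :: Serializable a => a -> m (GUID a)
  fetch  :: Serializable a => GUID a -> m (Maybe a)
  update :: Serializable a => GUID a -> a -> m ()
  delete :: GUID a -> m ()

-- A toy in-memory backend: next free key plus an association list of rows.
type Rows = (Int, [(Int, String)])

newtype Mem a = Mem { runMem :: Rows -> (a, Rows) }

instance Functor Mem where
  fmap f (Mem g) = Mem $ \s -> let (a, s') = g s in (f a, s')

instance Applicative Mem where
  pure a = Mem $ \s -> (a, s)
  Mem f <*> Mem g = Mem $ \s ->
    let (h, s1) = f s
        (a, s2) = g s1
    in (h a, s2)

instance Monad Mem where
  Mem g >>= k = Mem $ \s -> let (a, s') = g s in runMem (k a) s'

instance Storage Mem where
  store x = Mem $ \(next, rows) ->
    (GUID next, (next + 1, (next, serialize x) : rows))
  fetch (GUID k) = Mem $ \s@(_, rows) -> (lookup k rows >>= deserialize, s)
  update (GUID k) x = Mem $ \(next, rows) ->
    ((), (next, (k, serialize x) : filter ((/= k) . fst) rows))
  delete (GUID k) = Mem $ \(next, rows) ->
    ((), (next, filter ((/= k) . fst) rows))

-- A domain type from the running example, kept free of storage concerns.
newtype TodoList = TodoList String
  deriving (Show, Eq)

instance Serializable TodoList where
  serialize (TodoList name) = name
  deserialize = Just . TodoList

-- Store a list, rename it, and read it back.
demo :: Mem (Maybe TodoList)
demo = do
  g <- store (TodoList "shopping")
  update g (TodoList "groceries")
  fetch g
```

Evaluating fst (runMem demo (0, [])) in GHCi runs the three operations in order against an empty store. The phantom parameter does its job at compile time: trying to use a GUID TodoList to fetch a value of some other record type is a type error, not a run-time surprise.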
I envisage these storage facilities being used outside of the core business app, thus being used to store data that is computed by core functions, or to retrieve data that is used in core functions. Until proved otherwise, I’d like to keep this separation strict, and hence keep the business logic as simple as possible.
Identifying the Interactions
Let’s return to the question of how to convert the types and functions on the domain objects into a recognizable application, by considering what extra we have to add to make an application. One approach is to recognize that we’re wanting to construct various user stories and common tasks out of the basic domain operations. For example, allowing a user to create a new list with a name, and to have it saved if valid.
Consider, though, that there are certain stories we want to support, and some combinations of actions that we do not. You might be thinking that we can use tests as a way to start distinguishing what is allowed and what isn’t, but we can be more functional here. What about representing this in types?
One possibility is to encode allowed actions as a set of values in some type, for example the simple set of actions:
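As an illustration only, such a type for the to-do list example might look like the sketch below. The type name UserAction is the one used later in this article; the particular constructors, and the describe function that shows the type being consumed, are assumptions.

```haskell
-- Hypothetical actions for the to-do list running example.
data UserAction
  = ViewAllLists
  | CreateList String
  | ViewList Int
  | DeleteList Int
  deriving (Show, Read, Eq)

-- Any function over the type has to say what happens for every action,
-- and the compiler can warn when a case is forgotten.
describe :: UserAction -> String
describe ViewAllLists      = "show every list"
describe (CreateList name) = "create a list called " ++ show name
describe (ViewList n)      = "show list " ++ show n
describe (DeleteList n)    = "delete list " ++ show n
```

Anything not expressible as a value of UserAction is simply not something the user can ask for, which is the point of putting the allowed actions in a type.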
Remember a key point of (modern) FP: that types are cheap—easy to create and use—so if it helps our programming to introduce a type, it’s a technique well worth considering.
Another thought is that the application interface, or the interaction that the user has with the application, is not just a random collection of stuff: there is some kind of structure there, for example indicating what is allowed and what isn’t, and we should look to exploit this structure rather than let it be implicit in the documentation or in the tests. (Notice: it’s another example of trying to use the language to lift the level of the programming, turning the implicit into explicit and so to reduce potential failure points.)
You might also consider the interaction as being carried out in some kind of “formal language” or structured game, where some sentences or moves are ok and others are invalid or highly inappropriate(!).
Once we’ve managed to encode the interaction in some concrete way, the next step is to write an “interpreter” for it that will turn the user’s requests into operations on the business concepts, that is, it will “run” the requests.
You can probably have a similar setup on the other side, namely a set of interactions that the system will need to have with the user. For example, reporting successful creation of a named list, or showing a list of results after some search query. Note also that several types will be needed to represent the interactions, such as representing certain modalities or contexts, like certain operations only being valid in certain states. You might consider it as a hierarchical structure of interactions, like the top level of the tree allowing the “big” operations like creating or selecting a list, with a structure under the latter indicating what can be done once the list has been selected.
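The pieces described so far (a hierarchy of user actions, a matching set of system replies, and an interpreter linking them to domain data) can be sketched together as follows. This is one possible arrangement under assumed names: WithList, ListAction, SystemReply, run and runOn are all invented here, and the "business app" is reduced to a plain association list of named lists.

```haskell
type ListId = Int

-- Stand-in for the business data: each list has a name and some items.
type Lists = [(ListId, (String, [String]))]

-- Top level: the "big" operations.
data UserAction
  = ViewAllLists
  | CreateList String
  | WithList ListId ListAction
  deriving (Show, Eq)

-- What can be done once a list has been selected.
data ListAction = ViewList | AddItem String | DeleteList
  deriving (Show, Eq)

-- The system's side of the conversation.
data SystemReply
  = AllLists [(ListId, String)]
  | ListCreated ListId String
  | ListContents String [String]
  | ListDeleted ListId
  | Rejected String
  deriving (Show, Eq)

-- The interpreter: "runs" a request against the current data.
run :: UserAction -> Lists -> (SystemReply, Lists)
run ViewAllLists ls = (AllLists [(i, n) | (i, (n, _)) <- ls], ls)
run (CreateList name) ls
  | null name = (Rejected "a list needs a name", ls)
  | otherwise = (ListCreated i name, ls ++ [(i, (name, []))])
  where i = 1 + maximum (0 : map fst ls)
run (WithList i act) ls =
  case lookup i ls of
    Nothing -> (Rejected "no such list", ls)
    Just l  -> runOn i l act ls

-- Actions that only make sense in the context of a selected list.
runOn :: ListId -> (String, [String]) -> ListAction -> Lists -> (SystemReply, Lists)
runOn _ (n, items) ViewList ls = (ListContents n items, ls)
runOn i (n, items) (AddItem x) ls = (ListContents n items', map swap ls)
  where
    items' = items ++ [x]
    swap e = if fst e == i then (i, (n, items')) else e
runOn i _ DeleteList ls = (ListDeleted i, filter ((/= i) . fst) ls)
```

Note that AddItem cannot even be written down without first naming a list via WithList: the context dependency lives in the shape of the type rather than in a run-time check. In a fuller version the interpreter would presumably be monadic and call out to storage, but the structure would be the same.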
To recap, since this might be a strange idea for some and it’s a key part in what I’m suggesting: we should attempt to capture what kind of interactions or operations the user can have with the application by representing them via various mechanisms in the language, such as new datatypes, and to link these representations to the underlying business logic via techniques like interpretation. The point is to encode structure where we can, with a view to exploiting it for clear code and to avoid failures.
Navigation and Routing
By navigation, I mean meaningful moves around the application’s interface, typically from a certain context to relevant or useful functionality. In our running example, this means moving from the display of a given list either back to viewing all lists, or on to editing the list, or even to deleting it. Effectively, this means identifying which interface actions are relevant or useful to the current action, and this is easily represented in code.
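A minimal sketch of that idea, with an assumed action type and an assumed function name nextActions, is just a function from the current action to the actions worth offering next:

```haskell
-- Hypothetical interface actions for the running example.
data UserAction
  = ViewAllLists
  | ViewList Int
  | EditList Int
  | DeleteList Int
  deriving (Show, Eq)

-- Which moves are relevant from where we are now?
nextActions :: UserAction -> [UserAction]
nextActions (ViewList n)   = [ViewAllLists, EditList n, DeleteList n]
nextActions (EditList n)   = [ViewList n]
nextActions (DeleteList _) = [ViewAllLists]
-- Links to individual lists depend on stored data, so none are listed here.
nextActions ViewAllLists   = []
```

A view could then render one link per element of nextActions, so the navigation offered on a page is computed from the interface description rather than written by hand in a template.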
This brings us nicely to the issue of routing. Traditional frameworks identify various formats of HTTP path and map them to calls of certain code, usually methods inside certain controllers. There are various potential points of failure here, such as the routes spec not matching up with the controllers, or mismatches between parameters extracted from the route versus those expected in the controller. It’s a bit boring having to test this if we can do better.
What do we actually need, though? Basically, we want to map a string from an HTTP request into an invocation of certain code. But where do these strings come from? Typically, they come from the routes spec being used in reverse, where some view or controller code references a high-level name for a certain piece of functionality. This does the job, mostly, of ensuring that the relevant functionality is called.
How about a different idea: it doesn’t really matter what string is generated, as long as it gets resolved to the right call in the end! That is, the path string just needs to contain enough information to ensure the relevant interface action is executed, and nothing else. (We could also encrypt whatever representation we use, if we want extra confidence.) What information do we need to include? One possibility is just some encoding of the actual interface operation the link represents, like some element of the UserAction type above.
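To show how little machinery this needs, here is a rough sketch in which a path is nothing more than an encoding of a UserAction value. The constructors, the function names toPath and fromPath, and the deliberately unglamorous encoding (character codes separated by dots) are all assumptions for illustration; a real system would want something more compact, and probably signed or encrypted.

```haskell
import Data.Char (chr, ord)
import Data.List (intercalate)
import Text.Read (readMaybe)

-- Hypothetical interface actions; Show and Read give us a free codec.
data UserAction
  = ViewAllLists
  | CreateList String
  | ViewList Int
  | DeleteList Int
  deriving (Show, Read, Eq)

-- Encode an action as a path made only of digits and dots.
toPath :: UserAction -> String
toPath a = '/' : intercalate "." (map (show . ord) (show a))

-- Decode a path; anything that is not a well-formed action is rejected.
fromPath :: String -> Maybe UserAction
fromPath ('/' : rest) = do
  codes <- mapM readMaybe (splitOn '.' rest)
  if all valid codes then readMaybe (map chr codes) else Nothing
  where valid n = n >= 0 && n <= 0x10FFFF
fromPath _ = Nothing

-- Split a string at every occurrence of the separator.
splitOn :: Char -> String -> [String]
splitOn c s = case break (== c) s of
  (chunk, [])       -> [chunk]
  (chunk, _ : more) -> chunk : splitOn c more
```

Links are generated with toPath and incoming requests are resolved with fromPath, so the two directions cannot drift apart: fromPath (toPath a) gives back Just a for any action a, and a path that decodes to nothing is simply not a route.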
We probably also need some context information too; for example, if working on a selected list then we need to know which list. Here’s a very useful FP concept that works well here: the Zipper. It’s an idea due to Gérard Huet that has been around for most of my lifetime. It helps with traversal and editing of functional data structures by representing a kind of cursor for where one is in a data structure, together with information for reconstructing the value after traversal. As a simple example, if one has traversed a binary tree from the root node by following left, right, right, then the zipper structure for the tree will contain the current subtree plus information for reconstructing the tree already traversed. The zipper concept can be applied to many types, as demonstrated by my great friend Dr Conor McBride, who characterized it as the operation of taking the derivative of a datatype (yes: the concept from calculus). See this early draft for a good introduction.
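For readers who haven’t met zippers before, here is a standard presentation for binary trees, as a self-contained sketch. The names Crumb, WentLeft, WentRight, modify and rebuild are choices made for this example rather than anything fixed by the technique.

```haskell
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Eq)

-- One step of context: which way we went, and what we left behind.
data Crumb a
  = WentLeft a (Tree a)   -- parent's value and the right sibling we skipped
  | WentRight (Tree a) a  -- the left sibling we skipped and parent's value
  deriving (Show, Eq)

-- The focused subtree plus the trail back to the root.
type Zipper a = (Tree a, [Crumb a])

left, right, up :: Zipper a -> Maybe (Zipper a)
left (Node l x r, cs) = Just (l, WentLeft x r : cs)
left _                = Nothing
right (Node l x r, cs) = Just (r, WentRight l x : cs)
right _                = Nothing
up (t, WentLeft x r : cs)  = Just (Node t x r, cs)
up (t, WentRight l x : cs) = Just (Node l x t, cs)
up (_, [])                 = Nothing

-- Edit the subtree under the cursor.
modify :: (Tree a -> Tree a) -> Zipper a -> Zipper a
modify f (t, cs) = (f t, cs)

-- Walk back up to the root, reassembling the whole tree.
rebuild :: Zipper a -> Tree a
rebuild z = maybe (fst z) rebuild (up z)
```

Starting from (tree, []), a journey such as left, then right, then right is a chain in the Maybe monad, for example left (tree, []) >>= right >>= right, and rebuild recovers the full tree from wherever the cursor ends up. For navigation in a web app, the list of crumbs plays the role of "how we got here", which is the context a link needs to carry.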
So, the suggestion is for each “path” in the system to be some encoding of the corresponding “state” of the interface. No separate routes file required!
Controllers and Views
If we can (it seems) dispense with routes, then what happens with controllers? Mostly, they will be subsumed into the interpreters for the interface actions, and so play a lesser role than in current frameworks. I would hope that much of the code seen in controllers—particularly the boilerplate code for decoding parameters, updating models and checking the results of saves—will mostly be abstracted away, possibly replaced by calls to suitably overloaded library code.
We still need views, though. These will be the main way of rendering domain concepts into HTML or related formats, though their use might change. I would like to see most of the rendering work done with visibly pure functions (obviously no side effects, so 100% cosmetic), preferably called through a single overloaded function. That is, all viewable things should implement the member functions of an interface like the following for standard rendering of their data, analogously to the Show class for rendering to strings.
(The Html type will be provided by one of Haskell’s several libraries for HTML, such as Blaze. See last month’s article for some examples of the cool features of Blaze.)
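A sketch of such an interface follows. To keep it free of library dependencies, Html is defined here as a plain String, which is only a stand-in for a real type like the one Blaze provides. The class name Htmlable is the one used later in this article; the method name toHtml, the escape helper, and the two example instances are assumptions.

```haskell
-- Stand-in only: a real app would use an HTML library's type here.
type Html = String

-- The single overloaded rendering function, by analogy with Show.
class Htmlable a where
  toHtml :: a -> Html

-- Minimal escaping so that user data cannot inject markup.
escape :: String -> Html
escape = concatMap esc
  where
    esc '<' = "&lt;"
    esc '>' = "&gt;"
    esc '&' = "&amp;"
    esc c   = [c]

-- Two domain types from the running example.
newtype ListName = ListName String
data Item = Item { done :: Bool, label :: String }

instance Htmlable ListName where
  toHtml (ListName n) = "<h2>" ++ escape n ++ "</h2>"

instance Htmlable Item where
  toHtml (Item d l) =
    "<li>" ++ (if d then "[x] " else "[ ] ") ++ escape l ++ "</li>"
```

Both instances are ordinary pure functions, so they can be checked in GHCi without a server in sight: toHtml (ListName "Fish & chips") yields "<h2>Fish &amp; chips</h2>".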
Sometimes more control is needed, for example rendering of some thing might be needed in two different contexts, with different rules. There are several options, including a subclass that takes additional context information, like this:
or we could wrap up our basic values into more complex types and get finer control via the more complex type’s instance definition, for (a contrived) example, to prefix the rendering of a value wrapped with strong tags with a number of stars, where we would call
to get 3 stars in front of
Similarly, we can control rendering of collections by wrapping them in a different type, where the type’s instance applies the relevant rendering of the container.
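The options just described might be sketched as follows. This is one reading of them, not a reconstruction of any particular library: Html is again a String stand-in, and the names Context, ContextHtmlable, Plain, Starred and Bulleted are invented for the example. In particular, Starred takes the star example to mean "n stars, then the value in strong tags".

```haskell
-- Stand-in only: a real app would use an HTML library's type here.
type Html = String

class Htmlable a where
  toHtml :: a -> Html

-- Option 1: a subclass whose method takes extra context.
data Context = Summary | Detail

class Htmlable a => ContextHtmlable a where
  toHtmlIn :: Context -> a -> Html

newtype Plain = Plain String

instance Htmlable Plain where
  toHtml (Plain s) = s

instance ContextHtmlable Plain where
  toHtmlIn Summary (Plain s) = take 10 s
  toHtmlIn Detail  p         = toHtml p

-- Option 2: a wrapper type whose instance adds the decoration.
data Starred a = Starred Int a

instance Htmlable a => Htmlable (Starred a) where
  toHtml (Starred n x) =
    replicate n '*' ++ "<strong>" ++ toHtml x ++ "</strong>"

-- The same trick for collections: the wrapper picks the container style.
newtype Bulleted a = Bulleted [a]

instance Htmlable a => Htmlable (Bulleted a) where
  toHtml (Bulleted xs) = "<ul>" ++ concatMap item xs ++ "</ul>"
    where item x = "<li>" ++ toHtml x ++ "</li>"
```

With these definitions, toHtml (Starred 3 (Plain "urgent")) gives "***<strong>urgent</strong>", and wrappers compose, so toHtml (Bulleted [Starred 1 (Plain "a")]) renders a starred item inside a bulleted list with no extra code.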
Notice: we’re using types again (as wrappers) to represent the concepts we want, rather than passing around some ad hoc flags, and this can help to ensure consistency and avoid silly mistakes. As a final example, we can use a similar mechanism to define table formats, perhaps storing a list of values together with a list of (column title, extraction function) pairs that control how the columns are rendered. The extraction functions could themselves return Htmlable values so that relevant rendering of links or plain data can be handled.
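The table idea can be sketched in a few lines. As before, Html is a String stand-in, and the names Table, tag and renderTable are assumptions. For simplicity the extraction functions here return Html directly; having them return arbitrary Htmlable values, as suggested above, would need a little more type machinery. No escaping is done in this sketch.

```haskell
-- Stand-in only: a real app would use an HTML library's type here.
type Html = String

-- Column descriptions (title, extraction function) plus the rows themselves.
data Table row = Table [(String, row -> Html)] [row]

-- Wrap some content in an element with the given name.
tag :: String -> Html -> Html
tag t inner = "<" ++ t ++ ">" ++ inner ++ "</" ++ t ++ ">"

-- One header row from the titles, then one row per value.
renderTable :: Table row -> Html
renderTable (Table cols rows) = tag "table" (header ++ concatMap line rows)
  where
    header = tag "tr" (concatMap (tag "th" . fst) cols)
    line r = tag "tr" (concatMap (\(_, get) -> tag "td" (get r)) cols)
```

A table of (name, item count) pairs could then be described as Table [("Name", fst), ("Items", show . snd)] rows, keeping the choice of columns in one value that can be passed around, reused, or varied per context.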
Forms and Updates
This is one of the parts of Rails that make me nervous, particularly when a form should only update part of an object. All that hassle with protected fields, nested attributes, etc.
In light of the previous discussion, my suggestion is to capture the relevant user story as part of the app’s interface description (that is, as a type), and generate the form and the handling code from this. So in the context of viewing a certain list, if (say) we wanted to edit its title and date, then the interface description would contain this as a possible step for a given list:
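As a rough illustration of such a step, the sketch below describes "edit these fields of the current list" as a value and derives both the form and the update from it. All names (TodoList and its fields, Field, EditFields, Outcome, key, interpret) are assumptions, Html is a String stand-in, and escaping of field values is omitted for brevity.

```haskell
-- Stand-in only: a real app would use an HTML library's type here.
type Html = String

data TodoList = TodoList
  { title   :: String
  , dueDate :: String
  , owner   :: String
  } deriving (Show, Eq)

-- The fields a user story may touch; owner is deliberately absent.
data Field = Title | DueDate
  deriving (Show, Eq)

-- The step in the interface description.
newtype ListStep = EditFields [Field]

data Outcome = ShowForm Html | Updated TodoList
  deriving (Show, Eq)

-- Opaque input names: only the generator and the receiver need to agree.
key :: Field -> String
key Title   = "f0"
key DueDate = "f1"

current :: Field -> TodoList -> String
current Title   = title
current DueDate = dueDate

set :: Field -> String -> TodoList -> TodoList
set Title   v l = l { title = v }
set DueDate v l = l { dueDate = v }

-- No submission yet: yield a form. With a submission: apply the update.
interpret :: ListStep -> Maybe [(String, String)] -> TodoList -> Outcome
interpret (EditFields fs) Nothing l = ShowForm (concatMap input fs)
  where
    input f = "<input name=\"" ++ key f ++ "\" value=\"" ++ current f l ++ "\">"
interpret (EditFields fs) (Just submitted) l = Updated (foldr apply l fs)
  where
    apply f acc = maybe acc (\v -> set f v acc) (lookup (key f) submitted)
```

Because the update is driven by the step and not by whatever the browser sends, a submission containing extra parameters (say, an "owner" value, or "f1" when only Title was offered) changes nothing beyond the named fields. That is the protected-fields problem handled by construction.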
And the interpreter for this would (depending on context) either yield a form for changing those fields, or perform the relevant update. As with routes, it may be possible to avoid the usual string encoding of fields of nested attributes and use some representation that only the form generator and results receiver need know about. (To recap: the key requirement is that the coding used by the form achieves the desired effect when the form submission is processed. The rest, like human readability, is optional, so we could use quite interesting coding schemes.)
Forms for nested data could be handled by including structured data in the interface representation and adjusting the interpreter, etc.; say the following, which could allow the collection of items attached to the current to-do list to be edited, for example only keeping ones that are ticked.
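One possible shape for this, again as a sketch with invented names (Item, ItemsStep, KeepTicked, itemsForm, applyTicks) and a String stand-in for Html, is a step that carries the items on offer, from which both the checkboxes and the handling of the ticks are derived:

```haskell
-- Stand-in only: a real app would use an HTML library's type here.
type Html = String

data Item = Item { itemId :: Int, label :: String }
  deriving (Show, Eq)

-- The step carries the structured data it is about.
newtype ItemsStep = KeepTicked [Item]

-- One checkbox per item on offer.
itemsForm :: ItemsStep -> Html
itemsForm (KeepTicked items) = concatMap box items
  where
    box (Item i l) =
      "<label><input type=\"checkbox\" name=\"keep\" value=\""
        ++ show i ++ "\">" ++ l ++ "</label>"

-- Keep only the offered items whose ids came back ticked.
applyTicks :: ItemsStep -> [Int] -> [Item]
applyTicks (KeepTicked items) ticked =
  filter ((`elem` ticked) . itemId) items
```

Since applyTicks filters the items held in the step, ticked ids that were never offered are ignored: applyTicks (KeepTicked [Item 1 "milk", Item 2 "eggs"]) [2, 99] keeps just the eggs.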
In summary, the interface description should explain precisely what should be changing for a given step, from which certain code and HTML can be generated exactly. Hopefully, this removes a few more potential failure points and reduces reliance on tests.
The above may have been a bit of a ramble, but hopefully you saw one or two interesting things along the way. To be honest, I wasn’t quite sure when I started where I would end up, but am pretty pleased with the result and it seems to be worth further thought and experimentation.
The key idea here is to capture the possible user actions more clearly as code, in particular here as (hierarchical) types whose elements indicate what users are allowed to do and what kind of things the system will return. In Rails apps, this information is almost always buried under other details, and has to be teased out with a range of tests (with varying degrees of success). I expect other mainstream frameworks have the same issues. So my suggestion is to make these key details more visible and to get the details to work for the programmer, for example helping them to construct the code that provides the functionality. A side-benefit is to identify Huet’s zipper concept as a powerful way to represent a user’s journey through the structure of an application’s interface.
There’s also a corollary to the story, with several hints as to where and how further use of appropriate data types can help to clean up and simplify an app. A key point of these articles is about using data types more fully, and about how modern functional languages make it easier to create and use new datatypes in the service of simpler programs. So the final word is, instead of going for more complex processes, try to tease out the structure of what you are doing and see if any parts of it can be represented in a simpler form as data. It might just work.
Dr Paul Callaghan finds map reading and motorbike riding a little difficult to combine. Instead of frequent stops, he likes to carry on riding, randomly taking interesting-looking roads just to see where he ends up. It’s a good thing to try intellectually as well, and warmly recommended: you never know what you might find. Bits of his bio can be seen on earlier articles. Paul also flies big traction kites and can often be seen being dragged around inelegantly on the beaches of North-east England, much to the amusement of his kids. He blogs at free-variable.org and tweets as @paulcc_two.