Extensible, Type-Safe Error Handling In Haskell

A colleague of mine introduced me to the benefits of error-chain, a crate which aims to implement “consistent error handling” for Rust. I found the overall design pretty convincing, and in his use case, the crate really makes error handling clearer and more flexible. I knew pijul uses error-chain too, but I never had the occasion to dig into it.

At the same time, I have read quite a lot about extensible effects in functional programming, for an academic article I have submitted to Formal Methods 2018 [1]. In particular, the freer package provides a very nice API to define monadic functions which may use well-identified effects. For instance, we can imagine that Console identifies the functions which may print to the standard output and read from the standard input. A function askPassword, which displays a prompt and gets the user password, would then list Console in its type signature.
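To make the idea concrete without depending on freer, here is a minimal sketch which swaps extensible effects for a plain type class. This is a different technique from the one freer uses, and every name in it (the Console class, putLine, readLine, the Script interpreter) is hypothetical. It only illustrates the property that matters here: the signature of askPassword lists everything the function is allowed to do.

```haskell
-- Hypothetical stand-in for a Console effect (this is NOT freer's API):
-- the class lists the only operations a Console-aware function may use.
class Monad m => Console m where
  putLine  :: String -> m ()
  readLine :: m String

-- The signature advertises Console and nothing else, so askPassword
-- cannot read a file or open a socket behind our back.
askPassword :: Console m => m String
askPassword = do
  putLine "Password: "
  readLine

-- Interpreter for real programs.
instance Console IO where
  putLine  = putStrLn
  readLine = getLine

-- Pure interpreter: takes scripted input lines and returns the result,
-- the unread input and the lines that were printed.
newtype Script a = Script { runScript :: [String] -> (a, [String], [String]) }

instance Functor Script where
  fmap f (Script g) = Script $ \input ->
    let (a, rest, out) = g input in (f a, rest, out)

instance Applicative Script where
  pure a = Script $ \input -> (a, input, [])
  Script f <*> Script g = Script $ \input ->
    let (h, rest,  out)  = f input
        (a, rest', out') = g rest
    in (h a, rest', out ++ out')

instance Monad Script where
  Script g >>= k = Script $ \input ->
    let (a, rest,  out)  = g input
        (b, rest', out') = runScript (k a) rest
    in (b, rest', out ++ out')

instance Console Script where
  putLine s = Script $ \input -> ((), input, [s])
  readLine  = Script $ \input -> case input of
    (l : ls) -> (l, ls, [])
    []       -> ("", [], [])
```

Because askPassword is polymorphic in the interpreter, the same definition runs in IO and in the pure Script monad, which is handy for tests.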

Compared to IO, Eff allows for meaningful type signatures. It becomes easier to reason about function composition, and you know that a function which lacks a given effect in its type signature will not be able to use it. As a predictable drawback, Eff can become burdensome to use.

Basically, when my colleague showed me his Rust project and how he was using error-chain, the question popped up: can we use an approach similar to Eff to implement a Haskell-flavoured error-chain?

Spoiler alert: the answer is yes. In this post, I will dive into the resulting API, leaving for another time the details of the underlying implementation. Believe me, there is plenty to say. If you want to have a look already, the current implementation can be found on GitHub.

In this article, I will use several “advanced” GHC pragmas. I will not explain each of them, but I will try to give some pointers for the reader who wants to learn more.

State of the Art

This is not an academic publication, and my goal was primarily to explore the arcana of the Haskell type system, so I have skipped a proper study of the state of the art. That being said, I have written programs in Rust and Haskell before.

Starting Point

In Rust, Result<T, E> is the counterpart of Either E T in Haskell [2]. You can use it to wrap either the result of a function (T) or an error encountered during this computation (E). Both Either and Result are used to achieve the same end, that is, writing functions which might fail.

On the one hand, Either E is a monad. It works exactly like Maybe (returning an error short-circuits the rest of the function), but gives you the ability to specify why the function has failed. To deal with effects, the mtl package provides ExceptT (older code often uses the equivalent EitherT from the either package), a transformer version of Either to be used in a monad stack.
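As a small illustration of this short-circuiting behaviour, here is a sketch that only relies on base. The ParseError type and the functions parseAge and sumAges are made up for the example.

```haskell
import Text.Read (readMaybe)

-- Hypothetical error type: Either lets us say why something failed.
data ParseError
  = NotANumber String
  | Negative Int
  deriving (Eq, Show)

parseAge :: String -> Either ParseError Int
parseAge s = case readMaybe s of
  Nothing -> Left (NotANumber s)
  Just n
    | n < 0     -> Left (Negative n)
    | otherwise -> Right n

-- The first Left aborts the whole do block; the remaining lines are skipped.
sumAges :: String -> String -> Either ParseError Int
sumAges a b = do
  x <- parseAge a
  y <- parseAge b
  pure (x + y)
```

Both calls to parseAge share the same error type, which is why they compose without any extra work.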

On the other hand, the Rust language provides the ? syntactic sugar to achieve the same thing. That is, both languages provide you the means to write potentially failing functions without the need to care locally about failure. If your function B uses a function A which might fail, and you want B to fail whenever A fails, doing so becomes trivial.

Out of the box, neither EitherT nor Result is extensible. The functions must use the exact same E, or errors must be converted manually.
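The manual conversion looks like this with plain Either. The sketch below assumes two unrelated, made-up error types (LookupError and ParseError) and a hand-written wrapper (AppError) that has to be extended every time a new error type shows up.

```haskell
import Data.Bifunctor (first)
import Text.Read (readMaybe)

-- Two unrelated (hypothetical) error types...
newtype LookupError = UnknownKey String deriving (Eq, Show)
newtype ParseError  = NotANumber String deriving (Eq, Show)

-- ...and the wrapper we must maintain by hand to use them together.
data AppError
  = AppLookup LookupError
  | AppParse ParseError
  deriving (Eq, Show)

lookupSetting :: String -> [(String, String)] -> Either LookupError String
lookupSetting key env = maybe (Left (UnknownKey key)) Right (lookup key env)

parsePort :: String -> Either ParseError Int
parsePort s = maybe (Left (NotANumber s)) Right (readMaybe s)

-- Each call has to be converted explicitly with first before it fits
-- into the do block, because Either E fixes a single E.
configuredPort :: [(String, String)] -> Either AppError Int
configuredPort env = do
  raw <- first AppLookup (lookupSetting "port" env)
  first AppParse (parsePort raw)
```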

Handling Errors in Rust

Rust and the error-chain crate provide several means to overcome this limitation. In particular, Rust has the Into and From traits to ease the conversion from one error to another. Among other things, the error-chain crate provides a macro to easily define a wrapper around many error types, basically your own and the ones defined by the crates you are using.

I see several drawbacks to this approach. First, it is extensible only if you take the time to modify the wrapper type each time you want to consider a new error type. Second, you can either use one error type or every error type, with nothing in between.

However, the error-chain crate provides a way to solve a very annoying limitation of Result and Either. When you “catch” an error, after a given function returns its result, it can be hard to determine where the error is coming from. Imagine you are parsing a very complicated source file, and the error you get is SyntaxError with no additional context. How would you feel?

error-chain solves this by providing an API to construct a chain of errors, rather than a single value.

The chain_err function makes it easier to put a given error back in its context, which makes it possible to write more meaningful error messages, for instance.

The ResultT Monad

ResultT is an attempt to bring together the extensible power of Eff and the chaining of errors of chain_err. I will admit that, for the latter, the current implementation of ResultT is probably less powerful, but to be honest I mostly cared about the “extensible” thing, so it is not very surprising.

This monad is an alternative neither to monad stacks à la mtl nor to the Eff monad. In its current state, it aims to be a more powerful and flexible version of EitherT.

Parameters

As often in Haskell, the ResultT monad can be parameterised in several ways.

  • msg is the type of messages you can stack to provide more context to error handling
  • err is a row of errors [3], it basically describes the set of errors you will eventually have to handle
  • m is the underlying monad stack of your application, knowing that ResultT is not intended to be stacked itself
  • a is the expected type of the computation result
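To fix ideas, here is one way such a type could be declared. This is a simplified sketch and not the actual implementation (which this post deliberately leaves aside): the OneOf open sum, its constructors Here and There, and the unResultT field are assumptions made for illustration.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}
import Data.Functor.Identity (Identity (..))
import Data.Kind (Type)

-- Hypothetical open sum: a value of exactly one of the types of the row.
data OneOf (err :: [Type]) where
  Here  :: e -> OneOf (e ': err)
  There :: OneOf err -> OneOf (e ': err)

-- msg: context messages, err: row of errors, m: underlying monad,
-- a: result. The function argument is the stack of messages so far.
newtype ResultT msg (err :: [Type]) (m :: Type -> Type) a = ResultT
  { unResultT :: [msg] -> m (Either ([msg], OneOf err) a) }

-- A computation that succeeds; its row says it could fail with Bool or Int.
succeeding :: ResultT String '[Bool, Int] Identity Char
succeeding = ResultT (\_ -> Identity (Right 'x'))

-- A computation that fails with the Int error, after adding some context.
failing :: ResultT String '[Bool, Int] Identity Char
failing = ResultT (\ctx -> Identity (Left ("computing" : ctx, There (Here 42))))
```

Writing Here and There by hand is tedious, which is exactly what a Contains-like typeclass is meant to hide.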

achieve and abort

The two main monadic operations which come with ResultT are achieve and abort. The former allows for building the context, by stacking so-called messages which describe what you want to do. The latter allows for bailing on a computation and explaining why.

achieve should be used for do blocks. You can use <?> to attach a contextual message to a given computation.

The type signature of abort is also interesting, because it introduces the Contains typeclass (i.e., the equivalent of Member for Eff).

This reads as follows: “you can abort with an error of type e if and only if the row of errors err contains the type e.”
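The sketch below shows how achieve, <?> and abort could fit together. It is built on a simplified ResultT of my own, so the internals (OneOf, inject, unResultT) and the exact signatures are assumptions rather than the real API; DivByZero, safeDiv, average and runAverage are made-up examples.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators,
             MultiParamTypeClasses, FlexibleInstances, FlexibleContexts #-}
import Data.Functor.Identity (Identity (..))
import Data.Kind (Type)

-- Simplified stand-in for the library internals (an assumption).
data OneOf (err :: [Type]) where
  Here  :: e -> OneOf (e ': err)
  There :: OneOf err -> OneOf (e ': err)

-- "The row err contains e": e can be injected into the open sum.
class Contains (err :: [Type]) e where
  inject :: e -> OneOf err
instance {-# OVERLAPPING #-} Contains (e ': err) e where
  inject = Here
instance Contains err e => Contains (e' ': err) e where
  inject = There . inject

newtype ResultT msg (err :: [Type]) m a = ResultT
  { unResultT :: [msg] -> m (Either ([msg], OneOf err) a) }

instance Functor m => Functor (ResultT msg err m) where
  fmap f (ResultT g) = ResultT (fmap (fmap f) . g)
instance Monad m => Applicative (ResultT msg err m) where
  pure a = ResultT (\_ -> pure (Right a))
  mf <*> ma = mf >>= \f -> fmap f ma
instance Monad m => Monad (ResultT msg err m) where
  ResultT g >>= k = ResultT (\ctx ->
    g ctx >>= either (pure . Left) (\a -> unResultT (k a) ctx))

-- Bail out with an error of type e, keeping the messages stacked so far.
abort :: (Contains err e, Applicative m) => e -> ResultT msg err m a
abort e = ResultT (\ctx -> pure (Left (ctx, inject e)))

-- Push a message describing what the wrapped computation tries to do.
achieve :: msg -> ResultT msg err m a -> ResultT msg err m a
achieve msg (ResultT g) = ResultT (\ctx -> g (msg : ctx))

-- Same as achieve, with the message after the computation.
(<?>) :: ResultT msg err m a -> msg -> ResultT msg err m a
action <?> msg = achieve msg action

data DivByZero = DivByZero deriving (Eq, Show)

safeDiv :: (Contains err DivByZero, Monad m) => Int -> Int -> ResultT String err m Int
safeDiv _ 0 = abort DivByZero
safeDiv x y = pure (x `div` y)

average :: (Contains err DivByZero, Monad m) => [Int] -> ResultT String err m Int
average xs = achieve "computing an average" $ do
  let total = sum xs
  safeDiv total (length xs) <?> "dividing by the number of elements"

-- Pick a concrete row and monad, and keep only the messages on failure.
runAverage :: [Int] -> Either [String] Int
runAverage xs =
  case runIdentity (unResultT (average xs :: ResultT String '[DivByZero] Identity Int) []) of
    Left (msgs, _) -> Left msgs
    Right n        -> Right n
```

In this sketch the innermost message comes first in the stack, so a failure reads from the most specific context to the most general one.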

For instance, imagine we have an error type FileError to describe filesystem-related errors. Then, we can imagine a function which reads the content of a file and may abort with a FileError.

We could leverage this function in a given project, for instance to read its configuration files (for the sake of the example, it has several configuration files). The function which reads the configuration can use its own error type to describe ill-formed configurations (ConfigurationError).

To avoid repeating Contains when the row of errors needs to contain several elements, we introduce :< [4] (read subset or equal).

You might see, now, why I say ResultT is extensible. You can use two functions with totally unrelated errors, as long as the caller advertises that with Contains or :<.
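Here is a sketch of that scenario. As before, it relies on a simplified ResultT, so the internals are assumptions; the :< family is one possible encoding, not necessarily the library's. To keep the example pure and testable, an in-memory association list (the hypothetical Files type) stands in for the real filesystem, and readFileFrom, readConfig and load are made-up names.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators, TypeFamilies,
             ConstraintKinds, MultiParamTypeClasses, FlexibleInstances,
             FlexibleContexts, UndecidableInstances #-}
import Data.Functor.Identity (Identity (..))
import Data.Kind (Constraint, Type)

-- Simplified stand-in for the library internals (an assumption).
data OneOf (err :: [Type]) where
  Here  :: e -> OneOf (e ': err)
  There :: OneOf err -> OneOf (e ': err)

class Contains (err :: [Type]) e where
  inject :: e -> OneOf err
instance {-# OVERLAPPING #-} Contains (e ': err) e where
  inject = Here
instance Contains err e => Contains (e' ': err) e where
  inject = There . inject

-- "sub :< err": every type of sub is contained in err.
type family (sub :: [Type]) :< (err :: [Type]) :: Constraint where
  '[]        :< err = ()
  (e ': sub) :< err = (Contains err e, sub :< err)

newtype ResultT msg (err :: [Type]) m a = ResultT
  { unResultT :: [msg] -> m (Either ([msg], OneOf err) a) }

instance Functor m => Functor (ResultT msg err m) where
  fmap f (ResultT g) = ResultT (fmap (fmap f) . g)
instance Monad m => Applicative (ResultT msg err m) where
  pure a = ResultT (\_ -> pure (Right a))
  mf <*> ma = mf >>= \f -> fmap f ma
instance Monad m => Monad (ResultT msg err m) where
  ResultT g >>= k = ResultT (\ctx ->
    g ctx >>= either (pure . Left) (\a -> unResultT (k a) ctx))

abort :: (Contains err e, Applicative m) => e -> ResultT msg err m a
abort e = ResultT (\ctx -> pure (Left (ctx, inject e)))

newtype FileError = FileNotFound FilePath deriving (Eq, Show)
data ConfigurationError = EmptyConfiguration deriving (Eq, Show)

-- In-memory stand-in for the filesystem (hypothetical).
type Files = [(FilePath, String)]

readFileFrom :: (Contains err FileError, Monad m)
             => Files -> FilePath -> ResultT String err m String
readFileFrom fs path = maybe (abort (FileNotFound path)) pure (lookup path fs)

-- Two unrelated error types, advertised at once with :<.
readConfig :: ('[FileError, ConfigurationError] :< err, Monad m)
           => Files -> ResultT String err m [String]
readConfig fs = do
  base <- readFileFrom fs "base.conf"
  user <- readFileFrom fs "user.conf"
  let merged = lines base ++ lines user
  if null merged then abort EmptyConfiguration else pure merged

-- Pick a concrete row and inspect which error was raised.
load :: Files -> Either String [String]
load fs =
  case runIdentity (unResultT (readConfig fs :: ResultT String '[FileError, ConfigurationError] Identity [String]) []) of
    Right ls                                  -> Right ls
    Left (_, Here (FileNotFound p))           -> Left ("missing file: " ++ p)
    Left (_, There (Here EmptyConfiguration)) -> Left "empty configuration"
    Left _                                    -> Left "unknown error"
```

Note that readFileFrom knows nothing about ConfigurationError; only the caller's signature mentions both types.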

Recovering by Handling Errors

Monads are traps: you can only escape them by playing by their rules. ResultT comes with runResultT.

This might be surprising: we can only escape from ResultT if we do not use any errors at all. In fact, ResultT forces us to handle errors before calling runResultT.

ResultT provides several functions prefixed by recover. Their type signatures can be a little confusing, so we will dive into the simplest one, recover.

recover removes an error type from the row of errors. To do that, it requires an error handler, which decides what to do with the error raised during the computation and with the stack of messages at that time. Using recover, a function may use more errors than advertised in its type signature, but we know by construction that in such a case it handles these errors, so this is transparent for the caller. The type of the handler is e -> [msg] -> ResultT msg err m a, which means the handler can raise errors if required.

The siblings of recover build on the same idea:

  • recoverWhile msg is basically a synonym for achieve msg $ recover
  • recoverMany does the same with a row of errors, by providing as many functions as required
  • recoverManyWith simplifies recoverMany: you can provide only one function tied to a given typeclass, on the condition that the handled errors implement this typeclass

Using recover and its siblings often requires helping the Haskell type system a bit, especially if we use lambdas to define the error handlers. This is usually achieved with the Proxy a datatype (where a is a phantom type). I would rather use the TypeApplications [5] pragma.

The DescriptiveError typeclass can be seen as a dedicated Show, to give a textual representation of errors. It is inspired by the macros of error_chain.

We can start from an empty row of errors, and allow ourselves to use more errors thanks to the recover* functions.
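The following sketch puts recover, runResultT and TypeApplications together. It again uses a simplified ResultT, so the internals and the exact signatures are assumptions; in particular, the explicit forall on recover is my own choice, made so that the error type is the first type argument. DivByZero, safeDiv, divOrZero and report are made-up examples.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators, EmptyCase,
             MultiParamTypeClasses, FlexibleInstances, FlexibleContexts,
             ScopedTypeVariables, TypeApplications #-}
import Data.Functor.Identity (Identity (..))
import Data.Kind (Type)

-- Simplified stand-in for the library internals (an assumption).
data OneOf (err :: [Type]) where
  Here  :: e -> OneOf (e ': err)
  There :: OneOf err -> OneOf (e ': err)

class Contains (err :: [Type]) e where
  inject :: e -> OneOf err
instance {-# OVERLAPPING #-} Contains (e ': err) e where
  inject = Here
instance Contains err e => Contains (e' ': err) e where
  inject = There . inject

newtype ResultT msg (err :: [Type]) m a = ResultT
  { unResultT :: [msg] -> m (Either ([msg], OneOf err) a) }

instance Functor m => Functor (ResultT msg err m) where
  fmap f (ResultT g) = ResultT (fmap (fmap f) . g)
instance Monad m => Applicative (ResultT msg err m) where
  pure a = ResultT (\_ -> pure (Right a))
  mf <*> ma = mf >>= \f -> fmap f ma
instance Monad m => Monad (ResultT msg err m) where
  ResultT g >>= k = ResultT (\ctx ->
    g ctx >>= either (pure . Left) (\a -> unResultT (k a) ctx))

abort :: (Contains err e, Applicative m) => e -> ResultT msg err m a
abort e = ResultT (\ctx -> pure (Left (ctx, inject e)))

achieve :: msg -> ResultT msg err m a -> ResultT msg err m a
achieve msg (ResultT g) = ResultT (\ctx -> g (msg : ctx))

-- Remove e from the head of the row by handling it. The explicit forall
-- puts e first, so call sites can write recover @SomeError.
recover :: forall e m msg err a. Monad m
        => ResultT msg (e ': err) m a
        -> (e -> [msg] -> ResultT msg err m a)
        -> ResultT msg err m a
recover (ResultT g) handler = ResultT (\ctx -> do
  r <- g ctx
  case r of
    Right a              -> pure (Right a)
    Left (msgs, Here e)  -> unResultT (handler e msgs) ctx
    Left (msgs, There u) -> pure (Left (msgs, u)))

-- An empty row has no inhabitant, so there is no error left to handle.
impossible :: OneOf '[] -> a
impossible u = case u of {}

runResultT :: Functor m => ResultT msg '[] m a -> m a
runResultT (ResultT g) = fmap (either (impossible . snd) id) (g [])

data DivByZero = DivByZero

safeDiv :: (Contains err DivByZero, Monad m) => Int -> Int -> ResultT String err m Int
safeDiv _ 0 = achieve "dividing" (abort DivByZero)
safeDiv x y = pure (x `div` y)

-- The row is empty: DivByZero is used internally but never leaks.
-- The lambda ignores its argument, so @DivByZero is needed to pick e.
divOrZero :: Monad m => Int -> Int -> ResultT String '[] m Int
divOrZero x y = recover @DivByZero (safeDiv x y) (\_ _ -> pure 0)

-- The handler also receives the messages stacked when the error occurred.
report :: Int -> Int -> String
report x y = runIdentity (runResultT (recover @DivByZero (show <$> safeDiv x y) handler))
  where handler _ msgs = pure ("failed while " ++ unwords msgs)
```

Without the type application, GHC has no way to guess which error the lambda in divOrZero is supposed to handle, since nothing in its body mentions DivByZero.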

cat in Haskell using ResultT

ResultT only cares about error handling. The rest of the work is up to the underlying monad m. That being said, nothing prevents us from providing a fine-grained API for, e.g., filesystem-related functions. From an error handling perspective, the functions provided by Prelude (the standard library of Haskell) are pretty poor, and the documentation is not really precise regarding the kind of errors we can encounter while using them.

In this section, I will show you how we can leverage ResultT to (i) define an error-centric API for basic file management functions and (ii) use this API to implement a cat-like program which reads a file and prints its content to the standard output.

(A Lot Of) Error Types

We could have one sum type to describe in the same place all the errors we can encounter, and later use pattern matching to determine which one has been raised. The thing is, this is already the job of the row of errors of ResultT. Besides, with a single sum type, a function which only opens a file could claim to raise an error about failing to write into a file.

Because ResultT is intended to be extensible, we should rather define several types, so we can have a fine-grained row of errors. Of course, too many types will become burdensome, so this is yet another time where we need to find the right balance.

To be honest, this is a bit too much for real life, but we are in a blog post here, so we should embrace the potential of ResultT.

Filesystem API

By reading the System.IO documentation, we can infer what our functions' type signatures should look like. I will not discuss their actual implementation in this article, as this requires me to explain how IO deals with errors itself (and this article is already long enough for my taste). You can have a look at this gist if you are interested.

Implementing cat

We can use the ResultT monad, its monadic operations and our functions to deal with the file system in order to implement a cat-like program. I tried to comment on the implementation to make it easier to follow.

The type signature of cat tells us that this function handles any error it might encounter. This means we can use it anywhere we want… in another computation inside ResultT which might raise errors completely unrelated to the file system, for instance. Or! We can use it with runResultT, escaping the ResultT monad (only to fall into the IO monad, but this is another story).
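To give an idea of the overall shape, here is a much reduced cat written against a simplified ResultT. Everything in it is an assumption made for illustration: the internals, the liftR helper, the two error types (DoesNotExist and AccessDenied), the describe method of DescriptiveError, and the readFileR and render functions. The only real API it relies on is System.IO.Error from base.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators, EmptyCase,
             MultiParamTypeClasses, FlexibleInstances, FlexibleContexts #-}
import Data.Kind (Type)
import System.IO.Error (isDoesNotExistError, isPermissionError, tryIOError)

-- Simplified stand-in for the library internals (an assumption).
data OneOf (err :: [Type]) where
  Here  :: e -> OneOf (e ': err)
  There :: OneOf err -> OneOf (e ': err)

class Contains (err :: [Type]) e where
  inject :: e -> OneOf err
instance {-# OVERLAPPING #-} Contains (e ': err) e where
  inject = Here
instance Contains err e => Contains (e' ': err) e where
  inject = There . inject

newtype ResultT msg (err :: [Type]) m a = ResultT
  { unResultT :: [msg] -> m (Either ([msg], OneOf err) a) }

instance Functor m => Functor (ResultT msg err m) where
  fmap f (ResultT g) = ResultT (fmap (fmap f) . g)
instance Monad m => Applicative (ResultT msg err m) where
  pure a = ResultT (\_ -> pure (Right a))
  mf <*> ma = mf >>= \f -> fmap f ma
instance Monad m => Monad (ResultT msg err m) where
  ResultT g >>= k = ResultT (\ctx ->
    g ctx >>= either (pure . Left) (\a -> unResultT (k a) ctx))

abort :: (Contains err e, Applicative m) => e -> ResultT msg err m a
abort e = ResultT (\ctx -> pure (Left (ctx, inject e)))

achieve :: msg -> ResultT msg err m a -> ResultT msg err m a
achieve msg (ResultT g) = ResultT (\ctx -> g (msg : ctx))

recover :: Monad m => ResultT msg (e ': err) m a
        -> (e -> [msg] -> ResultT msg err m a) -> ResultT msg err m a
recover (ResultT g) handler = ResultT (\ctx -> do
  r <- g ctx
  case r of
    Right a              -> pure (Right a)
    Left (msgs, Here e)  -> unResultT (handler e msgs) ctx
    Left (msgs, There u) -> pure (Left (msgs, u)))

impossible :: OneOf '[] -> a
impossible u = case u of {}

runResultT :: Functor m => ResultT msg '[] m a -> m a
runResultT (ResultT g) = fmap (either (impossible . snd) id) (g [])

-- Run an action of the underlying monad inside ResultT (hypothetical helper).
liftR :: Functor m => m a -> ResultT msg err m a
liftR action = ResultT (\_ -> fmap Right action)

-- One small type per failure, plus a Show-like class to describe them.
newtype DoesNotExist = DoesNotExist FilePath
newtype AccessDenied = AccessDenied FilePath

class DescriptiveError e where
  describe :: e -> String
instance DescriptiveError DoesNotExist where
  describe (DoesNotExist p) = p ++ ": no such file"
instance DescriptiveError AccessDenied where
  describe (AccessDenied p) = p ++ ": permission denied"

-- Read a whole file, turning the IOErrors we understand into typed errors.
-- Anything else is rethrown as a regular IO exception.
readFileR :: (Contains err DoesNotExist, Contains err AccessDenied)
          => FilePath -> ResultT String err IO String
readFileR path = achieve ("reading " ++ path) $ do
  r <- liftR (tryIOError (readFile path >>= \s -> length s `seq` pure s))
  case r of
    Right s -> pure s
    Left e | isDoesNotExistError e -> abort (DoesNotExist path)
           | isPermissionError e   -> abort (AccessDenied path)
           | otherwise             -> liftR (ioError e)

-- Either the file content or a description of what went wrong.
render :: FilePath -> ResultT String '[] IO String
render path = recover (recover action handle) handle
  where
    -- The annotation fixes the row, so each recover knows what it handles.
    action :: ResultT String '[DoesNotExist, AccessDenied] IO String
    action = readFileR path
    handle :: DescriptiveError e => e -> [String] -> ResultT String err IO String
    handle e ctx = pure ("cat: " ++ describe e ++ " (while " ++ unwords ctx ++ ")\n")

-- The empty row says every error has been handled.
cat :: FilePath -> ResultT String '[] IO ()
cat path = render path >>= liftR . putStr
```

A program would then call runResultT (cat path) from main. The same polymorphic handler serves both error types here, which is roughly the convenience recoverManyWith is meant to offer.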

Conclusion

For once, I wanted to write about the result of a project, instead of how it is implemented. Rest assured, I do not want to skip the latter. I need to clean up the code a bit before bragging about it.


  1. If the odds are in my favour, I will have plenty of occasions to write more about this topic.

  2. I wonder if they deliberately chose to swap the two type arguments.

  3. You might have noticed err is of kind [*]. To write such a thing, you will need the DataKinds GHC pragma.

  4. If you are confused by :<, it is probably because you were not aware of the TypeOperators pragma before. Maybe it was for the best. :D

  5. The TypeApplications pragma is probably one of my favourites. When I use it, it feels almost as if I were writing some Gallina.