Exceptions vs. Return Values to represent errors (in F#) – III–The Critical monad

Code for this post is here.

In the last post we looked at some Critical code and decided that, albeit correct, it is convoluted. The error management path obfuscates the underlying logic. Also we have no way of knowing if a developer had thought about the error path or not when invoking a function.

Let’s tackle the latter concern first as it is easier. We want the developer to declaratively tag each method call with something that represents his intent about managing the Contingencies or Faults of the function.  Moreover if the function has contingencies, we want to force the developer to manage them explicitly.

We cannot use attributes for this as function calls happen in the middle of the code, so there is no place to stick attributes into. So we are going to use higher level functions to wrap the function calls.

The first case is easy. If the developer thinks that the caller of his code has no way to recover from all the exceptions thrown by a function, he can prepend his function call with the ‘fault’ word as in:

fault parseUser userText

That signals readers of the code that the developer is willing to propagate up all the exceptions thrown by the function parseUser. Embarrassingly, ‘fault’ is implemented as:

let fault f = f

So it is just a tag. Things get trickier when the function has contingencies. We want to find a way to manage them without introducing undue complexity in the code.

We’d like to catch some exceptions thrown by the function and convert them to return values and then either return such return values or manage the contingency immediately after the function call. On top of that, we’d want all of the code written after the function call to appear as clean as if no error management were taking place. Monads (computation values) can be used to achieve these goals.

Last time we introduced a type to represent error return values:

type Result<'a, 'b> =
| Success of 'a
| Failure of 'b
type UserFetchError =
| UserNotFound  of exn
| NotAuthorized of int * exn 

We can then create a computation expression that ‘abstracts out’ the Failure case and let you write the code as cleanly as if you were not handling errors. Let’s call such thing ‘critical’. Here is how the final code looks like:

let tryFetchUser3 userName =
    if String.IsNullOrEmpty userName then invalidArg "userName" "userName cannot be null/empty"

    critical {
        let Unauthorized (ex:exn) = NotAuthorized (ex.Message.Length, ex)
        let! userText = contingent1
                            [FileNotFoundException()        :> exn, UserNotFound;
                             UnauthorizedAccessException()  :> exn, Unauthorized]
                            dbQuery (userName + ".user")
        
        return fault parseUser userText
    }

You can compare this with the code you would have to write without the ‘critical’ library (from last post):

let tryFetchUser1 userName =
    if String.IsNullOrEmpty userName then invalidArg "userName" "userName cannot be null/empty"

    // Could check for file existence in this case, but often not (i.e. db)
    let userResult =    try
                            Success(dbQuery(userName + ".user"))
                        with
                        | :? FileNotFoundException as ex        -> Failure(UserNotFound ex)
                        | :? UnauthorizedAccessException as ex  -> Failure(NotAuthorized(2, ex))
                        | ex                                    -> reraise ()
    
    match userResult with
    | Success(userText) -> 
        let user        = Success(parseUser(userText))
        user
    | Failure(ex)       -> Failure(ex)

And with the original (not critical) function:

let fetchUser userName =
    let userText            = dbQuery (userName + ".user")
    let user                = parseUser(userText)
    user

Let’s go step by step and see how it works. First of all, you need to enclose the Critical parts of your code (perhaps your whole program) in a ‘critical’ computation:

    critical {
}

This allows you to call functions that return a Result and manage the return result as if it were the successful result. If an error were generated, it would be returned instead. We will show how to manage contingencies immediately after the function call later.

The above is illustrated by the following:

        let! userText = contingent1
                            [FileNotFoundException()        :> exn, UserNotFound;
                             UnauthorizedAccessException()  :> exn, Unauthorized]
                            dbQuery (userName + ".user")

Here ‘contingent1’ is a function that returns a Result, but userText has type string. The Critical monad, and in particular the usage of ‘let!’ is what allows the magic to happen.

‘contingentN’ is a function that you call when you want to manage certain exceptions thrown by a function as contingencies. The N part represents how many parameters the function takes.

The first parameter to ‘contingent1’ is a list of pairs (Exception, ErrorReturnConstructor). That means: when an exception of type Exception is thrown, return the result of calling ‘ErrorReturnConstructor(Exception)’ wrapped inside a ‘Failure’ object. The second parameter to ‘contingent1’ is the function to invoke and the third is the argument to pass to it.

Conceptually, ‘ContingentN’ is a tag that says: if the function throws one of these exceptions, wrap them in these return values and propagate all the other exceptions. Notice that Unauthorized takes an integer and an exception as parameters while the ErrorReturnConstructor takes just an exception. So we need to add this line of code:

        let Unauthorized (ex:exn) = NotAuthorized (ex.Message.Length, ex) 

After the contingent1 call, we can then write code as if the function returned a normal string:

        return fault parseUser userText

This achieves that we set up to do at the start of the series:

  • Contingencies are now explicit in the signature of tryFetchUser3
  • The developer needs to indicate for each function call how he intend to manage contingencies and faults
  • The code is only slightly more complex than the non-critical one

You can also decide to manage your contingencies immediately after calling a function. Perhaps there is a way to recover from the problem. For example, if the user is not in the database, you might want to add a standard one:

let createAndReturnUser userName = critical { return {Name = userName; Age = 43}}
let tryFetchUser4 userName = if String.IsNullOrEmpty userName then invalidArg "userName" "userName cannot be null/empty" critical { let Unauthorized (ex:exn) = NotAuthorized (ex.Message.Length, ex) // depends on ex let userFound = contingent1 [FileNotFoundException() :> exn, UserNotFound; UnauthorizedAccessException() :> exn, Unauthorized] dbQuery (userName + ".user") match userFound with | Success(userText) -> return fault parseUser userText | Failure(UserNotFound(_)) -> return! createAndReturnUser(userName) | Failure(x) -> return! Failure(x) }

The only difference in this case is the usage of ‘let’ instead of ‘let!’. This exposes the real return type of the function allowing you to pattern match against it.

Sometimes a simple exception to return value mapping might not be enough and you want more control on which exceptions to catch and how to convert them to return values. In such cases you can use contingentGen:

let tryFetchUser2 userName =
    if String.IsNullOrEmpty userName then invalidArg "userName" "userName cannot be null/empty"

    critical {
        let! userText = contingentGen
                            (fun ex -> ex :? FileNotFoundException || ex :? UnauthorizedAccessException)
                            (fun ex ->
                                match ex with
                                       | :? FileNotFoundException       -> UserNotFound(ex)
                                       | :? UnauthorizedAccessException -> NotAuthorized(3, ex)
                                       | _ -> raise ex)
                            (fun _ -> dbQuery (userName + ".user"))
        
        return fault parseUser userText
    }

The first parameter is a lambda describing when to catch an exception. The second lambda translate between exceptions and return values. The third lambda represents which function to call.

Sometimes you might want to catch all the exceptions that a function might throw and convert them to a single return value:

type GenericError = GenericError of exn

 // 1. Wrapper that prevents exceptions for escaping the method by wrapping them in a generic critical result
let tryFetchUserNoThrow userName =
    if String.IsNullOrEmpty userName then invalidArg "userName" "userName cannot be null/empty"

    critical {
        let! userText = neverThrow1 GenericError dbQuery (userName + ".user")
        
        return fault parseUser userText
    }

And sometimes you might want to go the opposite way. Given a function that exposes some contingencies, you want to translate them to faults because you don’t know how to recover from them.

let operateOnExistingUser userName =
    let user = alwaysThrow GenericException tryFetchUserNoThrow userName
    ()

Next time we’ll look at how the Critical computation expression is implemented.

Exceptions vs. Return Values to represent errors (in F#) – II– An example problem

In the previous post, we talked about the difference between Critical and Normal code. In this post we are going to talk about the Critical code part. Ideally, we want:

  • A way to indicate that a particular piece of code (potentially the whole program) is Critical
  • A way to force/encourage the programmer to make an explicit decision on the call site of a function on how he wants to manage the error conditions (both contingencies and faults)
  • A way to force/encourage the programmer to expose contingencies/faults that are appropriate for the conceptual level of the function the code is in (aka don’t expose implementation details for the function,  i.e. don’t throw SQLException from a getUser method where the caller is supposed to catch it)

Remember that I can use the word ‘force’ here because the programmer has already taken the decision to analyse each line of code for error conditions. As we discussed in the previous post, In many/most cases, such level of scrutiny is unwarranted.

Let’s use the below scenario to unravel the design:

type User = {Name:string; Age:int}
let fetchUser userName =
    let userText            = dbQuery (userName + ".user")
    let user                = parseUser(userText)
    user

This looks like a very reasonable .NET function and it is indeed reasonable in Normal code, but not in Critical code. Note that the caller likely needs to handle the user-not-in-repository case because there is no way for the caller to check such condition beforehand without incurring the performance cost of two network roundtrips.

Albeit the beauty and simplicity, there are issues with this function in a Critical context:

  • The function throws implementation related exceptions, breaking encapsulation when the user needs to catch them
  • It is not clear from the code if the developer thought about error management (do you think he did?)
  • Preconditions are not checked, what about empty or null strings?

To test our design let’s define a fake dbQuery:

let dbQuery     = function
    | "parseError.user"     -> "parseError"
    | "notFound.user"       -> raise (FileNotFoundException())
    | "notAuthorized.user"  -> raise (UnauthorizedAccessException())
    | "unknown.user"        -> failwith "Unknown error reading the file"
    | _                     -> "FoundUser"

The first two exceptions are contingencies, the caller of fetchUser is supposed to manage them. The unknown.user exception is a fault in the implementation. parseError triggers a problem in the parseUser function.

ParseUser looks like this:

let parseUser   = function
    | "parseError"          -> failwith "Error parsing the user text"
    | u                     -> {Name = u; Age = 43}

Let’s now create a test function to test the different versions of fetchUser that we are going to create:

let test fetchUser =
    let p x                 = try printfn "%A" (fetchUser x) with ex -> printfn "%A %s" (ex.GetType()) ex.Message
    p "found"
    p "notFound"
    p "notAuthorized"
    p "parseError"
    p "unknown"

Running the function exposes the problems described above. From the point of view of the caller, there is no way to know what to expect by just inspecting the signature of the function. There is no differentiation between contingencies and faults. The only way to achieve that is to catch some implementation-specific exceptions.

How would we translate this to Critical code?

First, we would define a type to represent the result of a function:

type Result<'a, 'b> =
| Success of 'a
| Failure of 'b

This is called the Either type, but the names have been customized to represent this scenario. We then need to define which kind of contingencies our function could return.

type UserFetchError =
| UserNotFound  of exn
| NotAuthorized of int * exn

So we assume that the caller can manage the fact that the user is not found or not authorized. This type contains an Exception member.  This is useful in cases where the caller doesn’t want to manage a contingency, but wants to treat it like a fault (for example when some Normal code is calling some Critical code).

In such cases, we don’t lose important debugging information. But we still don’t break encapsulation because the caller is not supposed to ‘catch’ a fault.

Notice that NotAuthorized contains an int member. This is to show that contingencies can carry some more information than just their type. For example, a caller could match on both the type and the additional data.

With that in place, let’s see how the previous function looks like:

let tryFetchUser1 userName =
    if String.IsNullOrEmpty userName then invalidArg "userName" "userName cannot be null/empty"

    // Could check for file existence in this case, but often not (i.e. db)
    let userResult =    try
                            Success(dbQuery(userName + ".user"))
                        with
                        | :? FileNotFoundException as ex        -> Failure(UserNotFound ex)
                        | :? UnauthorizedAccessException as ex  -> Failure(NotAuthorized(2, ex))
                        | ex                                    -> reraise ()
    
    match userResult with
    | Success(userText) -> 
        let user        = Success(parseUser(userText))
        user
    | Failure(ex)       -> Failure(ex)

Here is what changed:

  • Changed name to tryXXX to convey the fact that the method has contingencies
  • Added precondition test, which generates a fault
  • The signature of the function now conveys the contingencies that the user is supposed to know about

But still, there are problems:

  • The code became very long and convoluted obfuscating the success code path
  • Still, has the developer thought about the error conditions in parseUser and decided to let exceptions get out, or did she forget about it?

The return value crowd at this point is going to shout: “Get over it!! Your code doesn’t need to be elegant, it needs to be correct!”. But I disagree, obfuscating the success code path is a problem because it becomes harder to figure out if your business logic is correct. It is harder to know if you solved the problem you set out to solve in the first place.

In the next post we’ll see what we can do about keeping beauty and being correct.

Exceptions vs. Return Values to represent errors (in F#) – I – Conceptual view

Recently I’ve been reading numerous articles on the age old question of exceptions vs. return values. There is a vast literature on the topic with very passionate opinions on one side or the other. Below is my view on it.

First of all, I’ll define my terms.

  • Success code path: chunk of code that is responsible to perform the main task of a function, without any regard for error conditions
  • Contingency: an event that happens during the success code path execution that is of interest for the caller of the function.
    • It happens rarely
    • It can be described in terms that the caller understands, it doesn’t refer to the implementation of the function
    • The caller stands a chance to recover from it
  • Fault: an event that happens during the success code path execution that is not expected by the caller of the function.
    • It happens rarely
    • It cannot be described in terms that the caller understands, it requires reference to the implementation of the function
    • The caller has no way to recover from it

Examples of the above for a FetchUser(userName) function:

  • Success code path: the code that retrieves the user from the database
  • Contingency: the fact that the requested user is not in the database
  • Fault: userName = null, divide by zero, stack overflow, …

The difference between Contingency and Fault is not sharp in practice and requires common sense, but it is useful none less. When in doubt, it would appear prudent to consider an event as a Contingency, so that the caller gets a chance to recover.

Ideally, you would like a Contingency to be part of the signature of a function, so that the caller knows about it. On the other end, a Fault shouldn’t be part of the signature of a function for two reasons:

  • The user cannot recover from it
  • Being part of the signature would break encapsulation by exposing implementation details of the function

The above seems to suggest that Contingencies should be represented as return values and Faults as exceptions. As an aside, in Java the former is represented as checked exceptions, which is part of the signature. We’ll tackle checked exceptions later on.

An important point that is often neglected in the discussions on this topic is that there are two categories of applications: applications that care about Contingencies (Critical apps) and applications that don’t (Normal apps). I am of the opinion that the latter category is the largest.

In many cases you can indeed write just the success code path and, if anything goes wrong, you just clean up after yourself and exit. That is a perfectly reasonable thing to do for very many applications. You are trading off speed of development with stability.  Your application can be anywhere on that continuum.

Examples of Normal apps are: build scripts, utility applications, departmental applications where you can fix things quickly on the user machine, intranet web sites, internet web sites that are purely informative, etc …

Examples of Critical apps are: servers, databases, operating systems, web site that sell stuff,  etc …

For Normal apps, treating Contingencies as Fault is the right thing to do. You just slap a try … catch around your event loop/ thread/ process and you do your best to get the developer to fix the problem quickly. I think a lot of the angst of the ‘return value crowd’ is predicated on not having this distinction in mind. They are making very valid point regarding Critical apps to a crowd that is thinking about Normal apps. So the two sides are cross-talking.

Also, in my opinion, the main problem with Java checked exceptions is that they make writing Normal apps as cumbersome as writing Critical apps. So, reasonably, people complain.

The .NET framework decided to use Exceptions as the main way to convey both Faults and Contingencies. By doing so, it makes it easier to write Normal apps, but more difficult to write Critical apps.

For a Critical app or section of code, you’d want to:

  • Express contingencies in the function signature
  • Force/Encourage people to take an explicit decision at the calling site on how they want to manage both Contingencies and Faults for each function call

In the next post, let’s see how we can represent some of this in F#.