Exceptions vs. Return Values to represent errors (in F#) – III – The Critical monad

Code for this post is here.

In the last post we looked at some Critical code and decided that, while correct, it is convoluted. The error-management path obfuscates the underlying logic. We also have no way of knowing whether a developer thought about the error path when invoking a function.

Let’s tackle the latter concern first, as it is easier. We want the developer to declaratively tag each method call with something that represents his intent about managing the Contingencies or Faults of the function. Moreover, if the function has contingencies, we want to force the developer to manage them explicitly.

We cannot use attributes for this, as function calls happen in the middle of the code, so there is no place to attach attributes to. Instead we are going to use higher-order functions to wrap the function calls.

The first case is easy. If the developer thinks that the caller of his code has no way to recover from all the exceptions thrown by a function, he can prepend his function call with the ‘fault’ word as in:

fault parseUser userText

That signals readers of the code that the developer is willing to propagate up all the exceptions thrown by the function parseUser. Embarrassingly, ‘fault’ is implemented as:

let fault f = f

So it is just a tag. Things get trickier when the function has contingencies. We want to find a way to manage them without introducing undue complexity in the code.

We’d like to catch some exceptions thrown by the function, convert them to return values, and then either return those values or manage the contingency immediately after the function call. On top of that, we’d want all of the code written after the function call to look as clean as if no error management were taking place. Monads (computation expressions, in F# parlance) can be used to achieve these goals.

Last time we introduced a type to represent error return values:

type Result<'a, 'b> =
| Success of 'a
| Failure of 'b
type UserFetchError =
| UserNotFound  of exn
| NotAuthorized of int * exn 

We can then create a computation expression that ‘abstracts out’ the Failure case and lets you write the code as cleanly as if you were not handling errors. Let’s call such a thing ‘critical’. Here is what the final code looks like:

let tryFetchUser3 userName =
    if String.IsNullOrEmpty userName then invalidArg "userName" "userName cannot be null/empty"

    critical {
        let Unauthorized (ex:exn) = NotAuthorized (ex.Message.Length, ex)
        let! userText = contingent1
                            [FileNotFoundException()        :> exn, UserNotFound;
                             UnauthorizedAccessException()  :> exn, Unauthorized]
                            dbQuery (userName + ".user")
        
        return fault parseUser userText
    }

You can compare this with the code you would have to write without the ‘critical’ library (from last post):

let tryFetchUser1 userName =
    if String.IsNullOrEmpty userName then invalidArg "userName" "userName cannot be null/empty"

    // Could check for file existence in this case, but often not (i.e. db)
    let userResult =    try
                            Success(dbQuery(userName + ".user"))
                        with
                        | :? FileNotFoundException as ex        -> Failure(UserNotFound ex)
                        | :? UnauthorizedAccessException as ex  -> Failure(NotAuthorized(2, ex))
                        | ex                                    -> reraise ()
    
    match userResult with
    | Success(userText) -> 
        let user        = Success(parseUser(userText))
        user
    | Failure(ex)       -> Failure(ex)

And with the original (not critical) function:

let fetchUser userName =
    let userText            = dbQuery (userName + ".user")
    let user                = parseUser(userText)
    user

Let’s go step by step and see how it works. First of all, you need to enclose the Critical parts of your code (perhaps your whole program) in a ‘critical’ computation:

critical {
    ...
}

This allows you to call functions that return a Result and work with the returned value as if it were the successful case. If an error is generated, it is returned instead. We will show later how to manage contingencies immediately after the function call.

The above is illustrated by the following:

        let! userText = contingent1
                            [FileNotFoundException()        :> exn, UserNotFound;
                             UnauthorizedAccessException()  :> exn, Unauthorized]
                            dbQuery (userName + ".user")

Here ‘contingent1’ is a function that returns a Result, but userText has type string. The Critical monad, and in particular the usage of ‘let!’, is what allows the magic to happen.
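
To make the magic less magical, here is a minimal sketch of what such a builder might look like (the names and details are my assumption; the real implementation is the subject of the next post):

type CriticalBuilder() =
    // Bind runs the rest of the computation only on Success,
    // short-circuiting a Failure back to the caller
    member __.Bind(result, f) =
        match result with
        | Success x -> f x
        | Failure e -> Failure e
    member __.Return(x)          = Success x
    member __.ReturnFrom(result) = result

let critical = CriticalBuilder()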

‘contingentN’ is a function that you call when you want to manage certain exceptions thrown by a function as contingencies. The N part represents how many parameters the function takes.

The first parameter to ‘contingent1’ is a list of pairs (Exception, ErrorReturnConstructor). That means: when an exception of type Exception is thrown, return the result of calling ‘ErrorReturnConstructor(Exception)’ wrapped inside a ‘Failure’ object. The second parameter to ‘contingent1’ is the function to invoke and the third is the argument to pass to it.
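
As a rough sketch (my assumption of its shape, not the actual library code), ‘contingent1’ could be implemented along these lines:

// Sketch only: call f with arg, translating an exception whose runtime type
// matches one of the exemplar exceptions into a Failure; rethrow the rest
let contingent1 (handlers: (exn * (exn -> 'b)) list) (f: 'a -> 'c) (arg: 'a) : Result<'c, 'b> =
    try
        Success (f arg)
    with ex ->
        match handlers |> List.tryFind (fun (exemplar, _) -> exemplar.GetType() = ex.GetType()) with
        | Some (_, makeError) -> Failure (makeError ex)
        | None                -> reraise ()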

Conceptually, ‘contingentN’ is a tag that says: if the function throws one of these exceptions, wrap it in one of these return values; propagate all the other exceptions. Notice that NotAuthorized takes an integer and an exception as parameters, while the ErrorReturnConstructor takes just an exception. So we need to add this line of code:

        let Unauthorized (ex:exn) = NotAuthorized (ex.Message.Length, ex) 

After the contingent1 call, we can then write code as if the function returned a normal string:

        return fault parseUser userText

This achieves what we set out to do at the start of the series:

  • Contingencies are now explicit in the signature of tryFetchUser3 (see the signature sketch below)
  • The developer needs to indicate for each function call how he intends to manage contingencies and faults
  • The code is only slightly more complex than the non-critical one
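
For instance, the signature inferred for tryFetchUser3 should now read something like this (a sketch; the compiler’s exact rendering may differ):

val tryFetchUser3 : string -> Result<User, UserFetchError>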

You can also decide to manage your contingencies immediately after calling a function. Perhaps there is a way to recover from the problem. For example, if the user is not in the database, you might want to add a standard one:

let createAndReturnUser userName = critical { return {Name = userName; Age = 43}}
let tryFetchUser4 userName =
    if String.IsNullOrEmpty userName then invalidArg "userName" "userName cannot be null/empty"

    critical {
        let Unauthorized (ex:exn) = NotAuthorized (ex.Message.Length, ex) // depends on ex
        let userFound = contingent1
                            [FileNotFoundException()        :> exn, UserNotFound;
                             UnauthorizedAccessException()  :> exn, Unauthorized]
                            dbQuery (userName + ".user")
        match userFound with
        | Success(userText)        -> return fault parseUser userText
        | Failure(UserNotFound(_)) -> return! createAndReturnUser(userName)
        | Failure(x)               -> return! Failure(x)
    }

The only difference in this case is the usage of ‘let’ instead of ‘let!’. This exposes the real return type of the function allowing you to pattern match against it.

Sometimes a simple exception to return value mapping might not be enough and you want more control on which exceptions to catch and how to convert them to return values. In such cases you can use contingentGen:

let tryFetchUser2 userName =
    if String.IsNullOrEmpty userName then invalidArg "userName" "userName cannot be null/empty"

    critical {
        let! userText = contingentGen
                            (fun ex -> ex :? FileNotFoundException || ex :? UnauthorizedAccessException)
                            (fun ex ->
                                match ex with
                                | :? FileNotFoundException       -> UserNotFound(ex)
                                | :? UnauthorizedAccessException -> NotAuthorized(3, ex)
                                | _                              -> raise ex)
                            (fun _ -> dbQuery (userName + ".user"))
        
        return fault parseUser userText
    }

The first parameter is a lambda describing when to catch an exception. The second lambda translates exceptions to return values. The third lambda wraps the function to call.

Sometimes you might want to catch all the exceptions that a function might throw and convert them to a single return value:

type GenericError = GenericError of exn

// Wrapper that prevents exceptions from escaping the method by wrapping them in a generic critical result
let tryFetchUserNoThrow userName =
    if String.IsNullOrEmpty userName then invalidArg "userName" "userName cannot be null/empty"

    critical {
        let! userText = neverThrow1 GenericError dbQuery (userName + ".user")
        
        return fault parseUser userText
    }
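
As with ‘contingent1’, here is a sketch of what ‘neverThrow1’ might look like (assumed shape, not the library source); it just needs a catch-all:

// Sketch only: any exception thrown by f becomes a single Failure value
let neverThrow1 makeError f arg =
    try Success (f arg)
    with ex -> Failure (makeError ex)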

And sometimes you might want to go the opposite way. Given a function that exposes some contingencies, you want to translate them to faults because you don’t know how to recover from them.

let operateOnExistingUser userName =
    let user = alwaysThrow GenericException tryFetchUserNoThrow userName
    ()
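
A plausible sketch of ‘alwaysThrow’ (again my assumption, including that GenericException is an F# exception wrapping the failure value, e.g. exception GenericException of GenericError) is the mirror image: unwrap the Result and raise on Failure:

// Sketch only: run f and convert any Failure back into an exception,
// so a contingency we cannot recover from propagates as a fault
let alwaysThrow makeException f arg =
    match f arg with
    | Success x -> x
    | Failure e -> raise (makeException e)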

Next time we’ll look at how the Critical computation expression is implemented.


Exceptions vs. Return Values to represent errors (in F#) – II – An example problem

In the previous post, we talked about the difference between Critical and Normal code. In this post we are going to talk about the Critical code part. Ideally, we want:

  • A way to indicate that a particular piece of code (potentially the whole program) is Critical
  • A way to force/encourage the programmer to make an explicit decision on the call site of a function on how he wants to manage the error conditions (both contingencies and faults)
  • A way to force/encourage the programmer to expose contingencies/faults that are appropriate for the conceptual level of the function the code is in (i.e. don’t expose the function’s implementation details; don’t throw SQLException from a getUser method where the caller is supposed to catch it)

Remember that I can use the word ‘force’ here because the programmer has already taken the decision to analyse each line of code for error conditions. As we discussed in the previous post, in many (if not most) cases such a level of scrutiny is unwarranted.

Let’s use the below scenario to unravel the design:

type User = {Name:string; Age:int}
let fetchUser userName =
    let userText            = dbQuery (userName + ".user")
    let user                = parseUser(userText)
    user

This looks like a very reasonable .NET function, and it is indeed reasonable in Normal code, but not in Critical code. Note that the caller likely needs to handle the user-not-in-repository case, because there is no way for the caller to check such a condition beforehand without incurring the performance cost of two network roundtrips.

Despite its beauty and simplicity, there are issues with this function in a Critical context:

  • The function throws implementation related exceptions, breaking encapsulation when the user needs to catch them
  • It is not clear from the code if the developer thought about error management (do you think he did?)
  • Preconditions are not checked: what about empty or null strings?

To test our design let’s define a fake dbQuery:

let dbQuery     = function
    | "parseError.user"     -> "parseError"
    | "notFound.user"       -> raise (FileNotFoundException())
    | "notAuthorized.user"  -> raise (UnauthorizedAccessException())
    | "unknown.user"        -> failwith "Unknown error reading the file"
    | _                     -> "FoundUser"

The first two exceptions are contingencies; the caller of fetchUser is supposed to manage them. The unknown.user exception is a fault in the implementation. parseError triggers a problem in the parseUser function.

parseUser looks like this:

let parseUser   = function
    | "parseError"          -> failwith "Error parsing the user text"
    | u                     -> {Name = u; Age = 43}

Let’s now create a test function to test the different versions of fetchUser that we are going to create:

let test fetchUser =
    let p x                 = try printfn "%A" (fetchUser x) with ex -> printfn "%A %s" (ex.GetType()) ex.Message
    p "found"
    p "notFound"
    p "notAuthorized"
    p "parseError"
    p "unknown"

Running the function exposes the problems described above. From the point of view of the caller, there is no way to know what to expect by just inspecting the signature of the function. There is no differentiation between contingencies and faults. The only way to achieve that is to catch some implementation-specific exceptions.

How would we translate this to Critical code?

First, we would define a type to represent the result of a function:

type Result<'a, 'b> =
| Success of 'a
| Failure of 'b

This is called the Either type, but the names have been customized to represent this scenario. We then need to define which kind of contingencies our function could return.

type UserFetchError =
| UserNotFound  of exn
| NotAuthorized of int * exn

So we assume that the caller can manage the fact that the user is not found or not authorized. This type contains an Exception member.  This is useful in cases where the caller doesn’t want to manage a contingency, but wants to treat it like a fault (for example when some Normal code is calling some Critical code).

In such cases, we don’t lose important debugging information. But we still don’t break encapsulation because the caller is not supposed to ‘catch’ a fault.

Notice that NotAuthorized contains an int member. This is to show that contingencies can carry some more information than just their type. For example, a caller could match on both the type and the additional data.
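
For example (illustrative only; tryFetchUser stands for any function returning this Result):

match tryFetchUser "john" with
| Success user                                   -> printfn "%A" user
| Failure (UserNotFound _)                       -> printfn "user not found"
| Failure (NotAuthorized (len, _)) when len > 10 -> printfn "not authorized (verbose error)"
| Failure (NotAuthorized _)                      -> printfn "not authorized"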

With that in place, let’s see what the previous function looks like:

let tryFetchUser1 userName =
    if String.IsNullOrEmpty userName then invalidArg "userName" "userName cannot be null/empty"

    // Could check for file existence in this case, but often not (i.e. db)
    let userResult =    try
                            Success(dbQuery(userName + ".user"))
                        with
                        | :? FileNotFoundException as ex        -> Failure(UserNotFound ex)
                        | :? UnauthorizedAccessException as ex  -> Failure(NotAuthorized(2, ex))
                        | ex                                    -> reraise ()
    
    match userResult with
    | Success(userText) -> 
        let user        = Success(parseUser(userText))
        user
    | Failure(ex)       -> Failure(ex)

Here is what changed:

  • Changed name to tryXXX to convey the fact that the method has contingencies
  • Added precondition test, which generates a fault
  • The signature of the function now conveys the contingencies that the user is supposed to know about

But still, there are problems:

  • The code became very long and convoluted, obfuscating the success code path
  • Still, has the developer thought about the error conditions in parseUser and decided to let exceptions get out, or did she forget about it?

The return value crowd at this point is going to shout: “Get over it!! Your code doesn’t need to be elegant, it needs to be correct!”. But I disagree: obfuscating the success code path is a problem, because it becomes harder to figure out if your business logic is correct. It is harder to know if you solved the problem you set out to solve in the first place.

In the next post we’ll see what we can do about keeping beauty and being correct.

Exceptions vs. Return Values to represent errors (in F#) – I – Conceptual view

Recently I’ve been reading numerous articles on the age-old question of exceptions vs. return values. There is a vast literature on the topic, with very passionate opinions on one side or the other. Below is my view on it.

First of all, I’ll define my terms.

  • Success code path: chunk of code that is responsible to perform the main task of a function, without any regard for error conditions
  • Contingency: an event that happens during the success code path execution that is of interest for the caller of the function.
    • It happens rarely
    • It can be described in terms that the caller understands, it doesn’t refer to the implementation of the function
    • The caller stands a chance to recover from it
  • Fault: an event that happens during the success code path execution that is not expected by the caller of the function.
    • It happens rarely
    • It cannot be described in terms that the caller understands, it requires reference to the implementation of the function
    • The caller has no way to recover from it

Examples of the above for a FetchUser(userName) function:

  • Success code path: the code that retrieves the user from the database
  • Contingency: the fact that the requested user is not in the database
  • Fault: userName = null, divide by zero, stack overflow, …

The difference between Contingency and Fault is not sharp in practice and requires common sense, but it is useful nonetheless. When in doubt, it would appear prudent to consider an event as a Contingency, so that the caller gets a chance to recover.

Ideally, you would like a Contingency to be part of the signature of a function, so that the caller knows about it. On the other hand, a Fault shouldn’t be part of the signature of a function, for two reasons:

  • The user cannot recover from it
  • Being part of the signature would break encapsulation by exposing implementation details of the function

The above seems to suggest that Contingencies should be represented as return values and Faults as exceptions. As an aside, in Java the former are represented as checked exceptions, which are part of the signature. We’ll tackle checked exceptions later on.

An important point that is often neglected in the discussions on this topic is that there are two categories of applications: applications that care about Contingencies (Critical apps) and applications that don’t (Normal apps). I am of the opinion that the latter category is the largest.

In many cases you can indeed write just the success code path and, if anything goes wrong, you just clean up after yourself and exit. That is a perfectly reasonable thing to do for very many applications. You are trading off speed of development with stability.  Your application can be anywhere on that continuum.

Examples of Normal apps are: build scripts, utility applications, departmental applications where you can fix things quickly on the user machine, intranet web sites, internet web sites that are purely informative, etc …

Examples of Critical apps are: servers, databases, operating systems, web site that sell stuff,  etc …

For Normal apps, treating Contingencies as Faults is the right thing to do. You just slap a try … catch around your event loop / thread / process and you do your best to get the developer to fix the problem quickly. I think a lot of the angst of the ‘return value crowd’ is predicated on not having this distinction in mind. They are making very valid points regarding Critical apps to a crowd that is thinking about Normal apps, so the two sides are cross-talking.

Also, in my opinion, the main problem with Java checked exceptions is that they make writing Normal apps as cumbersome as writing Critical apps. So, reasonably, people complain.

The .NET framework decided to use Exceptions as the main way to convey both Faults and Contingencies. By doing so, it makes it easier to write Normal apps, but more difficult to write Critical apps.

For a Critical app or section of code, you’d want to:

  • Express contingencies in the function signature
  • Force/Encourage people to take an explicit decision at the calling site on how they want to manage both Contingencies and Faults for each function call

In the next post, let’s see how we can represent some of this in F#.

Retrieving SQL Server data with type providers and exposing it with ASP.NET Web APIs in F#

For a good introduction on how to use Web APIs in F#, read here. The starting point for type providers is here. This post is about how I solved a practical problem using these technologies.

First, let’s discuss the scenario. In my company, we needed to log usage information for our various applications to a central repository and build a web site to access such information. I went through three different architectures for such a requirement, ending with the set of technologies described in the title.

I found the intersection of the SQL type provider with the ASP.NET Web API to be very sweet. Personally, I think this mix is much better than using WCF Data Services, because the code you have to write and the infrastructure you have to maintain are significantly smaller.

I suspect that the F# team and the ASP.NET team didn’t talk to each other. It all happens because of a well-defined interface (IQueryable) that both teams happen to work against.

1st version, heavy Javascript

  • SQL Server Express backend
  • WCF Data Services middle tier in C# (autogenerated REST service from a table; you can query it from the JS code)
  • Plenty of Javascript in the browser, calling back to the REST web services through Ajax

Here is an example of the kind of manipulation done in the JS layer:

 

function extractVersions(sessions) {
    var versions = {}
    $.each(sessions, function (i, s) {
        if (typeof versions[s.Version] == 'undefined') {
            versions[s.Version] = 0
        } else {
            versions[s.Version] = versions[s.Version] + 1
        }
    })
    var vs = []
    for (var prop in versions) {
        if (versions.hasOwnProperty(prop))
            vs.push({ Version: prop, Sessions: versions[prop] })
    }
    return vs;
}

 

Pros:

  • Back-end autogenerated
  • I’ve learned JS pretty well

Cons:

  • Plenty of autogenerated code (crap) to debug into when things go wrong
  • Difficult to customize the back-end
  • Somewhat slow, even if today’s JS engines are pretty good
  • More time consuming to create and maintain JS code (compared to F# or C#)
  • Aesthetically unsatisfying, as you really would like to do most manipulations on the server

 

2nd version, SOAP-like web services

  • SQL Server Express backend
  • SOAP-like web services middle tier in C# (coded by hand), still using the WCF Data Services infrastructure
  • Little Javascript in the browser, exclusively for presentation, calling back to the REST web services through Ajax

 

Here is what one of the web services looks like:

 

[WebGet]
public IEnumerable<VersionSessions> GetVersionsForMonth(string application,
                                                        bool isQS, bool isAD)
{
    var sessions = GetSessionsForMonth(application, isQS, isAD);

    var hash = new Dictionary<string, VersionSessions>();

    foreach (var s in sessions)
    {
        VersionSessions vs;
        var found = hash.TryGetValue(s.Version, out vs);

        if (!found)
            hash[s.Version] = new VersionSessions { Version = s.Version,
                                                    Sessions = 0 };
        else
            vs.Sessions += 1;
    }

    return hash.Values;
}

 

Pros:

  • Maintainable business logic code (C# instead of JS)
  • Somewhat faster
  • Easier to customize the back-end (just add another web service with the signature you want)

Cons:

  • Still plenty of manipulation code (at least now on the back-end), not very RESTy
  • It feels like the web services are very ad hoc for this particular presentation (i.e. a ViewModel, not a model)
  • Still not trivial to customize the back-end logic, mostly because I was using WCF Data Services, which are opaque

 

3rd version, using SQL Views and F# type providers

  • Moved the whole app to a dedicated VM
  • SQL Server backend with a significant layer of Views and inline table-valued functions
  • Proper REST web services created through ASP.NET Web API and F# type providers
  • Little Javascript in the browser, exclusively for presentation, calling back to the REST web services through Ajax
  • Moved all external ‘reference files’ (i.e. people in QS) to the DB. Before, I was keeping business info in the database and config info in files.

 

Here is an example of an inline table-valued function:

 

ALTER FUNCTION [dbo].[FUsers] (@App nvarchar(40))
RETURNS TABLE
AS
RETURN
(
    SELECT UserAlias,
           MAX(Application) AS Application,
           MAX(Start)       AS LastSessionStart,
           MAX(Finish)      AS LastSessionFinish,
           DATEDIFF(MINUTE, MAX(Start), MAX(Finish)) AS LastSessionTotalTime,
           COUNT(*)         AS Sessions,
           MAX(Machine)     AS Machine,
           MAX(Version)     AS Version,
           MAX(Qs)          AS Qs
    FROM dbo.VTotalSessions AS s
    WHERE (TotalTime > 30) AND (Application = @App)
    GROUP BY UserAlias
)

 

And here is an example of a type-provider-based web service. Most of my code works like this:

 

type usersController() =
    inherit ApiController()

    let db = dbSchema.GetDataContext()

    member x.Get() =
        query {
            for u in db.VUsers do
            select u
        }

    member x.Get(app) =
        query {
            for u in db.FUsers(app) do
            select u
        }

    member x.Post (value:string) = ()
    member x.Put (id:int) (value:string) = ()
    member x.Delete (id:int) = ()

 

Here is how a query is represented in JS (the whole query gets passed through to SQL, both the JS part and the F# part, IQueryable magick):

 

function loadOpenSessions(app, qs, cont) {
    var query = new oQuery()
        .From("/users/" + app)
        .Let("lastMonth", lastMonth)
        .Let("twoMinutesAgo", twoMinutesAgo)
        .Where("item => item.LastSessionFinish > $twoMinutesAgo")
        .Orderby("UserAlias")
    commonCall(query, qs, cont)
}

 

Pros:

  • Purely functional business logic code (SQL), very easy to debug problems by just running the query/view
  • Maximum usage of the SQL Server optimizer. As they say in the SQL Engine team: “we’ve spent decades optimizing SQL query engines. You are unlikely to do better with your ‘for’ loops …”
  • Very easy to customize the back-end: just write F# code to implement GET/POST/PUT/DELETE etc.
  • Moving all the state for the app (reference files included) into the DB makes it much easier to integrate it into the business logic. It all becomes a big query for the optimizer to sort out.
  • No autogenerated code anywhere in the architecture
  • More separation between model and viewmodel. Tables are the model, Views are the ViewModel, F# is just a way to expose such a view model to the world at large

Cons:

  • The routing algorithm for ASP.NET Web API is mysterious. It’s the only black-box piece in the architecture (aka I don’t understand it).
  • Sometimes either SQL is not powerful enough (I abhor stored procedures on religious grounds) or something doesn’t map well to a REST metaphor. In such cases I have diluted the metaphor, as in the code below. The good thing is that, being just code, I can do that. I’m not blocked.

type statsController() =
    inherit ApiController()

    let db = dbSchema.GetDataContext()

    member x.Get(app:string, isQs:string, call:string) =
        match call with
        | "WeeklyUsers" ->
            let sessions = query {
                for s in db.VTotalSessionsAndWeeks do
                where (s.Application = app && s.Qs = Nullable (Int32.Parse(isQs)))
                select s
            }
            weeklyUsers sessions
        | "WeeklyTime"  ->
            let sessions = query {
                for s in db.FTotalTime(app, Nullable (Int32.Parse(isQs))) do
                select s
            }
            weeklyTime sessions
        | "HourlyUsers" ->
            let users = query {
                for u in db.FHourlyUsers(app, Nullable (Int32.Parse(isQs))) do
                select u
            }
            hourlyUsers users
        | _ -> failwith "No call with that name"

Writing functional code in C++ V – Miscellaneous and conclusions

Just a couple of trivialities and my parting thoughts.

Nested functions

If your language has lambdas, you don’t need nested-function support, because you can implement nested functions with lambdas.

I am a heavy user of nested functions, but I’m of two minds about it. On one side, I like that they sit close to where they are used, avoiding going outside the main function body to understand them. I also like that you don’t need to pass a lot of parameters to them, as they capture the function locals. On the other side, they end up creating the impression that your functions are very long and so, in my eyes, they occasionally reduce readability. The IDE helps you out there (at least VS 11) by allowing you to collapse the lambdas.

A trivial example is below:

BOOST_AUTO_TEST_CASE(NestedFunctions)
{
    int x = 3;
    auto sumX = [&] (int y) { return x + y;};

    BOOST_CHECK_EQUAL(sumX(2), 3+2);
}

And here is a more realistic one (not written in a functional style), where readability is reduced by the three nested functions (among many other things):

bool condor(
            boost::gregorian::date now,  // date to evaluate 
            double spot,                 // spot price underlying
            double v,                    // ATM volatility
            double r,                    // risk free rate
            double step,                 // % of spot price to keep as distance between wings
            int minCallShortDist,        // min distance from the short call strike in steps
            int minPutShortDist,         // min distance from the short put strike in steps
            int minDays,                 // min number of days to expiry
            int maxDays,                 // max number of days to expiry
            double maxDelta,             // max acceptable delta value for shorts in steps
            double minPremium,           // min accepted premium as % of step
            Condor& ret                  // return value
            )
{
    // convert params to dollar signs
    auto stepPr            = round(step * spot);
    auto toUSD             = [stepPr] (double x) { return round(stepPr * x);};
    auto minCpr            = toUSD( minCallShortDist );
    auto minPpr            = toUSD( minPutShortDist );
    auto premiumPr         = toUSD( minPremium );

    // calc strike values for short legs
    auto atm               = round(spot / stepPr) * (long) stepPr;
    auto callShort         = atm + minCpr;
    auto putShort          = atm - minPpr;
    
    auto addDays           = [](boost::gregorian::date d, int dys) -> boost::gregorian::date {
        using namespace boost::gregorian;

        auto toAdd         = days(dys);
        auto dTarget       = d + toAdd;

        return  dTarget;
    };

    // calc min & max allowed expiry dates
    auto minDate           = addDays(now, minDays);
    auto maxDate           = addDays(now, maxDays);
    auto expiry            = calcExpiry(now, 0);

    // find first good expiry
    while(expiry < minDate)
        expiry          = calcExpiry(expiry, +1);

    Greeks g;
    auto scholes           = [=, &g] (double strike, int days, bool CorP) {
        return blackScholesEuro(spot, strike, days, CorP, v, r, g, true);
    };

    // find a condor that works at this expiry
    auto findCondor        = [=, &g, &ret] (int days) -> bool {
        ret.shortCallStrike                = callShort;
        ret.shortPutStrike                 = putShort;
        auto shCallPremium                 = 0.0;
        auto shPutPremium                  = 0.0;

        // find short call strike price < maxDelta
        while(true) {
            shCallPremium                  = scholes(ret.shortCallStrike, days, true);
            if(g.delta <= maxDelta)
                break;
            else
                ret.shortCallStrike        += stepPr;
        }

        // find short put strike price < maxDelta
        while(true) {
            shPutPremium                   = scholes(ret.shortPutStrike, days, false);
            if( (- g.delta) <= maxDelta)
                break;
            else
                ret.shortPutStrike         -= stepPr;
        }

        // check premium is adequate
        ret.longCallStrike                = ret.shortCallStrike + stepPr;
        ret.longPutStrike                 = ret.shortPutStrike  - stepPr;
        auto lgCall                       = scholes(ret.longCallStrike, days, true);
        auto lgPut                        = scholes(ret.longPutStrike,  days, false);
        ret.netPremium                    = shCallPremium + shPutPremium - lgCall - lgPut;

        return ret.netPremium > premiumPr;
    };

    // increases the expiry until it finds a condor or the expiry is too far out
    while (expiry < maxDate) {
        auto days        = (expiry - now).days();
        if(findCondor(days)) {
            ret.year     = expiry.year();
            ret.month    = expiry.month();
            ret.day      = expiry.day();
            return true;
        }
        expiry           = calcExpiry(expiry, +1);
    }

    return false;
}

That is quite a mouthful, isn’t it? But really the function is not that long. It is all these nested functions that make it seem so.

Tuples

Tuples are another feature toward which I have mixed feelings. They are really useful to return multiple results from a function and for rapid prototyping of concepts. But they are easy to abuse. Creating Records is almost always a better, safer way to craft your code, albeit more verbose.

Standard C++ has an excellent tuple library that makes working with tuples almost as easy as in mainstream functional languages. The function below shows creating them, getting their values, passing them to functions, and deconstructing them:

BOOST_AUTO_TEST_CASE(Tuples)
{
    auto t = make_tuple("bob", "john", 3, 2.3);
    BOOST_CHECK_EQUAL(get<0>(t), "bob");
    BOOST_CHECK_EQUAL(get<2>(t), 3);

    // yep, compiler error
    //auto k = get<10>(t);

    auto t2(t);
    BOOST_CHECK(t2 == t);

    // passing as argument, returning it
    auto f = [] (tuple<string, string, int, double> t) { return t;};
    auto t1 = f(t);
    BOOST_CHECK(t == t1);

    // automatic deconstruction
    string s1; string s2; int i; double d;
    std::tie(s1, s2, i, d) = f(t);
    BOOST_CHECK_EQUAL(s1, "bob");

    // partial reconstruction
    string s11;
    std::tie(s11, ignore, ignore, ignore) = f(t);
    BOOST_CHECK_EQUAL(s11, "bob");
}

Conclusion

I’m sure there are some fundamental functional features that I haven’t touched on (e.g. currying, more powerful function composition, etc.). Despite that, I think we have enough material here to start drawing some conclusions. To start with, where is C++ deficient compared to mainstream functional languages when it comes to writing functional code?

  • Compiler errors: if you make even an innocent error, you often get back from the compiler a long page of rubbish if you are using templates (and you are). You put your goggles on and start swimming through it to find the nugget of gold that is the root cause of your problem.
    I can hear you C++ gurus screaming: come on, come on, it’s not that bad. After a while you get used to it. You become proficient in rubbish swimming. That makes me think of the story of the guy who lived in a strange neighbourhood where, as soon as you opened the door of your house, someone kicked you in the nuts. His wife suggested moving to a better part of town, but the man replied: “no worries, I got used to it”.
  • Syntax oddity: we did our best to make the gods of syntax happy in this series of articles, but we are still far removed from the beauty of Haskell, F#, etc.
  • Advanced features: if you go beyond the basics of functional languages, there is a plethora of interesting features that just make your life so much easier and bring your code to a higher level of abstraction. Things like monads (e.g. the F# async workflow), type providers, Haskell’s lazy evaluation, Haskell’s super-strong type system, Clojure’s transactional memory, etc. Boost is trying to bring many of these things to C++, but there is still a big gap to be filled.
  • Temptation to cheat: you can cheat in many functional programming languages (e.g. by having mutable variables), but nowhere is the temptation as strong as in C++. You need an iron discipline not to do it. The fact that sometimes cheating is the right thing to do makes it even worse.

Overall, am I going to use C++ now as my main programming language? Not really. I don’t have one of those. My general view is to always use the best language for the job at hand (yes, I even use Javascript for web pages).

C++ has some unique characteristics (i.e. speed, cross compilation & simple C interfacing). I’m going to use it whenever I need these characteristics, but now I’ll be a happier user knowing that I can write decent functional code in it.

Writing functional code in C++ IV – Algebraic datatypes

And here comes the guilt bit. I have the strong suspicion (but not certainty) that what I am doing here can be done with templates, but I didn’t take the time to do it. With that out of the way, let’s go.

Code for this post is here. Thanks to Steve Bower and Andy Sawyer for reviewing it.

Algebraic datatypes (discriminated unions in F#) are a powerful concept in functional programming. They are the main way to represent type variation in your program. Very roughly, where object orientation uses derivation, functional programming uses algebraic datatypes. An entire book could be written on the theory of this, but the goal of this post is to see how we can map them to C++ without losing C++ness.

When talking about this with C++ programmers, they always point me to boost variant. That doesn’t do it for me for several reasons.

First of all, boost variants represent one of a fixed collection of types. Algebraic datatypes represent one of a fixed collection of named types. That means that a simple thing like the code below cannot be represented as a variant:

type LivingEntity =
| Person of string  // name
| Dog of string     // name

Ok, ok, maybe you could represent it by ‘typifying’ things using boost strong typedef, but things get ugly syntactically. Moreover, a lot of the time the name is all you care about …

type Switch = On | Off

Are we going to strong typedef for such a simple thing? Oh boy. Even accepting the syntactic weight of all this, there are other issues in play. Discriminated unions are used extensively in functional programs, so you want a nice syntax to work with them. Something like the F# syntax below:

let print living =
    match living with
    | Person(name) -> printfn "I'm a person named %s" name
    | Dog(name)    -> printfn "I'm a dog named %s" name

Which could be made even sweeter by using the ‘function’ keyword as below:

let print = function
    | Person(name) -> printfn "I'm a person named %s" name
    | Dog(name)    -> printfn "I'm a dog named %s" name

In boost variant, you either use the get<type> functions or you write a visitor function. In the first case you are probably going to write a chain of ‘if’ statements or a ‘switch’ statement. Both are confusing and come with plenty of syntactic weight. I don’t really want to write a visitor like the one below for each ‘match’ in my code. The magic is gone.

class times_two_visitor
    : public boost::static_visitor<>
{
public:

    void operator()(int & i) const
    {
        i *= 2;
    }

    void operator()(std::string & str) const
    {
        str += str;
    }

};

Ok, so boost variant doesn’t really work for this. Remember that our overarching goal was to stay close to C++. The language itself has something that comes pretty close to what we want in the form of a union, or better, a tagged union. Again, the types are not named, but maybe we can work that in.

It turns out that Jared here did all the hard work. The general idea is to use macros to hide the construction of a tagged union with methods to test the type and return the contained value.

For example this code:

DU_DECLARE(LivingEntity)
    DU_VALUE(Person,    string)
    DU_VALUE(Dog,       string)
DU_END

Becomes something like:

struct LivingEntity {
    private:
        LivingEntity() {}
        unsigned int m_kind;
    public:
        static LivingEntity Person(const string& value) {
            LivingEntity unionValue;
            unionValue.m_kind = 19;
            unionValue.m_Person = value;
            return unionValue; }

        bool IsPerson() const {
            return m_kind == 19;
        }
        const string& GetPerson() const {
            (void)( (!!(m_kind == 19)) || (_wassert(L"m_kind == __LINE__", L"c:discriminated_union.cpp", 19), 0) );
            return m_Person; }
        string GetPerson() {
            (void)( (!!(m_kind == 19)) || (_wassert(L"m_kind == __LINE__", L"c:discriminated_union.cpp", 19), 0) );
            return m_Person; }
    private:
        string m_Person;

You can see the outline of a tagged union (i.e. m_kind) with a constructor for each type (i.e. Person) and methods to test for at type and return its value. You can also see the storage for the value (i.e. m_Person).

The only thing in DU_DECLARE that is different from Jared’s solution is the typedef below that allows not repeating the LivingEntity name in each DU_VALUE.

#define DU_DECLARE(name)                        \
    struct name {                               \
    private:                                    \
        typedef name unionName;                 \
        name() {}                               \
        unsigned int m_kind;                    \
    public:
#define DU_VALUE(entryName, entryType)                                                                      \
        static unionName entryName(const entryType& value) {                                                \
            unionName unionValue;                                                                           \
            unionValue.m_kind = __LINE__;                                                                   \
            unionValue.m_##entryName = value;                                                               \
            return unionValue;  }                                                                           \
        bool Is##entryName() const { return m_kind == __LINE__;}                                            \
        const entryType& Get##entryName() const { assert(m_kind == __LINE__); return m_##entryName; }       \
        entryType Get##entryName() { assert(m_kind == __LINE__); return m_##entryName; }                    \
    private:                                                                                                \
        entryType m_##entryName;                                                                            \
    public:

With all of that at your disposal it becomes easy to write:

    auto entity = LivingEntity::Dog("Bob");

    DU_MATCH(entity)
        DU_CASE(Dog,   BOOST_CHECK_EQUAL(value, "Bob");)
        DU_CASE(Person,BOOST_CHECK(false);) 
    DU_MATCH_END

There are some beautiful things about this. First of all, the construction of any such type is super simple. You even get IntelliSense!

Moreover the ‘value’ variable contains whatever was passed in the constructor for the object. So this is semantically equivalent, if not syntactically, to the match statement in F#.

Obviously the code part is not limited to a single statement:

    DU_MATCH(entity)
        DU_CASE(Dog,
            cout << "I should be here";
            BOOST_CHECK_EQUAL(value, "Bob");
        )
        DU_CASE(Person,
            BOOST_CHECK(false);
        ) 
    DU_MATCH_END

And for those of you addicted to braces, I venture:

    DU_MATCH(entity)
        DU_CASE(Dog,
        {
            cout << "I should be here";
            BOOST_CHECK_EQUAL(value, "Bob");
        })
        DU_CASE(Person,
        {
            BOOST_CHECK(false);
        }) 
    DU_MATCH_END

They all work with the same macro definition. They expand to something along the lines of:

    if(false) {}
        else if(entity.IsDog()) {
            auto value = entity.GetDog();
            BOOST_CHECK_EQUAL(value, "Bob");
        }
        else if(entity.IsPerson()) {
            auto value = entity.GetPerson();
            BOOST_CHECK(false);
        } 
        else {
            throw match_exception();
        }

I’ve not reached the pinnacle of macro naming mastery with this one. Making the macros lowercase, and risking a bit more on the conflict side, would make the syntax much more palatable. I call it, as it is, not too bad.

The last ‘else’ clause ensures that if you add a new type to the discriminated union and forget to update one of the ‘MATCH’ clauses, you at least get a run-time error. That is not good: functional languages would give you a compile-time error, which is much better. Maybe with judicious use of templates you could move the error to compile time.

The macros are trivial:

class match_exception : public std::exception {};

#define DU_MATCH(unionName) { auto du_match_var = unionName; if(false) {}

#define DU_CASE_TAG(entry, ...)                        \
    else if(du_match_var.Is##entry()) {                \
        __VA_ARGS__                                    \
    }

#define DU_CASE(entry, ...)                            \
    else if(du_match_var.Is##entry()) {                \
        auto value = du_match_var.Get##entry();        \
        __VA_ARGS__                                    \
    }

#define DU_DEFAULT(...)                                \
    else if(true) { __VA_ARGS__}

#define DU_MATCH_END else { throw match_exception(); } }

Let’s now go back to our initial goal and see how far off we are. We were trying to do the following:

type LivingEntity =
| Person of string  
| Dog of string  


let print = function
    | Person(name) -> printfn "I'm a person named %s" name
    | Dog(name)    -> printfn "I'm a dog named %s" name

And here is what we ended up with:

DU_DECLARE(LivingEntity)
    DU_VALUE(Person,    string)
    DU_VALUE(Dog,        string)
DU_END

auto print(LivingEntity entity) -> void {
    DU_MATCH(entity)
        DU_CASE(Dog,    cout << "I'm a dog named " << value;)
        DU_CASE(Person, cout << "I'm a person named " << value;)
    DU_MATCH_END
}

In our Switch case:

type Switch = On | Off

You get the good-looking:

DU_DECLARE(Switch)
    DU_FLAG(On)
    DU_FLAG(Off)
DU_END

And along the way we lost compile-time type safety in the very common case of adding new types to the discriminated union. That’s bad.

Also some of you would strongly dislike the (ab)use of macros. As for me, it looks workable.