Functional programming in C – Implementation

0.1 Cleanup

Let’s start simple with the cleanup function. First we need the usual barrage of includes. G_BEGIN_DECLS allows the header to be linked in C++.


#include "glib.h"


#include <stdlib.h>
#include <stdio.h>

This feature is GCC specific. It uses __attribute((cleanup(f))) where f is the cleanup function. In this case the cleanup function just frees the memory.

#ifdef __GNUC__

 static inline void __autofree(void *p) {
     void **_p = (void**)p;

auto_clean is a building block that you can use to plug in your own cleanup function. In the common case of memory allocation, I created a wrapper macro auto_free to make it even easier.

#define auto_clean(f)   __attribute((cleanup(f)))
#define auto_free       auto_clean(__autofree)

0.2 Lambdas

I took this one from here.

If you think about it, a lambda is just an expression that returns a function. This macro creates a nested function, called fn, inside a statement expression and returns it. Unfortunately these features are gcc specific.

Remember that lambdas are not allocated on the heap, so you have to be careful on how you used them.

#define lambda(return_type, function_body)                                          
    return_type __fn__ function_body                                                


0.3 Unions

A union type is what you would expect: a struct that contains an unnamed union and a field to specify which type it is. We need the list of types in union_decl to create the kind enum. The usage of __VA_ARGS__ allows to use whatever syntax you want to go into the enum (i.e. specify int values).

Having to specify the the types here is unfortunate as you are going to need to specify it in the union_case macros as well.

I haven’t found another way to do it. If you do, let me know.

#define union_decl(alg, ...)                                                        
typedef struct alg {                                                                
    enum {  __VA_ARGS__ } kind;                                                     
    union {

You specify each type for the union with union_type. That looks pretty good to me.

#define union_type(type, ...)                                                       
    struct type { __VA_ARGS__ } type;

Ideally you shouldn’t need to specify alg here. Perhaps there is a way to avoid doing so.

#define union_end(alg)                                                              
    };} alg;

You can then set the fields on the union type by using the below macro. Notice the usage of the new struct constructor here to allow optional named parameters.

This is a statement, so it cannot go into an expression place. I think I could make it an expression that returns the existing (or a new) union. This is going to be a sceanrio if people are not using gcc statement expressions.

#define union_set(instance, type, ...)                                              
    G_STMT_START {                                                                  
        (instance)->kind     = (type);                                              
        (instance)->type   = (struct type) { __VA_ARGS__ };                         
    } G_STMT_END

This is an utility macro. It is a version of g_assert that you can use in an expression position.

#define g_assert_e(expr) (                                                          
    (G_LIKELY (!expr) ?                                                             
   (void)g_assertion_message_expr (G_LOG_DOMAIN, __FILE__, __LINE__, G_STRFUNC, 
    : (void) 1) )

And this allows to fill the default case in a match statement implemented as a ternary operator. It prints out a text representation of the expression and returns it.

#define union_fail(...) (g_assert_e(((void)(__VA_ARGS__) , false)), (__VA_ARGS__))

The rest of the code is commented out. It is a macro way to do pattern matching. For me, the ternary operator is simpler, but I left it there in case you want to play with it.

#define union_case_only_s(instance, type, ...)                                      
        G_STMT_START {                                                              
        if((instance)->kind == (type)) {                                            
            G_GNUC_UNUSED struct type* it = &((instance)->type); __VA_ARGS__; }     
        else g_assert_not_reached();                                                
        } G_STMT_END

#define union_case_first_s(alg, instance, type, ...)                                
    G_STMT_START {                                                                  
        alg* private_tmp = (instance);                                              
        if(private_tmp->kind == type) {                                             
            G_GNUC_UNUSED struct type* it = &((private_tmp)->type); __VA_ARGS__; }

#define union_case_s(type, ...)                                                     
        else if(private_tmp->kind == type) {                                        
            G_GNUC_UNUSED struct type* it = &((private_tmp)->type); __VA_ARGS__; }

#define union_case_last_s(type, ...)                                                
        else if(private_tmp->kind == type) {                                        
            G_GNUC_UNUSED struct type* it = &((private_tmp)->type); __VA_ARGS__; }  
            else g_assert_not_reached(); } G_STMT_END

#define union_case_default_s(...)                                                   
        else __VA_ARGS__; } G_STMT_END

// Need to use assert here because g_assert* cannot be used in expressions as it expands to do .. while(0)
#define union_case_only(instance, type, ...)                                        
        ( (instance)->kind == (type) ? (__VA_ARGS__) : (assert(false), __VA_ARGS__) )

#define union_case_first(instance, type, ...)                                       
        ( (instance)->kind == (type) ? (__VA_ARGS__) :

#define union_case(instance, type, ...)                                             
        (instance)->kind == (type) ? (__VA_ARGS__) :

#define union_case_last(instance, type, ...)                                        
        (instance)->kind == (type) ? (__VA_ARGS__) : (assert(false), (__VA_ARGS__)) )
#define union_case_default(...)                                                     
        (__VA_ARGS__) )




Functional programming in C

This post/program (as I’m writing it in literate style) is a continuation of my previous posts about functional programming in C++. I promise I’m not going to post about doing it in assembly language (I think) ….

I came to like the simplicity of C very much and got interested in how you could write functional code in it.

There is one irritating thing about C as a viable programming language. Microsoft’s compiler support is not good. It just supports ANSI C, not C99 or C11. So, if you want to use more modern idyoms, you got to use gcc or clang. In this post I assume you use gcc. I will point out the gcc specific extensions.

Also, the C standard library is pretty limited, so I decided to use GLib to complement it. I also created some macros to simplify the syntax. I never understood why people like templates and think macros are evil. It takes me all of 5 minutes to do a -E on GCC to debug the result of a macro expansion. With templates, well, that’s different.

So, in summary, this post is about how you can write functional code in C, perhaps with some gcc extensions and certainly with some macro tricks. Let’s call it funkyC (thanks Ian ). I’m going to show how to use it first. Next post I’m going to show how it’s implemented.

0.1 Discriminated unions in C

With a bit of macro magic, you can get a decent looking discriminated union syntax. First we need to include the correct headers. lutils.h is where all the macros are defined.

#include <glib.h>
#include <stdio.h>
#include <assert.h>
#include <stdbool.h>
#include <signal.h>
#include <string.h>

#include "lutils.h"

Then you can declare a discriminated union with the syntax below. It suffers from two problems: repeting the list of possible types in union_decl and repeating the name of the discriminated union in union_end. Perhaps there is a way to avoid that, but I haven’t found it.

The syntax for the union_type call is the same as you would use inside a struct declaration. We’ll see how this works when we look at lutils.h.

union_decl  (Car, Volvo, Fiat, Ferrari)
union_type      (Volvo,     int x; double y;)
union_type      (Fiat,      char* brand, *model;)
union_type      (Ferrari,   char* brand, *model;)
union_end   (Car)

We can create a Car either on the stack, as below, or on the heap and we can set its value with union_set.

Notice the usage of the new struct construction syntax to simulate optional named parameters in C. I would prefer not to have a dot at the start of the name, but overall it is beautiful (if I can say that myself).

static void printCar(Car*);

static void testUnion() {
    Car c;

    union_set   (&c, Volvo, .x = 3, .y = 4);
    printCar    (&c);
    union_set   (&c, Ferrari, .brand = "Ferrari");
    printCar    (&c);
    union_set   (&c, Fiat, .brand = "Fiat", .model = "234");
    printCar    (&c);

You can then access values inside your discriminated union with normal if statements.

static void testCar(Car*, char const *);

static void printCar(Car* c) {

    if(c->kind == Volvo) {
        int x = c->Volvo.x;
        g_assert_cmpint(x, ==, 3);

Or perhaps you want the equivalent of a match statement in F# (aka an expression that returns a value based on the type of the discriminated union). Notice that, as logical, all the expressions need to return the same type. That’s why union_fail takes a value of the expression type.

    char temp[40];

    char* value =   c->kind == Volvo    ?   itoa(c->Volvo.x, temp, 10)
                  : c->kind == Ferrari  ?   (void)c->Ferrari.model, c->Ferrari.brand
                  : c->kind == Fiat     ?   c->Fiat.model
                                        :   union_fail("Not a valid car type");

If you are willing to be gcc specific, then your expression can be comprised of multiple statements, of which the last one returns the value of the expression. This allows a much more flexible syntax for your match clauses.

#ifdef __GNUC__

    value       =   c->kind == Volvo    ? ({
                                            struct Volvo* it = &c->Volvo;
                                            itoa(it->x, temp, 10);
                  : c->kind == Ferrari  ?   (void)c->Ferrari.model, c->Ferrari.brand
                  : c->kind == Fiat     ?   c->Fiat.model
                                        :   union_fail("Not a valid car type");

    testCar(c, value);

#endif // __GNUC__

We then use the super simple test framework in GLib to make sure that it all works as expected …

static void testCar(Car* c, char const * value) {
    if(c->kind == Volvo) g_assert_cmpstr(value, ==, "3");
    else if (c->kind == Fiat) g_assert_cmpstr(value, ==, "234");
    else if (c->kind == Ferrari) g_assert_cmpstr(value, ==, "Ferrari");
    else g_assert_not_reached();


0.2 Nested functions and lambda variables

GCC has many other cool extensions. A very simple one is nested functions. It allows you to nest functions :-) Look at the definition of doA and f2 in the function below. Putting together nested functions and block statement expressions allows you, with some macro magic, to define lambda functions in your code (from here ).

Remember that lambdas (aka nested functions) are allocated on the stack. They are very fast, but you cannot store their pointer into a gloal table (unless such table is used while the stack for this function is alive).

In such cases, you have to create a proper function. But for the other 90% of use cases, they work pretty well. They are lambdas in the spirit of C: very fast, but error prone …

#ifdef __GNUC__

static void testLambda() {

    typedef int (*aFunc) (int);

    aFunc doA(aFunc f){

        int k(int i) {
            return f(i) + 3;
        return k;

    int clos = 2;

    int f2 (int i) {return i;}
    aFunc b = doA(lambda (int, (int p) {return p + clos;}));

    g_assert_cmpint(b(3), ==, 8);

0.3 Automatic cleanup of local variables

This is not a functional topic per se, but something that always annoyed me tremendously about C. The fact that you cannot define the equivalent of the using statement in C#, or destructors in C++. Well, now you can. Or not?

Again, if you are willing to be GCC specific, you can use an attribute (more on this in the upcoming implementation post) to associate a cleanup function that gets called when your variable goes out of scope. In this case, I wrapped the free case in a nice looking macro.

But that doesn’t really work. You would certainly want such function to be called on any kind of exit from the enclosing scope (aka via exit(), abort() or longjmp()). Alas, that doesn’t happen.

This reduces the usefulness of this mechanism tremendously. Probably too much in that it lulls you into a false sense of security. You still need to free your resources in the error path of your application.

static void testAutomaticCleanup() {
    char* stack_alloc() {
        auto_free char* b = g_malloc(10000);
        memset(b, '#', 10000);
        return b;

    char * c = stack_alloc();
    g_assert(*c != '#');


0.4 Data structures

GLib comes with a vast library of data structures to use, not too different from the .NET framework or Java. For example, below you have a single linked list …

static void testGLib() {
     GSList* list = NULL;

     list = g_slist_append(list, "Three");
     list = g_slist_prepend(list, "first");
     g_assert_cmpint(g_slist_length(list), ==, 2);

     list = g_slist_remove(list, "first");
     g_assert_cmpint(g_slist_length(list), ==, 1);


0.5 Wrapping up

There you go, rising the level of abstraction of C, still keeping it very fast (if you are willing to be gcc bound).

There are other features in functional programming languages that are not in this post. Maybe I’ll get around to macro my way into them eventually, maybe not.

In the next post we’ll talk about how all of this is implemented. Below is the code for running the testcases.

int runTests(int argc, char* argv[]) {
    g_test_init(&argc, &argv, NULL);

    if(g_test_quick()) {
        g_test_add_func("/utils/automatic_cleanup", testAutomaticCleanup);
        g_test_add_func("/utils/lambda", testLambda);
        g_test_add_func("/utils/Union", testUnion);
        g_test_add_func("/utils/SList", testGLib);

    return g_test_run();

int main(int argc, char *argv[]) {
    return runTests(argc, argv);

LLite : language friendly literate programming

1 Main ideas

The code for this post is here. The source used to generate it is here. I also attached a pdf file to give an idea of the final result.

My interest in literate programming comes from some realizations on my part:

  • When I go back to code that I have written some time ago, I don’t remember my reasoning
  • When I write a blog post, my code seems to be better. Perhaps explaining things to people encourages me to be more precise
  • I like to think top down, but the compiler forces me to write code bottom up, starting from details and going to higher level concepts

1.1 Unhappiness with existing tools

Many of the existing literate programming tools work similarly to the original CWeb.

  • They have a tangle program that goes over your file and extract something that the compiler can understand
  • They have a weave program that extracts from your file something that the document generator can understand

This scheme has the unfortunate limitation of breaking your code editor. Given that your file is not a valid code file anymore, the editor starts misbehaving (i.e. intellisense breaks). The debugger starts to get confused (albeit people tried to remediate that with cleaver use of #line. If your language has an interactive console, that would not work either.

1.2 A different interpretation

The main idea of this program is to add your narrative to the comment part of a code file by extending the comment tag (i.e. in C you could use /** ). This keeps editor, debugger and interactive console working.

The weave phase as been retained and what you are reading is the program that goes over your code file and extracts a nicely formatted (for this program in markdown format) file that can then be translated to HTML, PDF, latex, etc…

You got that? The document you are reading now is the program.

1.3 Multi-language, multi-document format

LLite works for any programming language, assuming it has open and close comment character sequences, and any documentation format, assuming it has open and close code character sequences (aka allows you to delimitate your code somehow), or it needs the code to be indented. This document uses markdown (with Pandoc extensions to generate table of contents and titles).

1.4 Usage

You invoke the program as documented below. The first set of parameters lets you choose the symbols that delimitate your language comments (or the default symbols below). The second set of parameters lets you choose how your target documentation language treats code. Either it delimits it with some symbols or it indents it.

module LLite

let langParamsTable     = [ "fsharp", ("(*" + "*", "*" + "*)") // The + is not to confuse the parser
                            "c", ("/**", "**/")
                            "csharp", ("/**", "**/")
                            "java", ("/**", "**/")] |> Map.ofList

let languages = langParamsTable |> Map.fold (fun state lang _ -> state + lang + " ") ""

let usage   = sprintf @"
Usage: llite inputFile parameters
One of the following two sets of parameters is mandatory
    -no string : string opening a narrative comment
    -nc string : string closing a narrative comment
    -l language: where language is one of (%s)

One of the following two sets of parameters is mandatory
    -co string : string opening a code block
    -cc string : string closing a code block
    -indent N  : indent the code by N whitespaces

The following parameters are optional:
    -o outFile : defaults to the input file name with mkd extension" languages

let getLangNoNC lang    =
    match Map.tryFind lang langParamsTable with
    | Some(no, nc) -> no, nc
    | None -> failwith (lang + " is not a valid programming language")

1.5 Programming Languages limitations

One of the main tenets of literate programming is that the code should be written in the order that facilitates exposition to a human reader, not in the order that makes the compiler happy. This is very important.

If you have written a blog post or tried to explain a codebase to a new joiner, you must have noticed that you don’t start from the top of the file and go down, but jump here and there trying to better explain the main concepts. Literate programming says that you should write your code the same way. But in our version of it, the compiler needs to be kept happy because the literate file is the code file.

Some ingenuity is required to achieve such goal:

  • In C and C++ you can forward declare functions and classes, also class members can be in any order
  • In C#, Java, VB.NET, F# (the object oriented part) you can write class members in any order
  • In the functional part of F# you do have a problem (see later in this doc)

The F# trick below is used in the rest of the program. You’ll understand its usage naturally by just reading the code

let declare  = ref Unchecked.defaultof

2 Implementation

At the core, this program is a simple translator that takes some code text and return a valid markdown/whatever text. We need to know:

  • The strings that start and end a narrative comment (input symbols)
  • How to translate a code block into a document. We support these variations:
    • Indented: indent them by N spaces
    • Surrounded by startCode/endCode strings
type CodeSymbols =
    | Indent of int                 // indentation level in whitespaces
    | Surrounded of string * string // start code * end code

type Options = {
    startNarrative  : string
    endNarrative    : string
    codeSymbols     : CodeSymbols

let translate   = declare string -> string>

2.1 Going over the parse tree

We need a function that takes a string and returns a list with the various blocks. We can then go over each block, perform some operations and, in the end, transform it back to text

type Block =
| Code      of string
| Narrative of string

let blockize = declare string -> Block list>

I could have used regular expressions to parse the program, but it seemed ugly. I could also have used FsParsec, but that brings with it an additional dll. So I decided to roll my own parser. This has several problems:

  • It is probably very slow
  • It doesn’t allow narrative comments inside comments, in particular it doesn’t allow the opening comment
  • It doesn’t allow opening comments in the program code (not even inside a string)

The latter in particular is troublesome. You’ll need to use a trick in the code (i.e. concatenating strings) to foul this program in not seeing an opening comment, but it is inconvenient.

With all of that, it works.

TODO: consider switching to FsParsec

2.1.1 Lexer

The lexer is going to process list of characters. We need functions to check if a list of characters starts with certain chars and to return the remaining list after having removed such chars.

BTW: these functions are polymorphic and tail recursive

let rec startWith startItems listToCheck =
    match startItems, listToCheck with
    | [], _             -> true
    | _ , []            -> false
    | h1::t1, h2::t2  when h1 = h2  -> startWith t1 t2
    | _, _              -> false

let rec remove itemsToRemove listToModify =
    match itemsToRemove, listToModify with
    | [], l             -> l
    | _ , []            -> failwith "Remove not defined on an empty list"
    | h1::t1, h2::t2  when h1 = h2  -> remove t1 t2
    | _, _              -> failwith "itemsToRemove are not in the list"

let isOpening options       = startWith (List.ofSeq options.startNarrative) 
let isClosing options       = startWith (List.ofSeq options.endNarrative)
let remainingOpen options   = remove (List.ofSeq options.startNarrative)
let remainingClose options  = remove (List.ofSeq options.endNarrative)

This is a pretty basic tokenizer. It just analyzes the start of the text and returns what it finds. It also keeps track of the line number for the sake of reporting it in the error message.

let NL = System.Environment.NewLine

type Token =
| OpenComment   of int
| CloseComment  of int
| Text          of string

let tokenize options source =

    let startWithNL = startWith (Seq.toList NL)

    let rec text line acc = function
        | t when isOpening options t    -> line, acc, t 
        | t when isClosing options t    -> line, acc, t
        | c :: t as full                ->
            let line' = if startWithNL full then line + 1 else line
            text line' (acc + c.ToString()) t
        | []                            -> line, acc, [] 
    let rec tokenize' line acc = function
        | []                            -> List.rev acc
        | t when isOpening options t    -> tokenize' line
                                            (OpenComment(line)::acc)  (remainingOpen options t)
        | t when isClosing options t    -> tokenize' line
                                            (CloseComment(line)::acc) (remainingClose options t)
        | t                             ->
            let line, s, t'= text line "" t
            tokenize' line (Text(s) :: acc) t'

    tokenize' 1 [] (List.ofSeq source)

2.1.2 Parser

The parse tree is just a list of Chunks, where a chunk can be a piece of narrative or a piece of code.

type Chunk =
| NarrativeChunk    of Token list
| CodeChunk         of Token list

let parse options source =

    let rec parseNarrative acc = function
        | OpenComment(l)::t         ->
            failwith ("Don't open narrative comments inside narrative comments at line "
                                                                                    + l.ToString())
        | CloseComment(_)::t        -> acc, t
        | Text(s)::t                -> parseNarrative (Text(s)::acc) t
        | []                        -> failwith "You haven't closed your last narrative comment"

    let rec parseCode acc = function
        | OpenComment(_)::t as t'   -> acc, t'
        | CloseComment(l)::t        -> parseCode (CloseComment(l)::acc) t
        | Text(s)::t                -> parseCode (Text(s)::acc) t
        | []                        -> acc, []
    let rec parse' acc = function
        | OpenComment(_)::t         ->
            let narrative, t' = parseNarrative [] t
            parse' (NarrativeChunk(narrative)::acc) t' 
        | Text(s)::t                ->
            let code, t' = parseCode [Text(s)] t
            parse' (CodeChunk(code)::acc) t'
        | CloseComment(l)::t           ->
            failwith ("Don't insert a close narrative comment at the start of your program at line "
                                                                                    + l.ToString())
        | []                -> List.rev acc

    parse' [] (List.ofSeq source)

2.1.3 Flattener

The flattening part of the algorithm is a bit unusual. At this point we have a parse tree that contains tokens, but we want to reduce it to two simple node types containing all the text in string form.

TODO: consider managing nested comments and comments in strings (the latter has to happen in earlier phases)

let flatten options chunks =
    let tokenToStringNarrative = function
    | OpenComment(l) | CloseComment(l)  -> failwith ("Narrative comments cannot be nested at line "
                                                                                    + l.ToString())
    | Text(s)                           -> s

    let tokenToStringCode = function
    | OpenComment(l)                -> failwith ("Open narrative comment cannot be in code at line"
                                                                + l.ToString()) +
                                                 ". Perhaps you have an open comment in" +
                                                 " a code string before this comment tag?"
    | CloseComment(_)               -> string(options.endNarrative |> Seq.toArray)
    | Text(s)                       -> s

    let flattenChunk = function
    | NarrativeChunk(tokens)             ->
        Narrative(tokens |> List.fold (fun state token -> state + tokenToStringNarrative token) "")
    | CodeChunk(tokens)                  ->
        Code(tokens |> List.fold (fun state token -> state + tokenToStringCode token) "")

    chunks |> List.fold (fun state chunk -> flattenChunk chunk :: state) [] |> List.rev

We are getting there, now we have a list of blocks we can operate upon

blockize := fun options source -> source |> tokenize options |> parse options |> flatten options

2.2 Narrative comments phases

Each phase is a function that takes the options and a block list and returns a block list that has been processed in some way.

type Phase = Options -> Block List -> Block List

let removeEmptyBlocks   = declare
let mergeBlocks         = declare
let addCodeTags         = declare

let processPhases options blockList = 
    |> !removeEmptyBlocks   options
    |> !mergeBlocks         options
    |> !addCodeTags         options

We want to manage how many newlines there are between different blocks, because we don’t trust the programmer to have a good view of how many newline to keep from comment blocks and code blocks. We’ll trim all newlines from the start and end of a block, and then add our own.

let newLines = [|'\n';'\r'|]

type System.String with
    member s.TrimNl () = s.Trim(newLines) 

2.2.1 Remove empty blocks

There might be empty blocks (i.e. between two consecutive comment blocks) in the file. For the sake of formatting the file beautifully, we want to remove them.

let extract = function
    | Code(text)        -> text
    | Narrative(text)   -> text

removeEmptyBlocks := fun options blocks ->
                        blocks |> List.filter (fun b -> (extract b).TrimNl().Trim()  "")

2.2.2 Merge blocks

Consecutive blocks of the same kind need to be merged, for the sake of formatting the overall text correctly.

TODO: make tail recursive

let rec mergeBlockList = function
    | []        -> []
    | [a]       -> [a]
    | h1::h2::t -> match h1, h2 with
                   | Code(t1), Code(t2)             -> mergeBlockList (Code(t1 + NL + t2)::t)
                   | Narrative(n1), Narrative(n2)   -> mergeBlockList(Narrative(n1 + NL + n2)::t)
                   | _, _                           -> h1::mergeBlockList(h2::t)

mergeBlocks := fun options blocks -> mergeBlockList blocks

2.2.3 Adding code tags

Each code block needs a tag at the start and one at the end or it needs to be indented by N chars.

let indent n (s:string) =
    let pad = String.replicate n " "
    pad + s.Replace(NL, NL + pad)

addCodeTags := fun options blocks ->
    match options.codeSymbols with
    | Indent(n)         ->
        blocks |> (function Narrative(s) as nar -> nar | Code(s) -> Code(indent n s))
    | Surrounded(s, e)  -> 
        blocks |> (function
                            | Narrative(text)   -> Narrative(NL + text.TrimNl() + NL)
                            | Code(text)        -> Code(NL + s + NL + text.TrimNl() + NL + e + NL))

2.2.4 Flatten again

Once we have the array of blocks, we need to flatten them (transform them in a single string), which is trivial, and then finally implement our original translate function.

let sumBlock s b2 = s + extract b2

let flattenB blocks = (blocks |> List.fold sumBlock "").TrimStart(newLines)

translate := fun options text -> text |> !blockize options |> processPhases options |> flattenB

2.3 Parsing command line arguments

Parsing command lines involves writing a function that goes from a sequence of strings to an input file name, output file name and Options record

let parseCommandLine = declare string * string * Options>

To implement it, we are going to use a command line parser taken from here. The parseArgs function takes a sequence of argument values and map them into a (name,value) tuple. It scans the tuple sequence and put command name into all subsequent tuples without name and discard the initial (,) tuple. It then groups tuples by name and converts the tuple sequence into a map of (name,value seq)

For now, I don’t need the value seq part as all my parameters take a single argument, but I left it in there in case I will need to pass multiple args later on.

open  System.Text.RegularExpressions

let (|Command|_|) (s:string) =
  let r = new Regex(@"^(?:-{1,2}|\/)(?\w+)[=:]*(?.*)$",RegexOptions.IgnoreCase)
  let m = r.Match(s)
  if m.Success
    Some(m.Groups.["command"].Value.ToLower(), m.Groups.["value"].Value)

let parseArgs (args:string seq) =
  |> (fun i -> 
                    match i with
                    | Command (n,v) -> (n,v) // command
                    | _ -> ("",i)            // data
  |> Seq.scan (fun (sn,_) (n,v) -> if n.Length>0 then (n,v) else (sn,v)) ("","")
  |> Seq.skip 1
  |> Seq.groupBy (fun (n,_) -> n)
  |> (fun (n,s) -> (n, s |> (fun (_,v) -> v) |> Seq.filter (fun i -> i.Length>0)))
  |> Map.ofSeq

let paramRetrieve (m:Map) (p:string) = 
  if Map.containsKey p m
  then Some(m.[p])
  else None

This is the main logic of parameter passing. Note that we give precedence to the -l and -indent parameters, if present.

This is a function that goes from the map of command line parameters to the input file name, output file name and options. With that we can finally define the original parseCommandLine.

let safeHead errMsg s = if s |> Seq.isEmpty then failwith errMsg else s |> Seq.head 

let paramsToInputs paramsMap =
    let single p er     = match paramRetrieve paramsMap p with | Some(k) -> Some(k |> safeHead er)
                                                               | None -> None
    let get p s         = match paramRetrieve paramsMap p with |Some(k) -> k |> safeHead s
                                                               | None -> failwith s
    let defaultP p q er = match paramRetrieve paramsMap p with | Some(k) -> k |> safeHead er
                                                               | None -> q

    let inputFile       = get "" "You need to pass an input file"
    let outputFile      = defaultP  "o"
                                    (System.IO.Path.GetFileNameWithoutExtension(inputFile) + ".mkd")
                                    "You must pass a parameter to -o"

    let no, nc          = match single "l" "You must pass a language parameter to -l" with
                          | Some(l) -> getLangNoNC l
                          | None    ->
                                get "no" "The no (narrative open) parameter is mandatory, if no -l specified",
                                get "nc" "The nc (narrative close) parameter is mandatory, if no -l specified"

    let codeSymbs       = match single "indent" "You must pass a whitespace indentation number to -indent" with
                          | Some(n)     ->
                                let success, value = System.Int32.TryParse n
                                if success
                                    then Indent(value)
                                    else failwith "-i accepts just an integer value as parameter"                          
                          | None        ->
                                    get "co" "The co (code open) parameter is mandatory, if no -indent specified",
                                    get "cc" "The cc (code close) parameter is mandatory")
    inputFile, outputFile, {
        startNarrative  = no
        endNarrative    = nc
        codeSymbols     = codeSymbs

parseCommandLine := parseArgs >> paramsToInputs

2.4 Main method

We can then write main as the only side effect function in the program. Here is where the IO monad would live …

let banner  = "LLite : language friendly literate programming\n"

let myMain args =
        printfn "%s" banner

        let inputFile, outputFile, options = !parseCommandLine args
        let input       = System.IO.File.ReadAllText inputFile
        let output      = !translate options input
        System.IO.File.WriteAllText (outputFile, output)
    | e ->
        printfn "%s" "Failure"
        printfn "%s" e.Message 
        printfn "%s" usage
#if DEBUG 
        printfn "\nDetailed Error Below:\n%A" e

3 An aside: forward declaring functions in F#

3.1 A simple solution

You can achieve something somehow similar to forward declaration by the ’declare ’dirty trick used in this program. Whenever you want to do a forward declaration of a function , or variable, you can type:

let testDeclare() =

    let add = declare float>

    let ``function where I want to use add without having defined it`` nums = nums |> !add

This creates a ref to a function from float to float. It looks a bit like an Haskell type declaration. You can then use such function as if it were actually define and delay his definition to a later point in time when you are ready to explain it.

When you are ready to talk about it, you can then define it with:

    add := fun x -> x + 1.

The syntax is not too bad. You get that often-sought Haskell like explicit type declaration and you can regex the codebase to create an index at the end of the program (maybe).

But is it too slow? After all, there is one more indirection call for each function call.

Let’s test it: enable #time in F# interactive and execute timeNormalF and timeIndirectF varying sleepTime and howManyIter until you convince yourself that it is ok (or not).

    let sleepTime   = 50
    let howManyIter = 100
    let normalF x   = System.Threading.Thread.Sleep sleepTime
    let indirectF   = declare unit>
    indirectF      := fun x -> System.Threading.Thread.Sleep sleepTime

    let timeNormalF     = [1..howManyIter] |> List.iter normalF
    let timeIndirectF   = [1..howManyIter] |> List.iter !indirectF

3.2 A correct solution (but ugly)

Unfortunately, there is a big problem with all of the above: it doesn’t work with generic functions and curried function invocations. The code below works in all cases, but it is ugly for the user to use. In this program I’ve used the beautiful, but incorrect, syntax.

type Literate() =
    static member Declare  (ref : obj ref) (x : 'a) : 'b =
        unbox <| (unbox obj> !ref) x
    static member Define (func : 'a -> 'b) (ref : obj ref) (f : 'a -> 'b) =
        ref := box (unbox >> f >> box)

// Declaration    
let rec id (x : 'a) : 'a = Literate.Declare idImpl x
and idImpl = ref null

// Usage
let f () = id 100 + id 200

// Definition
Literate.Define id idImpl (fun x -> x)

3.3 The traditional way

The traditional way of doing something like this is by passing the function as a parameter in a manner similar to the below.

// Declaration and usage intermingled
let calculate' (add: int -> int -> int) x y = add x y * add x y

// Definition
let add x y = x + y

let calculate = calculate' add

To my way of seeing, this is the most cumbersome solution and conceptually dishonest because you are not parametrizing your function for technical reasons, but just for the sake of explaining things. In a way, you are changing the signature of your functions for the sake of writing a book. That can’t be right …

Writing functional code in C++ V – Miscellaneous and conclusions

Just a couple of trivialities and my parting thoughts.

Nested functions

If your language has lambdas, you don’t need nested functions support because you can implement them using it.

I am a heavy user of nested functions, but I’m of two minds about it. On one side, I like that they sit close to where they are used, avoiding going outside the main function body to understand them. I also like that you don’t need to pass a lot of parameters to them, as they capture the function locals. On the other side, they end up creating the impression that your functions are very long and so, in my eyes, they occasionally reduce readability. The IDE helps you out there (at least VS 11) by allowing you to collapse the lambdas.

An example of trivial case is below:

    int x = 3;
    auto sumX = [&] (int y) { return x + y;};

    BOOST_CHECK_EQUAL(sumX(2), 3+2);

And here is a more realistic one (not written in a functional style), where readability is reduced by the three nested functions (among many other things):

bool condor(
            boost::gregorian::date now,  // date to evaluate 
            double spot,                 // spot price underlying
            double v,                    // ATM volatility
            double r,                    // risk free rate
            double step,                 // % of spot price to keep as distance between wings
            int minCallShortDist,        // min distance from the short call strike in steps
            int minPutShortDist,         // min distance from the short put strike in steps
            int minDays,                 // min number of days to expiry
            int maxDays,                 // max number of days to expiry
            double maxDelta,             // max acceptable delta value for shorts in steps
            double minPremium,           // min accepted premium as % of step
            Condor& ret                  // return value
    // convert params to dollar signs
    auto stepPr            = round(step * spot);
    auto toUSD             = [stepPr] (double x) { return round(stepPr * x);};
    auto minCpr            = toUSD( minCallShortDist );
    auto minPpr            = toUSD( minPutShortDist );
    auto premiumPr         = toUSD( minPremium );

    // calc strike values for short legs
    auto atm               = round(spot / stepPr) * (long) stepPr;
    auto callShort         = atm + minCpr;
    auto putShort          = atm - minPpr;
    auto addDays           = [](boost::gregorian::date d, int dys) -> boost::gregorian::date {
        using namespace boost::gregorian;

        auto toAdd         = days(dys);
        auto dTarget       = d + toAdd;

        return  dTarget;

    // calc min & max allowed expiry dates
    auto minDate           = addDays(now, minDays);
    auto maxDate           = addDays(now, maxDays);
    auto expiry            = calcExpiry(now, 0);

    // find first good expiry
    while(expiry < minDate)
        expiry          = calcExpiry(expiry, +1);

    Greeks g;
    auto scholes           = [=, &g] (double strike, int days, bool CorP) {
        return blackScholesEuro(spot, strike, days, CorP, v, r, g, true);

    // find a condor that works at this expiry
    auto findCondor        = [=, &g, &ret] (int days) -> bool {
        ret.shortCallStrike                = callShort;
        ret.shortPutStrike                 = putShort;
        auto shCallPremium                 = 0.0;
        auto shPutPremium                  = 0.0;

        // find short call strike price < maxDelta
        while(true) {
            shCallPremium                  = scholes(ret.shortCallStrike, days, true);
            if( <= maxDelta)
                ret.shortCallStrike        += stepPr;

        // find short put strike price < maxDelta
        while(true) {
            shPutPremium                   = scholes(ret.shortPutStrike, days, false);
            if( (- <= maxDelta)
                ret.shortPutStrike         -= stepPr;

        // check premium is adeguate
        ret.longCallStrike                = ret.shortCallStrike + stepPr;
        ret.longPutStrike                 = ret.shortPutStrike  - stepPr;
        auto lgCall                       = scholes(ret.longCallStrike, days, true);
        auto lgPut                        = scholes(ret.longPutStrike,  days, false);
        ret.netPremium                    = shCallPremium + shPutPremium - lgCall - lgPut;

        return ret.netPremium > premiumPr;

    // increases the expiry until it finds a condor or the expiry is too far out
    while (expiry < maxDate) {
        auto days        = (expiry - now).days();
        if(findCondor(days)) {
            ret.year     = expiry.year();
            ret.month    = expiry.month();
            return true;
        expiry           = calcExpiry(expiry, +1);

    return false;

That is quite a mouthful, isn’t it? But really the function is not that long. It is all these nested functions that makes it seems so.


Tuples are another feature toward which I have mixed feelings. They are really useful to return multiple results from a function and for rapid prototyping of concepts. But they are easy to abuse. Creating Records is almost always a better, safer way to craft your code, albeit more verbose.

The standard C++ has an excellent tuple library that makes working with them almost as easy as in mainstream functional languages. The below function shows creating them, getting their value, passing them to functions and deconstructing them:

    auto t = make_tuple("bob", "john", 3, 2.3);
    BOOST_CHECK_EQUAL(get<0>(t), "bob");
    BOOST_CHECK_EQUAL(get<2>(t), 3);

    // yep, compiler error
    //auto k = get<10>(t);

    auto t2(t);
    BOOST_CHECK(t2 == t);

    // passing as argument, returning it
    auto f = [] (tuple<string, string, int, double> t) { return t;};
    auto t1 = f(t);
    BOOST_CHECK(t == t1);

    // automatic deconstruction
    string s1; string s2; int i; double d;
    std::tie(s1, s2, i, d) = f(t);
    BOOST_CHECK_EQUAL(s1, "bob");

    // partial reconstruction
    string s11;
    std::tie(s11, ignore, ignore, ignore) = f(t);
    BOOST_CHECK_EQUAL(s11, "bob");


I’m sure there are some fundamental functional features that I haven’t touched on (i.e. currying, more powerful function composition, etc…).  Despite that, I think we have enough material here to start drawing some conclusions. To start with, where is C++ deficient compared to mainstream functional languages for the sake of writing functional code ?

  • Compiler errors: if you make a well natured error, you often get back from the compiler a long page of rubbish if you are using templates (and you are). You put your goggles on and start swimming through it to find the nugget of gold that is the root cause of your problem. 
    I can hear you C++ gurus screaming: come on, come on, it’s not that bad. After a while you get used to it. You become proficient in rubbish swimming. That makes me think of the story of that guy who lived in a strange neighbourhood where, as soon as you open the door of you house, someone kicks you in the nuts. His wife suggested that they moved to a better part of town, but the man replied: “no worries, I got used to it”.
  • Syntax oddity: we did our best to make the gods of syntax happy in this series of articles, but we are still far removed from Haskell, F#, etc beauty…
  • Advanced features: if you go beyond the basics of functional languages, there is a plethora of interesting features that just make your life so much easier and bring your code to an higher level of abstraction. Things like monads (i.e. f# async workflow), type providers, Haskell’s lazy execution, Haskell’s super strong type system, Closure’s transactional memory etc… Boost is trying to bring many of these things to C++, but there is still a big gap to be filled.
  • Temptation to cheat: you can cheat in many functional programming languages (i.e. having mutable variables), but nowhere the temptation is as strong as in C++. You need an iron discipline not to do it. The fact that sometimes cheating is the right thing to do makes it even worse.

Overall, am I going to use C++ now as my main programming language? Not really. I don’t have one of those. My general view is to always use the best language for the job at hand (yes, I even use Javascript for web pages).

C++ has some unique characteristics (i.e. speed, cross compilation & simple C interfacing). I’m going to use it whenever I need these characteristics, but now I’ll be a happier user knowing that I can write decent functional code in it.

Writing functional code in C++ IV – Algebraic datatypes

And here comes the guilt bit. I have the strong suspicion (but not certainty) that what I am doing here can be done with templates, but didn’t take the time to do it. With that out of the way, let’s go.

Code for this post is here. Thanks to Steve Bower and Andy Sawyer for reviewing it.

Algebraic datatypes (discriminated unions in F#) are a powerful concept in functional programming. They are the main way to represent type variation in your program. Very roughly, where object orientation uses derivation, functional programming uses algebraic datatypes. An entire book could be written on the theory of this, but the goal of this post is to see how we can map them to C++ without loosing C++ness.

When talking about this with C++ programmers, they always point me to boost variant. That doesn’t do it for me for several reasons.

First of all, boost variants represent one of a fixed collection of types. Algebraic datatypes represent one of a fixed collection of named types. That means that a simple thing like the code below cannot be represented as variant:

type LivingEntity =
| Person of string  // name
| Dog of string     // name

Ok, ok maybe you could represent it by ‘typifing’ things using boost strong typedef, but things get ugly syntactically. Moreover, a lot of time the name is all you care about …

type Switch = On | Off

Are we going to strong typedef for such a simple thing? Oh boy. Even accepting the syntactic weight of all this, there are other issues in play. Discriminated unions are used extensively in functional programs. So you want a nice syntax to work with them Something like the below F# syntax:

let print living =
    match living with
    | Person(name) -> printfn "I'm a per named %s" name
    | Dog(name)    -> printfn "I'm a dog named %s" name

Which could be made even sweeter by using the ‘function’ keyword as below:

let print = function
    | Person(name) -> printfn "I'm a per named %s" name
    | Dog(name)    -> printfn "I'm a dog named %s" name

In boost variant, you either use the get<type> functions or you write a visitor function. In the first case you are probably going to write a chain of ‘if’ statements or a ‘switch’ statement. Both are confusing and come with plenty of syntactic weight. I don’t really want to write a visitor like the one below for each ‘match’ in my code. The magic is gone.

class times_two_visitor
    : public boost::static_visitor<>

    void operator()(int & i) const
        i *= 2;

    void operator()(std::string & str) const
        str += str;


Ok, so boost variant doesn’t really work for this. Remember that our overarching goal was to stay close to C++. The language itself has something that comes pretty close to what we want in the form of a union, or better a tagged union. Again, the types are not named, but maybe we can work that in.

It turns out that Jared here did all the hard work. The general idea is to use macros to hide the construction of a tagged union with methods to test the type and return the contained value.

For example this code:

    DU_VALUE(Person,    string)
    DU_VALUE(Dog,       string)

Becomes something like:

struct LivingEntity {
        LivingEntity() {}
        unsigned int m_kind;
        static LivingEntity Person(const string& value) {
            LivingEntity unionValue;
            unionValue.m_kind = 19;
            unionValue.m_Person = value;
            return unionValue; }

        bool IsPerson() const {
            return m_kind == 19;
        const string& GetPerson() const {
            (void)( (!!(m_kind == 19)) || (_wassert(L"m_kind == __LINE__", L"c:discriminated_union.cpp", 19), 0) );
            return m_Person; }
        string GetPerson() {
            (void)( (!!(m_kind == 19)) || (_wassert(L"m_kind == __LINE__", L"c:discriminated_union.cpp", 19), 0) );
            return m_Person; }
string m_Person;

You can see the outline of a tagged union (i.e. m_kind) with a constructor for each type (i.e. Person) and methods to test for at type and return its value. You can also see the storage for the value (i.e. m_Person).

The only thing in DU_DECLARE that is different from Jared’s solution is the typedef below that allows not repeating the LivingEntity name in each DU_VALUE.

#define DU_DECLARE(name)                        \
    struct name {                               \
    private:                                    \
        typedef name unionName;                 \
        name() {}                               \
        unsigned int m_kind;                    \
#define DU_VALUE(entryName, entryType)                                                                      \
        static unionName entryName(const entryType& value) {                                                \
            unionName unionValue;                                                                           \
            unionValue.m_kind = __LINE__;                                                                   \
            unionValue.m_##entryName = value;                                                               \
            return unionValue;  }                                                                           \
        bool Is##entryName() const { return m_kind == __LINE__;}                                            \
        const entryType& Get##entryName() const { assert(m_kind == __LINE__); return m_##entryName; }       \
        entryType Get##entryName() { assert(m_kind == __LINE__); return m_##entryName; }                    \
    private:                                                                                                \
        entryType m_##entryName;                                                                            \

With all of that at your disposal it becomes easy to write:

    auto entity = LivingEntity::Dog("Bob");

        DU_CASE(Dog,   BOOST_CHECK_EQUAL(value, "Bob");)

There are some beautiful things about this. First of all, the construction of any of such types is super simple. You even get intellisense!

Moreover the ‘value’ variable contains whatever was passed in the constructor for the object. So this is semantically equivalent, if not syntactically, to the match statement in F#.

Obviously the code part is not limited to a single instruction:

            cout << "I should be here";
            BOOST_CHECK_EQUAL(value, "Bob");

And for those of you addicted to braces, I venture:

            cout << "I should be here";
            BOOST_CHECK_EQUAL(value, "Bob");

They all work with the same macro definition. They expand to something along the line of:

    if(false) {}
        else if(entity.IsDog()) {
            auto value = entity.GetDog();
            BOOST_CHECK_EQUAL(value, "Bob");
        else if(entity.IsPerson()) {
            auto value = entity.GetPerson();
        else {
            throw match_exception();

I’ve not reached the pinnacle of macro naming mastering with this one. Making them lowercase and risking a bit more on the conflict side would make the syntax much more palatable. I call it, as it is, not too bad.

The last ‘else’ clause assures you then if you add a new type to the discriminated union and forget to update one of the ‘MATCH’ clauses at least you get a run time error. That is not good. Functional languages would give you a compile time error, which is much better. Maybe with judicious use of templates you can bring the error at compile time.

The macros are trivial:

class match_exception: std::exception {};

#define DU_MATCH(unionName) { auto du_match_var = unionName; if(false) {}

#define DU_CASE_TAG(entry, ...)                        \
    else if(du_match_var.Is##entry()) {                \
        __VA_ARGS__                                    \

#define DU_CASE(entry, ...)                            \
    else if(du_match_var.Is##entry()) {                \
        auto value = du_match_var.Get##entry();        \
        __VA_ARGS__                                    \

#define DU_DEFAULT(...)                                \
    else if(true) { __VA_ARGS__}

#define DU_MATCH_END else {throw new match_exception();} }

Let’s now go back to our initial goal and see how far off we are. We were trying to do the following:

type LivingEntity =
| Person of string  
| Dog of string  

let print = function
    | Person(name) -> printfn "I'm a per named %s" name
    | Dog(name)    -> printfn "I'm a dog named %s" name

And here is what we ended up with:

    DU_VALUE(Person,    string)
    DU_VALUE(Dog,        string)

auto print(LivingEntity en) -> void {
        DU_CASE(Dog,    cout << "I'm a dog named " << value;)
        DU_CASE(Person, cout << "I'm a per named " << value;) 

In our Switch case:

type Switch = On | Off

You get the good looking :


And along the way we lost compile time type safety in the very common case of adding new types to the discriminated union. That’s bad.

Also some of you would strongly dislike the (ab)use of macros. As for me, it looks workable.

Writing functional code in C++ III – Performance of different allocation schemes

Now we know how to represent records and we know how to operate on them using a nice F# like syntax. But how do we store our record in a data structure in the first place?

Code for this post is here. Thanks to Andy Sawyer and Steve Bower for reviewing this.

As it is often the case, C++ gives you many options that are not available in standard functional languages. A mixed blessing of some sort.

  1. You can store them by value or by pointer
  2. If you store them by pointer, you can use normal pointers or smart pointers
  3. If you store them by smart pointer, you can use different ones

Well, storing them by value is the simplest option, but it has two shortcomings:

  1. You pay the price of a full record copy whenever the object need to be copied (i.e. when you add/remove records to/from a vector and whenever the vector needs to resize itself)
  2. If you store a record in two containers, you get two different copies of it. If you modify one, the other won’t be modified

The second issue is not a problem in our scenario as records, by definition, are immutable and structurally equivalent (aka equality for them doesn’t mean pointer equality). The first issue is more tricky. Our records get a copy constructor by default which does a ‘memcpy’ of the object. That should be fast. But fast enough?

To answer this question (and more), I embarked in a little performance test that encompasses most ways to store a pod type in an vector like data structure. For all my test I used this struct:

struct Record {
    int Id;
    char k1[2];
    char k2[2];
    char k3[2];
    char k4[2];
    char k5[2];
    char k6[2];
    char mem[bigBlock];

    void Lock() {}
    void Unlock() {}

By recompiling with different values for ‘bigBlock’ I can test what happens when the record size grows. In all tests a record is initialized with this function:

void record_init(Record& r, int i) {

    r.Id = i;
    strcpy(r.k1, "0");
    strcpy(r.k2, "0");
    strcpy(r.k3, "0");
    strcpy(r.k4, "0");
    strcpy(r.k5, "0");
    strcpy(r.k6, "0");
    memset(r.mem, '-', bigBlock);


The tests are specific to the scenario I care about: performing functional-like operations on a Range:

struct to_int {
    typedef int result_type;
    int operator() (const Record& r) const {
        return r.Id;

function<bool (const Record&)> filter_f = [](const Record& r) {return r.Id < filterNo;};

template <class Range>
int accumulate_filter(const Range& r) {
    return boost::accumulate(
        r | filtered(filter_f) | transformed(to_int()),

The usage of a function and  a functor is a bit odd. I don’t recall why I did it that way, but it doesn’t matter as it is the same for each test. What the test does is just filtering a bunch of record, transforming them (map) to ints and sum these ints.

How many repetitions of each test are used, how big is the record, how many records the Range contains is specified in these constants:

const int repetitions = 1000;
const int bigBlock = 1000;
const int howMany = 1000;

Time is kept using this clock that wraps boost::chrono:

typedef boost::chrono::process_cpu_clock the_clock;

struct timer {
    timer(): clock_(the_clock::now()) {}

    the_clock::times elapsed() {
        return (the_clock::now() - clock_).count();

    the_clock::time_point clock_;

I have tested the following configurations. Adding records by value:

int normal() {

    vector<Record> v;
    for (int i = 0; i < howMany; ++i) {
        Record r;
        record_init(r, i);

    return accumulate_filter(v);  

I then tested adding records using a pod_vector. This is a data structure described here and in “Imperfect C++”. It is a vector that uses as an auto_buffer as the underlying storage. An auto_buffer is a class that uses stack memory if it needs more than a certain number of bytes specified at compile time, otherwise it uses heap memory. It works well if you know at compile time that most allocations for something take at most N bytes, but some might need more. This situation is surprisingly frequent. Unfortunately, your objects need to be POD to use the pod_vector.

I tested it in the case where the stack space is big enough (podVector<howMany*2>) and when it is not (podVector<howMany/10>).

int podVector() {

    pod_vector<Record, size> v;
    for (int i = 0; i < howMany; ++i) {
        Record r;
        record_init(r, i);

    return accumulate_filter(v);  

I then tested just allocating the memory, without freeing it and using boost::pool in it’s ‘local’ form, which means that memory is freed when it goes out of scope (stack-like). The main drawback of the latter is that you cannot return the container/range.

int boostallocator(WhichOne which) {

    boost::pool<> p(sizeof(Record));
    vector<Record*> v;
    Record* r;
    for (int i = 0; i < howMany; ++i) {
        if(which == Boost)
            r = (Record*)p.malloc(); // memory freed at function exit
            r = new Record; // oops, memory leak

        record_init(*r, i);

    return accumulate_filter(v | indirected);

Indirected is needed because we are not talking about pointers. I also tested various variations of shared pointers. First a normal shared_ptr, then a shared_ptr initialized with the boost::singleton_pool and finally a pointer initialized with make_shared.

int pointers(WhichOne which) {

    vector<shared_ptr<Record>> v;
    for (int i = 0; i < howMany; ++i) {
        shared_ptr<Record> r;
        if(which == Normal)
            r = shared_ptr<Record>(new Record);
        else if(which == Boost)
            r = shared_ptr<Record>((Record*)record_pool::malloc(), [](void* p) {record_pool::free(p);});
        else if(which == Make)
            r = make_shared<Record>();
        else throw "This kind of pointer doesn't exist";

        record_init(*r, i);

    return accumulate_filter(v | indirected);

Finally, I used a Loki smart pointer. This is a very elegantly designed smart pointers from  “Modern C++ design”. You pass as template parameters the policies you want your smart pointer to have (aka how it should behave). I tested it like so:

typedef Loki::SmartPtr<Record,
                 LOKI_DEFAULT_CONSTNESS> RecordPtr;

And using the following, slightly more convoluted code:

int lokipointers(WhichOne) {
    vector<RecordPtr> v;
    for (int i = 0; i < howMany; ++i) {
        RecordPtr r = RecordPtr(new Record());
        record_init(*r, i);

    auto ret = accumulate_filter(v | transformed(RecordPtr::GetPointer) | indirected);
    return ret;

The result of my tests are in this table and at this link. I used VS10 and gcc 4.6.2 on a Windows 7 box. The first number in the (S, N) pair at the top of each column represents  the size of the record, the second one represents the number of objects in the vector. To obtain reasonable numbers, the tests have been run with a different number of iteration for each pair, which means that you can compare the results vertically, but not horizontally.


There are a lot of interesting things in this table. Here are my takeaways. They are obviously specific to this single scenario. You might have different scenarios or different takeaways:

  • Stack allocation is not too bad for small objects, especially for gcc
  • Pod_vector is a good overall performer (even if you don’t get the size right), it is unfortunate that just works with POD types
  • GCC seems to be lagging on the shared_ptr front. Also msvc implementation of the make_shared optimization gives a visible advantage
  • There is not much advantage in using the shared pointer with the boost pooled allocator
  • If you can use the boost local pool allocator, you should. It is faster than stack allocation (in this scenario). Remember that the memory is reclaimed when you exit the function …

So here you have it. A small detour on the performance of different schemes for allocating pod types. As it is often the case in C++, it depends …

BTW: Andy Sawyer told me about his rough algorithm to decide which STL container to use. I reproduce it here:


A decision tree for selecting a sequence container:

– I’m in a rush and don’t want to read the rest: use std::deque

– Do we know ahead of time exactly how many elements will be needed (and will they all fit on our stack!) – If so, use std::array.

– Do we need constant-time random access? (Note that we often *think* we do, but actually don’t – YAGNI) If so, then we can eliminate std::list/std::forward_list as candidates.

– Do we need bidirectional iteration? (Again, we need this less often that we think we do). If so, that eliminates std::forward_list

– Will there be a large number of in-the-middle insertion/removal?  (and assuming we haven’t already eliminated them) then std::list/std::forward_list (especially when the contained type is expensive to move/copy).  In extreme cases, this may be a strong enough requirement to override other constraints (e.g. the need for random access). On the other hand, it may be viable to reduce the cost of move/copy of large contained types by using containers of (smart) pointers.

– Do we need the contents as a contiguous array? use std::vector (and call reserve  if we can) – sometimes, anyway; I’ve seen cases where it’s faster to build my collection as a std::deque, then transform to std::vector later on.)

– Use std::deque.

Or, to put it another way, “use std::deque  unless there’s a good reason not to”.  That’s probably an overly-strong statement, but it’s a reasonable starting point.

Naturally, every case will be different – and no set of ‘rules of thumb’ is going to work every time (this is one of the things which distinguish rules of thumb from laws of physics). Commonly, we don’t know ahead of time which of those criteria is going to have the most significant impact on what we’re doing; when I find myself in that situation, I’ll use std::deque; once I have a firmer idea of the criteria, I’ll go back and revisit that choice (which – quite often – is as simple as changing a typedef). And of course – as with all ‘optimisations’ – measure, measure and measure again!


Next time, we’ll go back to more traditional functional programming topics.

Writing functional code in C++ II – Function composition

Function composition is at the core of functional programming. You start by being very confident that certain very small functions are correct, you compose them in well known ways and you end up being very confident that your final program is correct.

You are very confident that the initial functions are correct because they are very small and side effect free. You are very confident that your program is correct because the means of composition are well known and generate functions that are themselves side effect free.

Such ‘almost primitive’ functions operate on data structures. Each functional language has its own set of data structures and ‘almost primitive’ functions. Probably the most famous ones are contained in the haskell prelude, but F# has similar ones. LINQ in C# is just such a set.

In this article I use the term functional composition in a broad sense as ‘putting functions together’, not in the more technical sense of f(g(x)): dot operator in Haskell or ‘>>’ operator in F#.

Code for this post is here. Thanks to Steve Bower and Andy Sawyer for review and comments.

Here is an example. The F# code below filters in the odd numbers from an array and doubles them. You have two logical operations, filtering and doubling.

let v = [|1;2;3;4;5;6;7;8;9;0|]

let transformed =
    |> Seq.filter(fun i -> i % 2 = 0)
    |> ((*) 2)

How would the code look in C++? There are many ways. One is the ‘big fat for loop’, or in C++ 11, for each:

    const int tmp[] = {1,2,3,4,5,6,7,8,9,0};
    const vector<int> v(&tmp[0], &tmp[10]);

    vector<int> filtered;

    for(auto i: v)
        if(i % 2 == 0)
            filtered.push_back(i * 2);

This looks simple enough, apart from the initialization syntax that gets better with C++ 11. But it’s simplicity is misleading. We have now lost the information that our computation is composed by a filter and a map. It’s all intermingled. Obviously, the info is still there in some sense, but you need to read each line in your head and figure out the structure on your own. It is not readily apparent to the reader.

Obviously, things get much worse in a large program composed of several loops where each one does not trivial stuff. As for me, every time I indulge into the ‘big fat for loop pattern’, I end up regretting it either immediately or after some time when I have to make changes to the code. So I try to avoid it.

Wait a minute, you might say, what about STL algorithms? Aren’t you supposed to use those?

The syntax to call them is not very compositional. You cannot take the result of an algorithm and pass it to another easily. It is difficult to chain them that way. Everything takes two iterators, even if 99% of the times you are just using begin and end. Moreover, you have to create all sort of temporary data structures for the sake of not modifying your original one. Once you have done all of that, the syntactic weight of your code overwhelm the simplicity of what you are trying to do.

For example, here is the same code using ‘transform’ for the ‘map’ part of the algorithm:

    vector<int> filtered;
    copy_if(begin(v), end(v), back_inserter(filtered), [](int x) { return x % 2 == 0;});
    vector<int> transformed;
    transform(begin(filtered), end(filtered), back_inserter(transformed), [](int x) { return x * 2;});

I’m not arguing that STL is a bad library, I think it is a great one. It’s just that its syntax is not very compositional. And that breaks the magic.

Luckily, help is on the way in the form of the Range library and OvenToBoost. These libraries ‘pack’ the two iterators in a single structure and override the ‘|’ operator to come up with something as below:

    auto result =
        | filtered(   [] (int i) { return i % 2 == 0;})
        | transformed([] (int i) { return i * 2;});

If you use Boost lambdas (not sure I’d do that), the syntax gets even better:

    auto result =
        | filtered(   _1 % 2 == 0)
        | transformed(_1 * 2);

Ok, so we regained compositionality and clarity, but what price are we paying? Apart from the dependency from Boost::Range and OvenToBoost, you might think that this code would be slower than a normal for loop. And you would be right. But maybe not by as much as you would think.

The code for my test is here. I’m testing the perf difference between the following code engineered to put in a bad light the most pleasing syntax. A for loop:

        int sum = 0;
        for(vector<int>::const_iterator it = v.begin(); it != v.end(); ++it)
            if(*it < 50)
                sum += *it * 2;
        return sum;

A Range based code using language lambdas (transformedF is a variation of this):

        auto lessThan50 = v | filtered(    [](int i) { return i < 50;})
                            | transformedF([] (int i) { return i * 2;});

        return boost::accumulate (lessThan50, 0);

A Range based code using two functors:

        auto lessThan50 = v | filtered(Filterer())
                            | transformed(Doubler());

        return boost::accumulate (lessThan50, 0);


struct Doubler {
    typedef int result_type;
    int operator() (int i) const { return i * 2;}

struct Filterer {
    typedef int result_type;
    bool operator() (int i) const { return i < 50;}

a Range based code using Boost lambdas:

        auto lessThan50 = v | filtered(_1 < 50)
                            | transformed(_1 * 2);

        return boost::accumulate (lessThan50, 0);

And finally, some code using the STL for_each function:

        int sum = 0;
        std::for_each( v.cbegin(), v.cend(),
            [&](int i){
                if( i < 50 ) sum += i*2; 
        return sum;

Notice that I need to run the test an humongous number of times (10000000) on a 100 integers array to start seeing some consistency in results. 
And the differences that I see (in nanosecs below) are not huge:

1>                         Language lambda: {1775645516;1778411400;0}
1>                          Functor lambda: {1433377701;1435209200;0}
1>                            Boost lambda: {1327829199;1326008500;0}
1>                                For loop: {1338268336;1341608600;0}
1>                             STL foreach: {1338268336;1341608600;0}

True to be told, by changing the code a bit these numbers vary and, in some configurations, the Range code takes twice as long.

But given how many repetitions of the test I need to run to notice the difference, I’d say that in most scenarios you should be ok.

The ‘|>’ operator over collections is the kind of functional composition, broadly defined, that I use the most, but you certainly can use it over other things as well. You’d like to write something like the below:

    BOOST_CHECK_EQUAL(pipe("bobby", strlen, boost::lexical_cast<string, int>), "5");

And you can, once you define the following:

    template< class A, class F1, class F2 >
    inline auto pipe(const A & a, const F1 & f1, const F2 & f2)
        -> decltype(REMOVE_REF_BUG(f2(f1(a)))) {
        return f2(f1(a));

Where REMOVE_REF_BUG is a workaround for a bug in the decltype implementation under msvc (it should be fixed in VS 11):

    #ifdef _MSC_VER
        #define REMOVE_REF_BUG(...) remove_reference<decltype(__VA_ARGS__)>::type()
        #define REMOVE_REF_BUG(...) __VA_ARGS__

I didn’t do the work of creating ‘>>’ , ‘<<’ and ‘<|’ F# operators or wrapping the whole lot with better syntax (i.e. operator overloading?), but it certainly can be done.

Now that we have a good syntax for records and a good syntax for operating over collections, the next instalment will put them together and try to answer the question: how should I create collections of records and operate over them?

Writing functional code in C++ – Records

This is the first of a series of posts about writing functional code in C++.  My goal is different from FC++, which is a full fledged ‘environment’ to write functional code. Instead, I want to experiment with some of the new C++ 11 language features and see if one can build reasonably looking functional code and stay pretty close to the language. The idea is to judiciously use macros and external libraries to build a thin layer on top of the language that doesn’t change the performance characteristics of it (aka it doesn’t slow it down) and integrates fine with existing C++ code.

Think of it as an attempt to answer the question: is there a way to write C++ code in a functional style without loosing its ‘C++sness’? We won’t attempt to be as type safe or syntactically pleasing as Haskell or F# , but we’ll attempt to stay as fast and flexible as C++ and make it functional enough to be interesting.

Thanks to Steve Bower and Ganesh Sittampalam for reviewing and to Andy Sawyer for giving me so much feedback that this post can be considered a co-authorship. Code for this post is here.

Let’s first talk about the data types that you typically find in a functional language. Let’s start with Records. They are not part of the functional model per se. You can do without them and just use algebraic data types and tuples. But they are damn convenient, and  most functional languages (i.e. Haskell, Clojure, etc…) have them now. We start from them because they map naturally to C++. Records are just like structs, but immutable and having structural equality.

In F# your vanilla record looks like this:

type Person = {
    Name: string
    Id: int

Nice and simple. With C++, you become more verbose. A first attempt would be:

    struct Person {
        const string Name;
        const int Salary;

Which looks nice and easy, but doesn’t quite work because more often than not you need to be able to compare two records for structural equality (which means the value of their fields, not the pointer in memory, defines equality). Records in functional languages automatically support that. But the syntax gets on the ugly side:

    struct Person {
        const string Name;
        const int Salary;

        bool operator==(const Person& other) const { return Salary == other.Salary && Name == other.Name;}
        bool operator!=(const Person& other) const { return !(*this == other);}

We’ll see how to simplify the syntax later. The syntax on the creation side is not too bad:

Person p = {"Bobby", 2};

Let’s call the above representation, the obvious one. Let’s consider two variations on this scheme. The first one is useful if you want to make your records interoperable with C or with other C++ compilers.

A full discussion of how to achieve these goals would be very long. It will go about discussing what POD types are and how their definition got more precise in C++ 11.

You can look at my experimentations on pod, standard layout and trivially constructible types here. My summary is pretty simple, if you want to achieve all the goals above, you got to use C structs that contain C-compatible types. No const, strings or STL libraries allowed.

The above class would then become:

    struct Person {
        char Name[20];
        int Salary;

        bool operator==(const Person& other) const { return Salary == other.Salary && !strcmp(Name,other.Name);}
        bool operator!=(const Person& other) const { return !(*this == other);}

Obviously, not being able to use strings or STL collections in your record is a big limitation, but in certain cases you might be able to live with it.  Let’s call this representation, the pod one.

You would think you can make the syntax better by doing something like the below:

    struct _Person {
        int id;
        char name[20];


Where pod_equality is defined as below:

        #define pod_equality(Record)                                                                 \
               bool operator==(const Record& other) const {                                          \
                      static_assert(std::is_trivial<Record>::value, "Not trivially copyable");       \
                      return memcmp(this, &other, sizeof(Record)) == 0;}                             \
               bool operator!=(const Record& other) const { return !(*this == other);}

But you would be wrong (as I was for a while), as comparing memory values doesn’t work in this case because of the padding that the compiler can insert between fields. Such padding can contain random value (whatever was on the stack, for example) which would make the equality return false for structurally equal objects. Also this scheme fails for floating point types (i.e. NaN and signed zeros).

An alternative representation for records, which nicely separates constness from the structure of your record is below. It does have some some advantages that we’ll look at, but in its raw form it is yet more syntax:

namespace Mutable {
    struct Person {
        string Name;
        int Salary;

        bool operator==(const Person& other) const { return Salary == other.Salary && Name == other.Name;}
        bool operator!=(const Person& other) const { return !(*this == other);}

typedef const Mutable::Person Person;

Let’s call this representation, the immutable one. Let’s give these guys some usage and talk about their trade-offs. If you want to write a function that increase someone salary, you would write it like this:

template<class T>
T rise_salary1(const T& p) {
    T ret = { p.Name, p.Salary + 1000 };
    return ret;

This looks nice and clean, unless your record has a lot of fields. Let me tell you, in real application it probably does. For example:

namespace Mutable {

    struct foo {
        int value1;
        int value2;
        int value3; 
        int value4;
        int value5;
        int value7;
        int value6;
        int value8;
        int value9;

typedef const Mutable::foo foo;

foo increment_value7( const foo& f )
     foo tmp = { f.value1, f.value2, f.value3, f.value4, f.value5, f.value6, f.value7+1, f.value8 };
     return tmp;

Thanks to Andy for this example. BTW: did you spot the bug at first sight? What about the other one?

So this syntax is problematic. True to be told, part of the problem is in the sub-optimal initialization syntax in C++. If you could use named parameters, it would be more difficult to introduce bugs, but the syntax would still be ugly. You really need something like the F# syntax:

let r1 = {f = 0.2; k = 3}
let r2 = {r1 with f = 0.1}

Can we do something like that? Well, if we are willing to pass the Mutable one around, we can write this:

template<class T>
T rise_salary2(const T& p) {
    T ret(p);
    ret.Salary += 1000;
    return ret;

Or even this:

template<class T>
T rise_salary3(T p) {
    p.Salary += 1000;
    return p;

But that doesn’t make us happy, does it? The whole point of making things const is that you want them to be const Smile If you pass around mutable ones, who knows what’s going to happen inside the method?

There is a middle ground that might be acceptable, which is to write functions so that their interface takes immutable records, but inside the function they operate on mutable ones. This is not a bad pattern in general, as having mutable versions of your immutable records might come useful for optimizing certain algorithms. Luckily the casting rules of C++ favour the bold, so the below works:

Person rise_salary_m(const Person& p) {
    Mutable::Person ret(p);
    ret.Salary += 1000;
    return ret;        

And doesn’t look too bad either.

Now let’s talk syntax. Defining a record is still a lot of typing (and a lot of reading if you are maintaining the code). F# does it like this:

type Person = {
    Name: string
    Id: int

The best I came up with looks like this:

          string, Name,
          int,    Salary);

And you need a lot of those macros depending on how many fields your record has. You can write this macro to expand to either the Obvious or Immutable representation trivially. It is a bit more complex for the Pod one because of the interesting C++ array declaration syntax with the number of elements after the name of the field.

For the Obvious one it looks like this:

#define RECORD2(n, t1, f1, t2, f2)                                                            \
    struct n {                                                                                \
        const t1 f1;                                                                          \
        const t2 f2;                                                                          \
        bool operator==(const n& other) const { return f1 == other.f1 && f2 == other.f2;}     \
        bool operator!=(const n& other) const { return !(*this == other);}                    \

All the usual concerns about macros apply. Moreover all your fields need to have a meaningful == operator.

To summarize, we have found three different representations of records in C++ and tried to alleviate the syntax burden with some macro magick. We’ll go wild in macro-land when we talk about discriminated unions.

A simple scheme to implement Design by Contract in C++

Recently I got interested in C++ again. The new lambda functions in C++ 11 open up a world of opportunities for C++ programmers. I’ll talk more about how you can write functional code in C++ 11 in upcoming posts. For now let’s look at design by contract.

Design by contract is a development style promoted by  Bertrand Meyer and it is implemented in his own Eiffel programming language. At core, it advocates using preconditions, postconditions and invariants.

An invariant is an assertion that always holds true for a class after the class has been fully constructed and if the code is not executing inside a method. As a user of the class, you always observe the invariant to be true. As an implementer of the class, you can be assured that the invariant is true before a method is entered and you need to make the invariant true again by the time your method exits.

A preconditions is an assertion that needs to hold true at the start of a function, for the postcondition to be true at the end of it. Taken together, invariant, precondition and postcondition define the contract between the implementer and the user of a class.

Code for this post is here and here. Thanks to Andy Sawyer, Steve Bower and Ganesh Sittampalam for reviewing my code and suggesting improvements.

Preconditions are simple and everyone uses them. They are those little if statements that you put at the start of your functions to make sure that the caller has given you the right parameters.

double divide(double x, double y) {
    if(y == 0) throw new exception(“y cannot be 0”);

These little ‘if’ statements don’t really make the precondition stand out. They can be confused with other, unrelated, ‘if’ statements that do completely different semantic things. A more readable alternative is:

double divide(double x, double y) {
    requires(y != 0);

Not an impressive difference, for sure, but kind of nice. The evil macro looks like this:

#ifndef ___PRECOND
#define requires(F) {if((!(F))) throw preexception(__FILE__, __LINE__,"Pre-condition failure: " #F);};
#define requires(F)

Note that the exception maintains information not just about the file and line number of the failure, but also a textual representation of the failed condition. Such things you can do with macro magick.

Postconditions are trickier. In the case of a side-effect free (pure) function, a postcondition asserts something of interest about the return value. In the case of a class, it asserts something of interest about the state of the class before and after the execution of the method.

Let’s start with a pure function. I like to have all my assertion at the start of the function to allow reasoning about it without looking at implementation details. But that poses the problem that the result is available just at the end of the function.  My solution is to enforce this idiom:

double divide(double x, double y) {
    double result;
    requires(y != 0);
    ensures(result < x); // Silly, just to falsify it in tests
return result; }

So you need to declare your result upfront. That is the biggest limitation of the overall solution in my opinion.  If that is acceptable to you, the trick now is how to execute the postcondition test before the method exits. We can do that by storing a lambda and executing it in the destructor:

typedef std::function<bool ()> ___dbcLambda;

class ___post {
    ___post(const char *file, long line, const char *expr, const ___dbcLambda& postF)
        : _f(postF),

        if( !std::uncaught_exception() && !_f() )
            throw postexception(_file,_line,_expr);

    const ___dbcLambda _f;
    const char * const _file;
    const long _line;
    const char * const _expr;

You might think that you shouldn’t throw exceptions in a destructor. That is something I never understood about the RAII pattern in C++. If I choose to use exceptions as my error notification method, how am I supposed to get notified if there is a problem releasing a resource in RAII, other than by throwing an exception in the destructor?

Maybe because of this, the standard has an uncaught_exception() function that allows you to check if an exception has been thrown, so that you don’t throw another one during stack unwinding. If you really don’t like throwing in the destructor, feel free to assert.

You might be worried about performance, but you really shouldn’t as you can disable all these macros in Release.

The macro then creates a ___post class on the stack.

#define ensures(F) \
    int ___UNIQUE_LINE = __LINE__;  \
    auto ___UNIQUE_POST = ___post( __FILE__, __LINE__, "Post-condition failure:" #F, [&](){return (F);});

The UNIQUE stuff is messy business. Part of it is by design and it is used to make sure that each __post variable has a unique name to have multiple ‘ensures’ in a function. The other part is a workaround for this msvc bug. Let me know if you want more details. I suspect there is a better way to do it.

Here is the full enchilada …

#define ___MERGE(a, b) a##b

#define ___POST(a) ___MERGE(___postcond,a)
#define ___UNIQUE_POST ___POST(__LINE__)

#define ___LINE(a) ___MERGE(___line, a)
#define ___UNIQUE_LINE ___LINE(__LINE__)

The case in which a postcondition is used inside a method of a class is even trickier because the postcondition must be able to compare the state of the class at the entrance of the method to the state of the class at its exit. Assuming a Counter object with an Add method and assuming ‘___pre’ captures the state of the counter at the start of the method, you’d like to write something like:

    void Add(int x) {
        ensuresClass(this->c_ == ___pre.c_ + x);

Now, this is tricky. The only way to capture the ‘old’ state in ‘___pre’ is by making a copy of it and store it there. This is what the code below does:

#define ensuresClass(F) \
    auto ___pre(*this); \
    auto ___UNIQUE_POST = ___post( __FILE__, __LINE__, "Post-condition failure: " #F, [&](){return (F);});

More troubling is the possibility that the class doesn’t have a copy constructor. In that case you explicitly need to associate a value with ‘___pre2’ by passing it as the first parameter to the appropriate macro as in the code below:

    void Add(int x) {
        ensuresClass2(this->c_, c_ == ___pre2 + x);

Which is implemented as follows:

#define ensuresClass2(ASS,F) \
    auto ___pre2(ASS); \
    auto ___UNIQUE_POST = ___post( __FILE__, __LINE__, "Post-condition failure: " #ASS " is ___pre2 in " #F, [&](){return (F);});

And I know about the giant ass …

Now for invariants. The user should implement an isValid() method on his class as below:

    bool isValid() { return c_ >= 0;}

Then he should add an ‘invariant()’ call at the start of each method, at the end of each constructor and at the start of each destructor:

    void Add(int x) {
        requires(x < 10);
        ensures(this->c_ == ___pre.c_ + x);

This calls the ‘isValid’ function at the start of the method and at the end of it using the same destructor trick:

#define invariant() \
    if(!(this->isValid())) throw preexception(__FILE__, __LINE__,"Invariant failure"); \
    auto ___UNIQUE_INV = ___post( __FILE__, __LINE__, "Invariant failure", [&](){return this->isValid();});

All the above machinery is not at all equivalent to having such constructs in the language, but it is simple enough and with a decent enough syntax to be interesting.

Now a caveat: I have no idea if any of this works. It does work in my examples and its behaviour seems reasonably safe to me, but I haven’t tried it out on any big codebase and haven’t stressed it enough for me to be confident recommending its usage. So, use it at your own risk, let me know how it goes.