Functional programming in C – Implementation

0.1 Cleanup

Let’s start simple with the cleanup function. First we need the usual barrage of includes. G_BEGIN_DECLS allows the header to be linked in C++.

#ifndef L_UTILS_INCLUDED
#define L_UTILS_INCLUDED

#include "glib.h"

G_BEGIN_DECLS

#include <stdlib.h>
#include <stdio.h>

This feature is GCC specific. It uses __attribute((cleanup(f))) where f is the cleanup function. In this case the cleanup function just frees the memory.

#ifdef __GNUC__

 static inline void __autofree(void *p) {
     void **_p = (void**)p;
     free(*_p);
 }

auto_clean is a building block that you can use to plug in your own cleanup function. In the common case of memory allocation, I created a wrapper macro auto_free to make it even easier.

#define auto_clean(f)   __attribute((cleanup(f)))
#define auto_free       auto_clean(__autofree)

0.2 Lambdas

I took this one from here.

If you think about it, a lambda is just an expression that returns a function. This macro creates a nested function, called fn, inside a statement expression and returns it. Unfortunately these features are gcc specific.

Remember that lambdas are not allocated on the heap, so you have to be careful on how you used them.

#define lambda(return_type, function_body)                                          
  ({                                                                                
    return_type __fn__ function_body                                                
    __fn__;                                                                         
  })

#endif

0.3 Unions

A union type is what you would expect: a struct that contains an unnamed union and a field to specify which type it is. We need the list of types in union_decl to create the kind enum. The usage of __VA_ARGS__ allows to use whatever syntax you want to go into the enum (i.e. specify int values).

Having to specify the the types here is unfortunate as you are going to need to specify it in the union_case macros as well.

I haven’t found another way to do it. If you do, let me know.

#define union_decl(alg, ...)                                                        
typedef struct alg {                                                                
    enum {  __VA_ARGS__ } kind;                                                     
    union {

You specify each type for the union with union_type. That looks pretty good to me.

#define union_type(type, ...)                                                       
    struct type { __VA_ARGS__ } type;

Ideally you shouldn’t need to specify alg here. Perhaps there is a way to avoid doing so.

#define union_end(alg)                                                              
    };} alg;

You can then set the fields on the union type by using the below macro. Notice the usage of the new struct constructor here to allow optional named parameters.

This is a statement, so it cannot go into an expression place. I think I could make it an expression that returns the existing (or a new) union. This is going to be a sceanrio if people are not using gcc statement expressions.

#define union_set(instance, type, ...)                                              
    G_STMT_START {                                                                  
        (instance)->kind     = (type);                                              
        (instance)->type   = (struct type) { __VA_ARGS__ };                         
    } G_STMT_END

This is an utility macro. It is a version of g_assert that you can use in an expression position.

#define g_assert_e(expr) (                                                          
    (G_LIKELY (!expr) ?                                                             
   (void)g_assertion_message_expr (G_LOG_DOMAIN, __FILE__, __LINE__, G_STRFUNC, 
                                             #expr)                             
    : (void) 1) )

And this allows to fill the default case in a match statement implemented as a ternary operator. It prints out a text representation of the expression and returns it.

#define union_fail(...) (g_assert_e(((void)(__VA_ARGS__) , false)), (__VA_ARGS__))

The rest of the code is commented out. It is a macro way to do pattern matching. For me, the ternary operator is simpler, but I left it there in case you want to play with it.

/*
#define union_case_only_s(instance, type, ...)                                      
        G_STMT_START {                                                              
        if((instance)->kind == (type)) {                                            
            G_GNUC_UNUSED struct type* it = &((instance)->type); __VA_ARGS__; }     
        else g_assert_not_reached();                                                
        } G_STMT_END

#define union_case_first_s(alg, instance, type, ...)                                
    G_STMT_START {                                                                  
        alg* private_tmp = (instance);                                              
        if(private_tmp->kind == type) {                                             
            G_GNUC_UNUSED struct type* it = &((private_tmp)->type); __VA_ARGS__; }

#define union_case_s(type, ...)                                                     
        else if(private_tmp->kind == type) {                                        
            G_GNUC_UNUSED struct type* it = &((private_tmp)->type); __VA_ARGS__; }

#define union_case_last_s(type, ...)                                                
        else if(private_tmp->kind == type) {                                        
            G_GNUC_UNUSED struct type* it = &((private_tmp)->type); __VA_ARGS__; }  
            else g_assert_not_reached(); } G_STMT_END

#define union_case_default_s(...)                                                   
        else __VA_ARGS__; } G_STMT_END

// Need to use assert here because g_assert* cannot be used in expressions as it expands to do .. while(0)
#define union_case_only(instance, type, ...)                                        
        ( (instance)->kind == (type) ? (__VA_ARGS__) : (assert(false), __VA_ARGS__) )

#define union_case_first(instance, type, ...)                                       
        ( (instance)->kind == (type) ? (__VA_ARGS__) :

#define union_case(instance, type, ...)                                             
        (instance)->kind == (type) ? (__VA_ARGS__) :

#define union_case_last(instance, type, ...)                                        
        (instance)->kind == (type) ? (__VA_ARGS__) : (assert(false), (__VA_ARGS__)) )
#define union_case_default(...)                                                     
        (__VA_ARGS__) )

*/

G_BEGIN_DECLS

#endif // L_UTILS_INCLUDED

Functional programming in C

This post/program (as I’m writing it in literate style) is a continuation of my previous posts about functional programming in C++. I promise I’m not going to post about doing it in assembly language (I think) ….

I came to like the simplicity of C very much and got interested in how you could write functional code in it.

There is one irritating thing about C as a viable programming language. Microsoft’s compiler support is not good. It just supports ANSI C, not C99 or C11. So, if you want to use more modern idyoms, you got to use gcc or clang. In this post I assume you use gcc. I will point out the gcc specific extensions.

Also, the C standard library is pretty limited, so I decided to use GLib to complement it. I also created some macros to simplify the syntax. I never understood why people like templates and think macros are evil. It takes me all of 5 minutes to do a -E on GCC to debug the result of a macro expansion. With templates, well, that’s different.

So, in summary, this post is about how you can write functional code in C, perhaps with some gcc extensions and certainly with some macro tricks. Let’s call it funkyC (thanks Ian ). I’m going to show how to use it first. Next post I’m going to show how it’s implemented.

0.1 Discriminated unions in C

With a bit of macro magic, you can get a decent looking discriminated union syntax. First we need to include the correct headers. lutils.h is where all the macros are defined.

#include <glib.h>
#include <stdio.h>
#include <assert.h>
#include <stdbool.h>
#include <signal.h>
#include <string.h>

#include "lutils.h"

Then you can declare a discriminated union with the syntax below. It suffers from two problems: repeting the list of possible types in union_decl and repeating the name of the discriminated union in union_end. Perhaps there is a way to avoid that, but I haven’t found it.

The syntax for the union_type call is the same as you would use inside a struct declaration. We’ll see how this works when we look at lutils.h.

union_decl  (Car, Volvo, Fiat, Ferrari)
union_type      (Volvo,     int x; double y;)
union_type      (Fiat,      char* brand, *model;)
union_type      (Ferrari,   char* brand, *model;)
union_end   (Car)

We can create a Car either on the stack, as below, or on the heap and we can set its value with union_set.

Notice the usage of the new struct construction syntax to simulate optional named parameters in C. I would prefer not to have a dot at the start of the name, but overall it is beautiful (if I can say that myself).

static void printCar(Car*);

static void testUnion() {
    Car c;

    union_set   (&c, Volvo, .x = 3, .y = 4);
    printCar    (&c);
    union_set   (&c, Ferrari, .brand = "Ferrari");
    printCar    (&c);
    union_set   (&c, Fiat, .brand = "Fiat", .model = "234");
    printCar    (&c);
}

You can then access values inside your discriminated union with normal if statements.

static void testCar(Car*, char const *);

static void printCar(Car* c) {

    if(c->kind == Volvo) {
        int x = c->Volvo.x;
        g_assert_cmpint(x, ==, 3);
    }

Or perhaps you want the equivalent of a match statement in F# (aka an expression that returns a value based on the type of the discriminated union). Notice that, as logical, all the expressions need to return the same type. That’s why union_fail takes a value of the expression type.

    char temp[40];

    char* value =   c->kind == Volvo    ?   itoa(c->Volvo.x, temp, 10)
                  : c->kind == Ferrari  ?   (void)c->Ferrari.model, c->Ferrari.brand
                  : c->kind == Fiat     ?   c->Fiat.model
                                        :   union_fail("Not a valid car type");

If you are willing to be gcc specific, then your expression can be comprised of multiple statements, of which the last one returns the value of the expression. This allows a much more flexible syntax for your match clauses.

#ifdef __GNUC__

    value       =   c->kind == Volvo    ? ({
                                            struct Volvo* it = &c->Volvo;
                                            itoa(it->x, temp, 10);
                                          })
                  : c->kind == Ferrari  ?   (void)c->Ferrari.model, c->Ferrari.brand
                  : c->kind == Fiat     ?   c->Fiat.model
                                        :   union_fail("Not a valid car type");

    testCar(c, value);

#endif // __GNUC__
}

We then use the super simple test framework in GLib to make sure that it all works as expected …

static void testCar(Car* c, char const * value) {
    if(c->kind == Volvo) g_assert_cmpstr(value, ==, "3");
    else if (c->kind == Fiat) g_assert_cmpstr(value, ==, "234");
    else if (c->kind == Ferrari) g_assert_cmpstr(value, ==, "Ferrari");
    else g_assert_not_reached();

}

0.2 Nested functions and lambda variables

GCC has many other cool extensions. A very simple one is nested functions. It allows you to nest functions :-) Look at the definition of doA and f2 in the function below. Putting together nested functions and block statement expressions allows you, with some macro magic, to define lambda functions in your code (from here ).

Remember that lambdas (aka nested functions) are allocated on the stack. They are very fast, but you cannot store their pointer into a gloal table (unless such table is used while the stack for this function is alive).

In such cases, you have to create a proper function. But for the other 90% of use cases, they work pretty well. They are lambdas in the spirit of C: very fast, but error prone …

#ifdef __GNUC__

static void testLambda() {

    typedef int (*aFunc) (int);

    aFunc doA(aFunc f){

        int k(int i) {
            return f(i) + 3;
        }
        return k;
    }

    int clos = 2;

    int f2 (int i) {return i;}
    aFunc b = doA(lambda (int, (int p) {return p + clos;}));

    g_assert_cmpint(b(3), ==, 8);
}

0.3 Automatic cleanup of local variables

This is not a functional topic per se, but something that always annoyed me tremendously about C. The fact that you cannot define the equivalent of the using statement in C#, or destructors in C++. Well, now you can. Or not?

Again, if you are willing to be GCC specific, you can use an attribute (more on this in the upcoming implementation post) to associate a cleanup function that gets called when your variable goes out of scope. In this case, I wrapped the free case in a nice looking macro.

But that doesn’t really work. You would certainly want such function to be called on any kind of exit from the enclosing scope (aka via exit(), abort() or longjmp()). Alas, that doesn’t happen.

This reduces the usefulness of this mechanism tremendously. Probably too much in that it lulls you into a false sense of security. You still need to free your resources in the error path of your application.

static void testAutomaticCleanup() {
    char* stack_alloc() {
        auto_free char* b = g_malloc(10000);
        memset(b, '#', 10000);
        return b;
    };

    char * c = stack_alloc();
    g_assert(*c != '#');
}

#endif

0.4 Data structures

GLib comes with a vast library of data structures to use, not too different from the .NET framework or Java. For example, below you have a single linked list …

static void testGLib() {
     GSList* list = NULL;

     list = g_slist_append(list, "Three");
     list = g_slist_prepend(list, "first");
     g_assert_cmpint(g_slist_length(list), ==, 2);

     list = g_slist_remove(list, "first");
     g_assert_cmpint(g_slist_length(list), ==, 1);

     g_slist_free(list);
}

0.5 Wrapping up

There you go, rising the level of abstraction of C, still keeping it very fast (if you are willing to be gcc bound).

There are other features in functional programming languages that are not in this post. Maybe I’ll get around to macro my way into them eventually, maybe not.

In the next post we’ll talk about how all of this is implemented. Below is the code for running the testcases.

int runTests(int argc, char* argv[]) {
    g_test_init(&argc, &argv, NULL);

    if(g_test_quick()) {
        g_test_add_func("/utils/automatic_cleanup", testAutomaticCleanup);
        g_test_add_func("/utils/lambda", testLambda);
        g_test_add_func("/utils/Union", testUnion);
        g_test_add_func("/utils/SList", testGLib);
    }

    return g_test_run();
}

int main(int argc, char *argv[]) {
    return runTests(argc, argv);
}