Writing functional code in C++ IV – Algebraic datatypes

And here comes the guilt bit. I have the strong suspicion (but not certainty) that what I am doing here can be done with templates, but didn’t take the time to do it. With that out of the way, let’s go.

Code for this post is here. Thanks to Steve Bower and Andy Sawyer for reviewing it.

Algebraic datatypes (discriminated unions in F#) are a powerful concept in functional programming. They are the main way to represent type variation in your program. Very roughly, where object orientation uses derivation, functional programming uses algebraic datatypes. An entire book could be written on the theory of this, but the goal of this post is to see how we can map them to C++ without loosing C++ness.

When talking about this with C++ programmers, they always point me to boost variant. That doesn’t do it for me for several reasons.

First of all, boost variants represent one of a fixed collection of types. Algebraic datatypes represent one of a fixed collection of named types. That means that a simple thing like the code below cannot be represented as variant:

type LivingEntity =
| Person of string  // name
| Dog of string     // name

Ok, ok maybe you could represent it by ‘typifing’ things using boost strong typedef, but things get ugly syntactically. Moreover, a lot of time the name is all you care about …

type Switch = On | Off

Are we going to strong typedef for such a simple thing? Oh boy. Even accepting the syntactic weight of all this, there are other issues in play. Discriminated unions are used extensively in functional programs. So you want a nice syntax to work with them Something like the below F# syntax:

let print living =
    match living with
    | Person(name) -> printfn "I'm a per named %s" name
    | Dog(name)    -> printfn "I'm a dog named %s" name

Which could be made even sweeter by using the ‘function’ keyword as below:

let print = function
    | Person(name) -> printfn "I'm a per named %s" name
    | Dog(name)    -> printfn "I'm a dog named %s" name

In boost variant, you either use the get<type> functions or you write a visitor function. In the first case you are probably going to write a chain of ‘if’ statements or a ‘switch’ statement. Both are confusing and come with plenty of syntactic weight. I don’t really want to write a visitor like the one below for each ‘match’ in my code. The magic is gone.

class times_two_visitor
    : public boost::static_visitor<>
{
public:

    void operator()(int & i) const
    {
        i *= 2;
    }

    void operator()(std::string & str) const
    {
        str += str;
    }

};

Ok, so boost variant doesn’t really work for this. Remember that our overarching goal was to stay close to C++. The language itself has something that comes pretty close to what we want in the form of a union, or better a tagged union. Again, the types are not named, but maybe we can work that in.

It turns out that Jared here did all the hard work. The general idea is to use macros to hide the construction of a tagged union with methods to test the type and return the contained value.

For example this code:

DU_DECLARE(LivingEntity)
    DU_VALUE(Person,    string)
    DU_VALUE(Dog,       string)
DU_END

Becomes something like:

struct LivingEntity {
    private:
        LivingEntity() {}
        unsigned int m_kind;
    public:
        static LivingEntity Person(const string& value) {
            LivingEntity unionValue;
            unionValue.m_kind = 19;
            unionValue.m_Person = value;
            return unionValue; }

        bool IsPerson() const {
            return m_kind == 19;
        }
        const string& GetPerson() const {
            (void)( (!!(m_kind == 19)) || (_wassert(L"m_kind == __LINE__", L"c:discriminated_union.cpp", 19), 0) );
            return m_Person; }
        string GetPerson() {
            (void)( (!!(m_kind == 19)) || (_wassert(L"m_kind == __LINE__", L"c:discriminated_union.cpp", 19), 0) );
            return m_Person; }
   private:
string m_Person;

You can see the outline of a tagged union (i.e. m_kind) with a constructor for each type (i.e. Person) and methods to test for at type and return its value. You can also see the storage for the value (i.e. m_Person).

The only thing in DU_DECLARE that is different from Jared’s solution is the typedef below that allows not repeating the LivingEntity name in each DU_VALUE.

#define DU_DECLARE(name)                        \
    struct name {                               \
    private:                                    \
        typedef name unionName;                 \
        name() {}                               \
        unsigned int m_kind;                    \
    public:
#define DU_VALUE(entryName, entryType)                                                                      \
        static unionName entryName(const entryType& value) {                                                \
            unionName unionValue;                                                                           \
            unionValue.m_kind = __LINE__;                                                                   \
            unionValue.m_##entryName = value;                                                               \
            return unionValue;  }                                                                           \
        bool Is##entryName() const { return m_kind == __LINE__;}                                            \
        const entryType& Get##entryName() const { assert(m_kind == __LINE__); return m_##entryName; }       \
        entryType Get##entryName() { assert(m_kind == __LINE__); return m_##entryName; }                    \
    private:                                                                                                \
        entryType m_##entryName;                                                                            \
    public:

With all of that at your disposal it becomes easy to write:

    auto entity = LivingEntity::Dog("Bob");

    DU_MATCH(entity)
        DU_CASE(Dog,   BOOST_CHECK_EQUAL(value, "Bob");)
        DU_CASE(Person,BOOST_CHECK(false);) 
    DU_MATCH_END

There are some beautiful things about this. First of all, the construction of any of such types is super simple. You even get intellisense!

Moreover the ‘value’ variable contains whatever was passed in the constructor for the object. So this is semantically equivalent, if not syntactically, to the match statement in F#.

Obviously the code part is not limited to a single instruction:

    DU_MATCH(entity)
        DU_CASE(Dog,
            cout << "I should be here";
            BOOST_CHECK_EQUAL(value, "Bob");
        )
        DU_CASE(Person,
            BOOST_CHECK(false);
        ) 
    DU_MATCH_END

And for those of you addicted to braces, I venture:

    DU_MATCH(entity)
        DU_CASE(Dog,
        {
            cout << "I should be here";
            BOOST_CHECK_EQUAL(value, "Bob");
        })
        DU_CASE(Person,
        {
            BOOST_CHECK(false);
        }) 
    DU_MATCH_END

They all work with the same macro definition. They expand to something along the line of:

    if(false) {}
        else if(entity.IsDog()) {
            auto value = entity.GetDog();
            BOOST_CHECK_EQUAL(value, "Bob");
        }
        else if(entity.IsPerson()) {
            auto value = entity.GetPerson();
            BOOST_CHECK(false);
        } 
        else {
            throw match_exception();
        }

I’ve not reached the pinnacle of macro naming mastering with this one. Making them lowercase and risking a bit more on the conflict side would make the syntax much more palatable. I call it, as it is, not too bad.

The last ‘else’ clause assures you then if you add a new type to the discriminated union and forget to update one of the ‘MATCH’ clauses at least you get a run time error. That is not good. Functional languages would give you a compile time error, which is much better. Maybe with judicious use of templates you can bring the error at compile time.

The macros are trivial:

class match_exception: std::exception {};

#define DU_MATCH(unionName) { auto du_match_var = unionName; if(false) {}

#define DU_CASE_TAG(entry, ...)                        \
    else if(du_match_var.Is##entry()) {                \
        __VA_ARGS__                                    \
    }

#define DU_CASE(entry, ...)                            \
    else if(du_match_var.Is##entry()) {                \
        auto value = du_match_var.Get##entry();        \
        __VA_ARGS__                                    \
    }

#define DU_DEFAULT(...)                                \
    else if(true) { __VA_ARGS__}

#define DU_MATCH_END else {throw new match_exception();} }

Let’s now go back to our initial goal and see how far off we are. We were trying to do the following:

type LivingEntity =
| Person of string  
| Dog of string  


let print = function
    | Person(name) -> printfn "I'm a per named %s" name
    | Dog(name)    -> printfn "I'm a dog named %s" name

And here is what we ended up with:

DU_DECLARE(LivingEntity)
    DU_VALUE(Person,    string)
    DU_VALUE(Dog,        string)
DU_END

auto print(LivingEntity en) -> void {
    DU_MATCH(entity)
        DU_CASE(Dog,    cout << "I'm a dog named " << value;)
        DU_CASE(Person, cout << "I'm a per named " << value;) 
    DU_MATCH_END
}

In our Switch case:

type Switch = On | Off

You get the good looking :

DU_DECLARE(Switch)
    DU_FLAG(On)
    DU_FLAG(Off)
DU_END

And along the way we lost compile time type safety in the very common case of adding new types to the discriminated union. That’s bad.

Also some of you would strongly dislike the (ab)use of macros. As for me, it looks workable.

Writing functional code in C++ III – Performance of different allocation schemes

Now we know how to represent records and we know how to operate on them using a nice F# like syntax. But how do we store our record in a data structure in the first place?

Code for this post is here. Thanks to Andy Sawyer and Steve Bower for reviewing this.

As it is often the case, C++ gives you many options that are not available in standard functional languages. A mixed blessing of some sort.

  1. You can store them by value or by pointer
  2. If you store them by pointer, you can use normal pointers or smart pointers
  3. If you store them by smart pointer, you can use different ones

Well, storing them by value is the simplest option, but it has two shortcomings:

  1. You pay the price of a full record copy whenever the object need to be copied (i.e. when you add/remove records to/from a vector and whenever the vector needs to resize itself)
  2. If you store a record in two containers, you get two different copies of it. If you modify one, the other won’t be modified

The second issue is not a problem in our scenario as records, by definition, are immutable and structurally equivalent (aka equality for them doesn’t mean pointer equality). The first issue is more tricky. Our records get a copy constructor by default which does a ‘memcpy’ of the object. That should be fast. But fast enough?

To answer this question (and more), I embarked in a little performance test that encompasses most ways to store a pod type in an vector like data structure. For all my test I used this struct:

struct Record {
    int Id;
    char k1[2];
    char k2[2];
    char k3[2];
    char k4[2];
    char k5[2];
    char k6[2];
    char mem[bigBlock];

    void Lock() {}
    void Unlock() {}
};

By recompiling with different values for ‘bigBlock’ I can test what happens when the record size grows. In all tests a record is initialized with this function:

void record_init(Record& r, int i) {

    r.Id = i;
    strcpy(r.k1, "0");
    strcpy(r.k2, "0");
    strcpy(r.k3, "0");
    strcpy(r.k4, "0");
    strcpy(r.k5, "0");
    strcpy(r.k6, "0");
    memset(r.mem, '-', bigBlock);

}

The tests are specific to the scenario I care about: performing functional-like operations on a Range:

struct to_int {
    typedef int result_type;
    int operator() (const Record& r) const {
        return r.Id;
    };
};

function<bool (const Record&)> filter_f = [](const Record& r) {return r.Id < filterNo;};

template <class Range>
int accumulate_filter(const Range& r) {
    return boost::accumulate(
        r | filtered(filter_f) | transformed(to_int()),
        0,
        plus<int>()); 
}

The usage of a function and  a functor is a bit odd. I don’t recall why I did it that way, but it doesn’t matter as it is the same for each test. What the test does is just filtering a bunch of record, transforming them (map) to ints and sum these ints.

How many repetitions of each test are used, how big is the record, how many records the Range contains is specified in these constants:

const int repetitions = 1000;
const int bigBlock = 1000;
const int howMany = 1000;

Time is kept using this clock that wraps boost::chrono:

typedef boost::chrono::process_cpu_clock the_clock;

struct timer {
    timer(): clock_(the_clock::now()) {}

    the_clock::times elapsed() {
        return (the_clock::now() - clock_).count();
    }

    the_clock::time_point clock_;
};

I have tested the following configurations. Adding records by value:

int normal() {

    vector<Record> v;
    for (int i = 0; i < howMany; ++i) {
        Record r;
        record_init(r, i);
        v.push_back(r);
    }

    return accumulate_filter(v);  
}

I then tested adding records using a pod_vector. This is a data structure described here and in “Imperfect C++”. It is a vector that uses as an auto_buffer as the underlying storage. An auto_buffer is a class that uses stack memory if it needs more than a certain number of bytes specified at compile time, otherwise it uses heap memory. It works well if you know at compile time that most allocations for something take at most N bytes, but some might need more. This situation is surprisingly frequent. Unfortunately, your objects need to be POD to use the pod_vector.

I tested it in the case where the stack space is big enough (podVector<howMany*2>) and when it is not (podVector<howMany/10>).

int podVector() {

    pod_vector<Record, size> v;
    for (int i = 0; i < howMany; ++i) {
        Record r;
        record_init(r, i);
        v.push_back(r);
    }

    return accumulate_filter(v);  
}

I then tested just allocating the memory, without freeing it and using boost::pool in it’s ‘local’ form, which means that memory is freed when it goes out of scope (stack-like). The main drawback of the latter is that you cannot return the container/range.

int boostallocator(WhichOne which) {

    boost::pool<> p(sizeof(Record));
    vector<Record*> v;
    Record* r;
    for (int i = 0; i < howMany; ++i) {
        if(which == Boost)
            r = (Record*)p.malloc(); // memory freed at function exit
        else
            r = new Record; // oops, memory leak

        record_init(*r, i);
        v.push_back(r);
    }

    return accumulate_filter(v | indirected);
}

Indirected is needed because we are not talking about pointers. I also tested various variations of shared pointers. First a normal shared_ptr, then a shared_ptr initialized with the boost::singleton_pool and finally a pointer initialized with make_shared.

int pointers(WhichOne which) {

    vector<shared_ptr<Record>> v;
    for (int i = 0; i < howMany; ++i) {
        shared_ptr<Record> r;
        if(which == Normal)
            r = shared_ptr<Record>(new Record);
        else if(which == Boost)
            r = shared_ptr<Record>((Record*)record_pool::malloc(), [](void* p) {record_pool::free(p);});
        else if(which == Make)
            r = make_shared<Record>();
        else throw "This kind of pointer doesn't exist";

        record_init(*r, i);
        v.push_back(r);
    }

    return accumulate_filter(v | indirected);
}

Finally, I used a Loki smart pointer. This is a very elegantly designed smart pointers from  “Modern C++ design”. You pass as template parameters the policies you want your smart pointer to have (aka how it should behave). I tested it like so:

typedef Loki::SmartPtr<Record,
                 Loki::RefCounted,
                 Loki::DisallowConversion,
                 Loki::AssertCheck,
                 Loki::DefaultSPStorage,
                 LOKI_DEFAULT_CONSTNESS> RecordPtr;

And using the following, slightly more convoluted code:

int lokipointers(WhichOne) {
    vector<RecordPtr> v;
    for (int i = 0; i < howMany; ++i) {
        RecordPtr r = RecordPtr(new Record());
        record_init(*r, i);
        v.push_back(r);
    }

    auto ret = accumulate_filter(v | transformed(RecordPtr::GetPointer) | indirected);
    return ret;
}

The result of my tests are in this table and at this link. I used VS10 and gcc 4.6.2 on a Windows 7 box. The first number in the (S, N) pair at the top of each column represents  the size of the record, the second one represents the number of objects in the vector. To obtain reasonable numbers, the tests have been run with a different number of iteration for each pair, which means that you can compare the results vertically, but not horizontally.

Results

There are a lot of interesting things in this table. Here are my takeaways. They are obviously specific to this single scenario. You might have different scenarios or different takeaways:

  • Stack allocation is not too bad for small objects, especially for gcc
  • Pod_vector is a good overall performer (even if you don’t get the size right), it is unfortunate that just works with POD types
  • GCC seems to be lagging on the shared_ptr front. Also msvc implementation of the make_shared optimization gives a visible advantage
  • There is not much advantage in using the shared pointer with the boost pooled allocator
  • If you can use the boost local pool allocator, you should. It is faster than stack allocation (in this scenario). Remember that the memory is reclaimed when you exit the function …

So here you have it. A small detour on the performance of different schemes for allocating pod types. As it is often the case in C++, it depends …

BTW: Andy Sawyer told me about his rough algorithm to decide which STL container to use. I reproduce it here:

[BEGIN]

A decision tree for selecting a sequence container:

- I’m in a rush and don’t want to read the rest: use std::deque

- Do we know ahead of time exactly how many elements will be needed (and will they all fit on our stack!) – If so, use std::array.

- Do we need constant-time random access? (Note that we often *think* we do, but actually don’t – YAGNI) If so, then we can eliminate std::list/std::forward_list as candidates.

- Do we need bidirectional iteration? (Again, we need this less often that we think we do). If so, that eliminates std::forward_list

- Will there be a large number of in-the-middle insertion/removal?  (and assuming we haven’t already eliminated them) then std::list/std::forward_list (especially when the contained type is expensive to move/copy).  In extreme cases, this may be a strong enough requirement to override other constraints (e.g. the need for random access). On the other hand, it may be viable to reduce the cost of move/copy of large contained types by using containers of (smart) pointers.

- Do we need the contents as a contiguous array? use std::vector (and call reserve  if we can) – sometimes, anyway; I’ve seen cases where it’s faster to build my collection as a std::deque, then transform to std::vector later on.)

- Use std::deque.

Or, to put it another way, “use std::deque  unless there’s a good reason not to”.  That’s probably an overly-strong statement, but it’s a reasonable starting point.

Naturally, every case will be different – and no set of ‘rules of thumb’ is going to work every time (this is one of the things which distinguish rules of thumb from laws of physics). Commonly, we don’t know ahead of time which of those criteria is going to have the most significant impact on what we’re doing; when I find myself in that situation, I’ll use std::deque; once I have a firmer idea of the criteria, I’ll go back and revisit that choice (which – quite often – is as simple as changing a typedef). And of course – as with all ‘optimisations’ – measure, measure and measure again!

[END]

Next time, we’ll go back to more traditional functional programming topics.

Letter on investing

In 2007 I wrote a blog post on investing. During the last five years, my view evolved a bit. People often ask me how to get a financial education. This is the latest email I sent on the topic.

From: Bolognese, Luca
Sent: 04 April 2012 16:52
To: XXX
Subject: RE: A financial education

Disclaimer: this are just my personal opinions drawn from my 10+ years of investing, reading books and academic papers about it. I can justify most statements below with some academic reference, but that would make this email too long.

Also someone else might read the same material and come up with a different opinion. It is a field that is probabilistic in nature and doesn’t lend itself to certainties. For example, you can make a lot of money out of an investment and still be wrong to have made it. And conversely.

Our brains don’t work well in such fields.

  • The most important thing in investing is not investing, it is saving the money to invest. If you save enough and are mildly reasonable in your investments (aka you diversify), you are going to be ok. Saving is not mechanically difficult, it is motivationally difficult. The best book I found on the topic is this one. I read earlier editions, this is a new one. I don’t like the investment chapters.
  • After that, you need to make sure that your financial matters are in order (i.e. you are ensured against catastrophes, you maximize your tax deductions, etc..). This is the field of personal finance. I’ve been suggesting this book to American audiences. It is on Amazon.co.uk, so it might be the UK version.
  • Now that you have saved money and your finances are in order, you can start thinking about investing. The most important things in investing are: deciding what you believe in, what your risk tolerance is, how important is for your performance to track the market, what your time horizon is and how much time you want to dedicate to investing.
    • My risk tolerance is high. I’ve spent a lot of time learning about investing. I’m willing to see the value of my portfolio go down without experiencing emotional distress (I’ve tried it)
    • I don’t care about my investment performance tracking the market. I don’t watch financial programs.
    • My time horizon is long. These are money I invest for my retirement.
    • I’m willing to invest time to keep myself up to date on investment topics and manage my portfolio
    • I believe the following (these are the biases that colour my view):
    • Diversification among different sources of returns is to be sought aggressively
    • Most asset classes are not efficient because of:
      • Predictable flaws in human brain’s processing machinery
      • Institutional constraints on most professional investor (aka they have to track the market they invest in closely)
      • Short term performance incentive for professional investors
    • Some asset classes are intrinsically more inefficient than others (i.e. emerging markets micro stocks compared to US large stock) because that’s where the causes of inefficiencies are strongest
    • The biggest inefficiencies are among asset classes (i.e. stocks vs bonds vs commodities)
    • Most people (myself included) don’t have the time to do the research necessary to invest in individual stocks using the two ways that I believe are ‘right’: quant models or/and fundamental value analysis.
  • Some books to get you started:
    • A good perspective on how to think about the market and the work you need to do if you want to invest in stocks. The intelligent investor.
    • Why diversification is important. The intelligent asset allocator. I don’t believe in static asset allocation, but you need to know what it is to believe (or not) in it.
    • Moving beyond a static asset allocation with a quant based highly diversified, but simple to implement system. The Ivy Portfolio. I am very tempted to use this one myself
    • Why asset classes are often mispriced. Probably the premier quant shop around. Read most quarterly letters.
    • Systems to take advantage of such mispricing are described here. I have obtained copies of all the newsletters sent by the author going back to 1991 and backtested some of the systems. These are the system I use. But the newsletter is relatively expensive and complex to follow.
    • If you decide to branch out into value analysis of companies. Essays of warren buffett. You would also need to read an introductory and an advanced accounting text. I don’t know UK accounting, so cannot suggest good candidates.
    • If you are very conservative and want to just invest in Inflation Linked Bonds (or fearful of markets in general), there is a book for that: here.
  • With all of that out of the way, here is some practical counsel
    • if you have less than 20,000 pounds to invest try to find a fund or etf that invest in a diversified array of asset classes for a small price. I don’t know the UK market, so cannot suggest one.
    • If you have more than that and are willing to dedicate a few hours a month, you can start implementing some of the systems in the Ivy Portfolio book. Maybe you integrate it with the GMO 7 years asset class forecasts to push a bit of value bias in your system.
    • For most people that’s it. If it becomes a lifelong interest / mania as it is for me, there is a bit more that can be done, but it is unclear to me that the results are necessarily better than the simple systems described above.

Cheers,

.luca