Becoming really rich with C#

Or maybe not; please do not hold me responsible if you lose money by following this system. Having said that, it is my opinion that very few concepts really matter in investing. Three big ones are value, diversification and momentum. This post is about the latter two and how to use C# to create a simple trading system that uses both.


Diversification is ‘don’t put all your eggs in one basket’ (as opposed to ‘put them all in one basket and watch that basket’). I don’t believe you can ‘watch’ very much in financial markets, so I tend to prefer diversification.


Momentum is the mysterious tendency of financial assets that have risen the most in the recent past to keep outperforming in the near future. In essence, buying the top stocks/sectors/asset classes tends to beat buying the bottom ones over horizons from three months to one year.


The idea then is to rank some assets (e.g. ETFs) by how fast they have risen in the past, go long the top ones and short the bottom ones. There are hundreds of variations of this basic strategy; we’ll add the rule that we won’t buy assets that are below their 200-day moving average or sell short assets that are above it.


I’m writing this code with VS 2010 Beta 2 (which hasn’t shipped yet). It should be trivial to modify it to run on Beta 1 (or maybe it already runs on it). I attach the code and data files to this post.

struct Event {
    internal Event(DateTime date, double price) { Date = date; Price = price; }
    internal readonly DateTime Date;
    internal readonly double Price;
}

We’ll use this simple structure to hold the closing price for a particular date. My use of internal is kind of bizarre. Actually, the whole code might look strange: it is an interesting (maybe inelegant) mix of object orientation and functional programming.

class Summary {
    internal Summary(string ticker, string name, string assetClass,
                     string assetSubClass, double? weekly, double? fourWeeks,
                     double? threeMonths, double? sixMonths, double? oneYear,
                     double? stdDev, double price, double? mav200) {
        Ticker = ticker;
        Name = name;
        AssetClass = assetClass;
        AssetSubClass = assetSubClass;
        // Abracadabra ...
        LRS = (fourWeeks + threeMonths + sixMonths + oneYear) / 4;
        Weekly = weekly;
        FourWeeks = fourWeeks;
        ThreeMonths = threeMonths;
        SixMonths = sixMonths;
        OneYear = oneYear;
        StdDev = stdDev;
        Mav200 = mav200;
        Price = price;
    }
    internal readonly string Ticker;
    internal readonly string Name;
    internal readonly string AssetClass;
    internal readonly string AssetSubClass;
    internal readonly double? LRS;
    internal readonly double? Weekly;
    internal readonly double? FourWeeks;
    internal readonly double? ThreeMonths;
    internal readonly double? SixMonths;
    internal readonly double? OneYear;
    internal readonly double? StdDev;
    internal readonly double? Mav200;
    internal double Price;

    internal static void Banner() {
        Console.Write("{0,-6}", "Ticker");
        Console.Write("{0,-50}", "Name");
        Console.Write("{0,-12}", "Asset Class");
        //Console.Write("{0,-30}\t", "Asset SubClass");
        Console.Write("{0,4}", "RS");
        Console.Write("{0,4}", "1Wk");
        Console.Write("{0,4}", "4Wk");
        Console.Write("{0,4}", "3Ms");
        Console.Write("{0,4}", "6Ms");
        Console.Write("{0,4}", "1Yr");
        Console.Write("{0,6}", "Vol");
        Console.WriteLine("{0,2}", "Mv");
        //Console.Write("{0,6}", "Pr");
        //Console.WriteLine("{0,6}", "M200");
    }

    internal void Print() {
        Console.Write("{0,-6}", Ticker);
        Console.Write("{0,-50}", new String(Name.Take(48).ToArray()));
        Console.Write("{0,-12}", new String(AssetClass.Take(10).ToArray()));
        //Console.Write("{0,-30}\t", new String(AssetSubClass.Take(28).ToArray()));
        Console.Write("{0,4:N0}", LRS * 100);
        Console.Write("{0,4:N0}", Weekly * 100);
        Console.Write("{0,4:N0}", FourWeeks * 100);
        Console.Write("{0,4:N0}", ThreeMonths * 100);
        Console.Write("{0,4:N0}", SixMonths * 100);
        Console.Write("{0,4:N0}", OneYear * 100);
        Console.Write("{0,6:N0}", StdDev * 100);
        if (Price <= Mav200)
            Console.WriteLine("{0,2}", "X");
        else
            Console.WriteLine();
        //Console.Write("{0,6:N2}", Price);
        //Console.WriteLine("{0,6:N2}", Mav200);
    }
}


The class Summary above is how I want to present my results. A few comments on the code. I use Nullable<T> because some of these values can be null (e.g. when there isn’t enough history), but I still don’t want to worry about it. It ends up working rather neatly.


I also print the results out to the Console, which is crazy. I really should be using WPF/Silverlight as the presentation layer. Also, the {0,4:N0} notation might be unfamiliar to some of you, but this is how mad Console guys like myself avoid using real UI frameworks. Sometimes we print things in color too.


The real meat is in the following line:

LRS = (fourWeeks + threeMonths + sixMonths + oneYear) / 4;

That is our highway to riches. It’s a very elaborate quant formula, never shown before, that calculates a magic relative strength (aka momentum) factor as the average of the returns over four weeks, three months, six months and one year.
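A nice side effect of using Nullable<double> here is that missing history takes care of itself: if any of the four returns is null, the whole average is null, and the asset simply drops out of the ranking later. A quick sketch with made-up numbers (not part of the actual system):

```csharp
// A hedged sketch of how Nullable<double> propagates through the LRS formula.
// All values are hypothetical.
using System;

class LrsSketch {
    static void Main() {
        double? fourWeeks = 0.02, threeMonths = 0.05, sixMonths = 0.10;
        double? oneYear = null;    // e.g. the ETF is less than a year old

        double? lrs = (fourWeeks + threeMonths + sixMonths + oneYear) / 4;
        Console.WriteLine(lrs.HasValue);    // False: one null nullifies the whole average

        oneYear = 0.20;
        lrs = (fourWeeks + threeMonths + sixMonths + oneYear) / 4;
        Console.WriteLine(lrs.Value);       // 0.0925
    }
}
```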

class TimeSeries {
    internal readonly string Ticker;
    readonly DateTime _start;
    readonly Dictionary<DateTime, double> _adjDictionary;
    readonly string _name;
    readonly string _assetClass;
    readonly string _assetSubClass;

    internal TimeSeries(string ticker, string name, string assetClass, string assetSubClass,
                        IEnumerable<Event> events) {
        Ticker = ticker;
        _name = name;
        _assetClass = assetClass;
        _assetSubClass = assetSubClass;
        _start = events.Last().Date;
        _adjDictionary = events.ToDictionary(e => e.Date, e => e.Price);
    }


I then built myself a little TimeSeries class that represents a series of (date, price) pairs. I chose a dictionary to store it because I assumed I would be accessing it by date a lot. In retrospect, I was kind of right and kind of wrong. It doesn’t really matter much.

    bool GetPrice(DateTime when, out double price, out double shift) {
        // To nullify the effect of hours/min/sec/millisec being different from 0
        when = new DateTime(when.Year, when.Month, when.Day);
        var found = false;
        shift = 1;
        double aPrice = 0;
        while (when >= _start && !found) {
            if (_adjDictionary.TryGetValue(when, out aPrice)) {
                found = true;
            }
            when = when.AddDays(-1);
            shift -= 1;
        }
        price = aPrice;
        return found;
    }

A TimeSeries can give you back the price at a particular date. This looks bizarre and complex, but there is a reason for it. I might ask for a date that doesn’t have a price associated with it (e.g. holidays, weekends). In such cases I want to return the previous available price, which could be N days in the past.


I also want to return how many days in the past I had to go, so that other calculations (e.g. a return calculation) can shift their start date by the same amount. I also might not find such a price at all, in which case I don’t want to throw an exception, but notify the caller instead. In retrospect, I should have used double? to signify ‘price not found’.
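To make the shift bookkeeping concrete, here is a small standalone sketch of the same walk-back logic (hypothetical dates and prices, not the actual class): asking for a Sunday when the last trade happened on Friday walks back two days and ends with shift = -2, which GetReturn can then use to move its start date back by the same amount.

```csharp
// Standalone sketch of the GetPrice walk-back logic above.
// The dictionary contents and dates are made up for illustration.
using System;
using System.Collections.Generic;

class GetPriceSketch {
    static Dictionary<DateTime, double> prices = new Dictionary<DateTime, double> {
        { new DateTime(2009, 10, 2), 100.0 },   // a Friday
        { new DateTime(2009, 10, 5), 101.0 },   // the following Monday
    };
    static DateTime start = new DateTime(2009, 10, 2);

    static bool GetPrice(DateTime when, out double price, out double shift) {
        var found = false;
        shift = 1;
        double aPrice = 0;
        while (when >= start && !found) {
            if (prices.TryGetValue(when, out aPrice)) found = true;
            when = when.AddDays(-1);
            shift -= 1;   // one decrement per day examined, net of the initial 1
        }
        price = aPrice;
        return found;
    }

    static void Main() {
        double price, shift;
        var found = GetPrice(new DateTime(2009, 10, 4), out price, out shift); // a Sunday
        Console.WriteLine("{0} {1} {2}", found, price, shift);  // True 100 -2
    }
}
```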

    double? GetReturn(DateTime start, DateTime end) {
        var startPrice = 0.0;
        var endPrice = 0.0;
        var shift = 0.0;
        var foundEnd = GetPrice(end, out endPrice, out shift);
        var foundStart = GetPrice(start.AddDays(shift), out startPrice, out shift);
        if (!foundStart || !foundEnd)
            return null;
        else
            return endPrice / startPrice - 1;
    }

We can now go and calculate the return between two dates. The TimeSeries object also needs to perform a few more calculations.

    internal double? LastWeekReturn() {
        return GetReturn(DateTime.Now.AddDays(-7), DateTime.Now);
    }
    internal double? Last4WeeksReturn() {
        return GetReturn(DateTime.Now.AddDays(-28), DateTime.Now);
    }
    internal double? Last3MonthsReturn() {
        return GetReturn(DateTime.Now.AddMonths(-3), DateTime.Now);
    }
    internal double? Last6MonthsReturn() {
        return GetReturn(DateTime.Now.AddMonths(-6), DateTime.Now);
    }
    internal double? LastYearReturn() {
        return GetReturn(DateTime.Now.AddYears(-1), DateTime.Now);
    }
    internal double? StdDev() {
        var now = DateTime.Now;
        now = new DateTime(now.Year, now.Month, now.Day);
        var limit = now.AddYears(-3);
        var rets = new List<double>();
        while (now >= _start.AddDays(12) && now >= limit) {
            var ret = GetReturn(now.AddDays(-7), now);
            if (ret.HasValue)   // skip weeks with no usable prices
                rets.Add(ret.Value);
            now = now.AddDays(-7);
        }
        var mean = rets.Average();
        var variance = rets.Select(r => Math.Pow(r - mean, 2)).Sum();
        var weeklyStdDev = Math.Sqrt(variance / rets.Count);
        return weeklyStdDev * Math.Sqrt(40);
    }
    internal double? MAV200() {
        return _adjDictionary
            .ToList()
            .OrderByDescending(k => k.Key)
            .Take(200)
            .Average(k => k.Value);
    }
    internal double TodayPrice() {
        var price = 0.0;
        var shift = 0.0;
        GetPrice(DateTime.Now, out price, out shift);
        return price;
    }
    internal Summary GetSummary() {
        return new Summary(Ticker, _name, _assetClass, _assetSubClass,
                           LastWeekReturn(), Last4WeeksReturn(), Last3MonthsReturn(),
                           Last6MonthsReturn(), LastYearReturn(), StdDev(), TodayPrice(),
                           MAV200());
    }
}

Nothing particularly interesting in this code, just a bunch of calculations. MAV200 is the 200-day moving average of closing prices; it shows a more functional way of doing things. The StdDev function, by contrast, is very imperative.


We now can work on downloading the prices. This is how you construct the right URL:

static string CreateUrl(string ticker, DateTime start, DateTime end) {
    return @"http://ichart.finance.yahoo.com/table.csv?s=" + ticker + "&a="
        + (start.Month - 1).ToString() + "&b=" + start.Day.ToString() + "&c="
        + start.Year.ToString() + "&d=" + (end.Month - 1).ToString() + "&e="
        + end.Day.ToString() + "&f=" + end.Year.ToString() + "&g=d&ignore=.csv";
}
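Note that Yahoo’s URL scheme wants zero-based months, hence the Month - 1. A quick check with a hypothetical ticker and dates (the expected URL is in the comment):

```csharp
// A self-contained check of the URL format, using the same string construction
// as CreateUrl above. Ticker and dates are made up.
using System;

class UrlSketch {
    static string CreateUrl(string ticker, DateTime start, DateTime end) {
        return @"http://ichart.finance.yahoo.com/table.csv?s=" + ticker + "&a="
            + (start.Month - 1).ToString() + "&b=" + start.Day.ToString() + "&c="
            + start.Year.ToString() + "&d=" + (end.Month - 1).ToString() + "&e="
            + end.Day.ToString() + "&f=" + end.Year.ToString() + "&g=d&ignore=.csv";
    }
    static void Main() {
        var url = CreateUrl("SPY", new DateTime(2007, 10, 25), new DateTime(2009, 10, 25));
        Console.WriteLine(url);
        // http://ichart.finance.yahoo.com/table.csv?s=SPY&a=9&b=25&c=2007&d=9&e=25&f=2009&g=d&ignore=.csv
    }
}
```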



And let’s set how many concurrent connections we are going to use …

ServicePointManager.DefaultConnectionLimit = 10;

On my machine, setting this number too high causes errors to be returned. I’m not sure on which side of the connection the problem lies.


We can then load all the tickers we want from a file. One of the files has Leveraged ETFs, which I want to filter out because they always tend to pop up at the top.

var tickers =
    //File.ReadAllLines("ETFs.csv")
    //File.ReadAllLines("ETFTest.csv")
    File.ReadAllLines("AssetClasses.csv")
    .Skip(1)
    .Select(l => l.Split(new[] { ',' }))
    .Where(v => v[2] != "Leveraged")
    .Select(values => Tuple.Create(values[0], values[1], values[2], values[3]))
    .ToArray();
var len = tickers.Length;

var start = DateTime.Now.AddYears(-2);
var end = DateTime.Now;
var cevent = new CountdownEvent(len);
var summaries = new Summary[len];


And then load all of them, using an asynchronous call so as not to keep the thread busy.

for (var i = 0; i < len; i++) {
    var t = tickers[i];
    var url = CreateUrl(t.Item1, start, end);
    using (var webClient = new WebClient()) {
        webClient.DownloadStringCompleted +=
            new DownloadStringCompletedEventHandler(downloadStringCompleted);
        webClient.DownloadStringAsync(new Uri(url), Tuple.Create(t, cevent, summaries, i));
    }
}

cevent.Wait();




Notice the use of a CountdownEvent to wait for all the downloads to complete before printing out the results. Also notice the new Tuple class, used to package things up to send around.
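In case CountdownEvent is new to you (it ships with .NET 4), here is a minimal sketch of the pattern: initialize it with the number of outstanding operations, call Signal once per completion, and Wait blocks until the count hits zero. The work items below are toys, not the actual downloads.

```csharp
// Minimal sketch of the CountdownEvent pattern used above.
using System;
using System.Threading;

class CountdownSketch {
    static void Main() {
        var cevent = new CountdownEvent(3);     // three outstanding operations
        for (var i = 0; i < 3; i++)
            ThreadPool.QueueUserWorkItem(_ => cevent.Signal());
        cevent.Wait();   // blocks until Signal() has been called three times
        Console.WriteLine("All operations accounted for");
    }
}
```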


We can then print out the top and bottom 15%:

var top15perc =
    summaries
    .Where(s => s.LRS.HasValue)
    .OrderByDescending(s => s.LRS)
    .Take((int)(len * 0.15));
var bottom15perc =
    summaries
    .Where(s => s.LRS.HasValue)
    .OrderBy(s => s.LRS)
    .Take((int)(len * 0.15));

Console.WriteLine();
Summary.Banner();
Console.WriteLine("TOP 15%");
foreach (var s in top15perc)
    s.Print();

Console.WriteLine();
Console.WriteLine("Bottom 15%");
foreach (var s in bottom15perc)
    s.Print();




Here is what we do when a request comes back with data:

static void downloadStringCompleted(object sender, DownloadStringCompletedEventArgs e) {
    var bigTuple =
        (Tuple<Tuple<string, string, string, string>, CountdownEvent, Summary[], int>)
        e.UserState;
    var tuple = bigTuple.Item1;
    var cevent = bigTuple.Item2;
    var summaries = bigTuple.Item3;
    var i = bigTuple.Item4;
    var ticker = tuple.Item1;
    var name = tuple.Item2;
    var asset = tuple.Item3;
    var subAsset = tuple.Item4;

    if (e.Error == null) {
        var adjustedPrices =
            e.Result
            .Split(new[] { '\n' })
            .Skip(1)
            .Select(l => l.Split(new[] { ',' }))
            .Where(l => l.Length == 7)
            .Select(v => new Event(DateTime.Parse(v[0]), Double.Parse(v[6])));

        var timeSeries = new TimeSeries(ticker, name, asset, subAsset, adjustedPrices);
        summaries[i] = timeSeries.GetSummary();
        cevent.Signal();
        Console.Write("{0} ", ticker);
    }
    else {
        Console.WriteLine("[{0} ERROR] ", ticker);
        //Console.WriteLine(e.Error);
        summaries[i] = new Summary(ticker, name, "ERROR", "ERROR", 0, 0, 0, 0, 0, 0, 0, 0);
        cevent.Signal();
    }
}


We first unpack the Tuple we originally sent out, then extract the dates and prices, create a Summary object and store it in the summaries array. It’s important to remember to Signal the cevent in the error case too, because we want to print out the results even if some downloads failed.


And here is what you get for your effort:


[Image: sample console output of the rankings]

SystemCodeAndData.zip


LAgent: an agent framework in F# – Part IX – Counting words …

Download framework here.


    Let’s now use our mapReduce to do something more interesting, for example finding the frequency of words in several books. Now the agent that processes the output needs to be a bit more complex.

    let gathererF = fun msg (data:List<string * int>, counter, step) ->
                        match msg with
                        | Reduced(key, value)   ->
                            if counter % step = 0 then
                                printfn "Processed %i words. Now processing %s" counter key
                            data.Add((key, value |> Seq.hd))
                            data, counter + 1, step
                        | MapReduceDone         ->
                            data
                            |> Seq.distinctBy (fun (key, _) -> key.ToLower())
                            |> Seq.filter (fun (key, _) ->
                                not (key = "" || key = "\"" || (fst (Double.TryParse(key)))))
                            |> Seq.to_array
                            |> Array.sortBy snd
                            |> Array.rev
                            |> Seq.take 20
                            |> Seq.iter (fun (key, value) -> printfn "%A\t\t%A" key value)
                            printfn "All done!!"
                            data, counter, step

    Every time a new word is reduced, a message is printed out and the result is added to a running list. When everything is done, the list is printed out, after first massaging it to reduce weirdness and limit the number of items. BTW: there are at least two bugs in this code, maybe more (late-night, quick-and-dirty, see-if-the-algo-works kind of coding).

    We want to maximize the number of processors used, so let’s split the books into chunks that can be operated on in parallel. The code below roughly does it (I say roughly because it doesn’t chunk the lines in the right order, but for this particular case it doesn’t matter).

    let gatherer = spawnAgent gathererF (new List<string * int>(), 0, 1000)
    
    let splitBook howManyBlocks fileName =
        let buffers = Array.init howManyBlocks (fun _ -> new StringBuilder())
        fileName
        |> File.ReadAllLines
        |> Array.iteri (fun i line -> buffers.[i % (howManyBlocks)].Append(line) |> ignore)
        buffers
    
    let blocks1 = @"C:\Users\lucabol\Desktop\Agents\Agents\kjv10.txt" |> splitBook 100
    let blocks2 = @"C:\Users\lucabol\Desktop\Agents\Agents\warandpeace.txt" |> splitBook 100
    let input =
        blocks1
        |> Array.append blocks2
        |> Array.mapi (fun i b -> i.ToString(), b.ToString())

    And let’s execute!!

    mapReduce input map reduce gatherer 20 20 partitionF

    On my machine I get the following, which could be the right result.

    "a"        16147
    "And"        13071
    "I"        11349
    "unto"        8125
    "as"        6400
    "her"        5865
    "which"        5544
    "from"        5378
    "at"        5175
    "on"        5155
    "have"        5135
    "me"        5068
    "my"        4629
    "this"        3782
    "out"        3653
    "ye"        3399
    "when"        3312
    "an"        2841
    "upon"        2558
    "so"        2489
    All done!!

    LAgent: an agent framework in F# – Part VIII – Implementing MapReduce (user model)

    Download framework here.


      For this post I use a newer version of the framework that I just uploaded on CodeGallery. In the process of using LAgent I grew more and more unhappy with the weakly typed way of sending messages. The code that implements that feature is nasty: full of upcasts and downcasts. I was losing faith in it. Bugs were cropping up in all sorts of scenarios (e.g. using generic union types as messages).

      In the end I decided to re-architect the framework so as to make it strongly typed. In essence, each agent can now receive messages of just a single type. The limitations that this design choice introduces (e.g. more limited hot swapping) are compensated by catching errors at compile time and by the streamlining of the code. I left the old framework on the site in case you disagree with me.

      In any case, today’s post is about MapReduce. It assumes that you know what it is (link to the original Google paper that served as inspiration is here: Google Research Publication- MapReduce). What would it take to implement an in-memory MapReduce using my agent framework?

      Let’s start with the user model.

      let mapReduce   (inputs:seq<'in_key * 'in_value>)
                      (map:'in_key -> 'in_value -> seq<'out_key * 'out_value>)
                      (reduce:'out_key -> seq<'out_value> -> seq<'reducedValues>)
                      outputAgent
                      M R partitionF =                

      mapReduce takes seven parameters:

      1. inputs: a sequence of input key/value pairs.
      2. map: this function operates on each input key/value pair. It returns a sequence of output key/value pairs. The type of the output sequence can be different from that of the inputs.
      3. reduce: this function operates on an output key and all the values associated with it. It returns a sequence of reduced values (e.g. the average of all the values for this key).
      4. outputAgent: this is the agent that gets notified every time a new output key has been reduced, and again at the end, when the whole operation completes.
      5. M: how many mapper agents to instantiate
      6. R: how many reducer agents to instantiate
      7. partitionF: the partition function used to choose which of the reducers is associated with a key

      Let’s look at how to use this function to find how often each word is used in a set of files. First a simple partition function can be defined as:

      let partitionF = fun key M -> abs(key.GetHashCode()) % M 

      Given a key and a number of buckets, it picks one of the buckets. Its type is 'a -> int -> int, so it’s fairly reusable.

      Let’s also create a basic agent that just prints out the reduced values:

      let printer = spawnWorker (fun msg ->
                                  match msg with
                                  | Reduced(key, value)   -> printfn "%A %A" key value
                                  | MapReduceDone         -> printfn "All done!!")

      The agent gets notified whenever a new key is reduced or the algorithm ends. It is useful to be notified immediately instead of waiting for everything to be done. If I hadn’t written this code using agents, I would not have realized that possibility. I would simply have framed the problem as a function that takes an input and returns an output. Agents force you to think explicitly about the parallelism in your app. That’s a good thing.

      The mapping function simply splits the content of a file into words and adds a word/1 pair to the list for each one. I know that there are much better ways to do this (e.g. regular expressions for the parsing and summing word counts inside the function), but I wanted to test the basic framework capabilities, and doing it this way exercises them better.

      let map = fun (fileName:string) (fileContent:string) ->
                  let l = new List<string * int>()
                  let wordDelims = [|' ';',';';';'.';':';'?';'!';'(';')';'\n';'\t';'\f';'\r';'\b'|]
                  fileContent.Split(wordDelims) |> Seq.iter (fun word -> l.Add((word, 1)))
                  l :> seq<string * int>

      The reducer function simply sums the various word statistics sent by the mappers:

      let reduce = fun key (values:seq<int>) -> [values |> Seq.sum] :> seq<int>

      Now we can create some fake input to check that it works:

      let testInput = ["File1", "I was going to the airport when I saw someone crossing";
      "File2", "I was going home when I saw you coming toward me"]

      And execute the mapReduce:

      mapReduce testInput map reduce printer 2 2 partitionF

      On my machine I get the following. You might get a different order because of the async/parallel processing involved. If I wanted a stable order I would need to change the printer agent to cache results on Reduced and process them on MapReduceDone (see next post).

      "I" [4]

      "crossing" [1]

      "going" [2]

      "home" [1]

      "me" [1]

      "the" [1]

      "toward" [1]

      "airport" [1]

      "coming" [1]

      "saw" [2]

      "someone" [1]

      "to" [1]

      "was" [2]

      "when" [2]

      "you" [1]

      In the next post we’ll process some real books …