Friday, July 25, 2014

Cartesian product (redux)

Cartesian product (redux)

In this blog post we looked at programs to compute Cartesian Products. One algorithm (given here in OCaml) if you recall is
let prod l r =
  let g acc a =
    let f acc x =
      (a, x) :: acc
    in List.fold_left f acc r
  in List.fold_left g [] l |> List.rev

In that earlier post I translated the above program into C++. This time I'm doing the same straightforward translation but using the Boost.MPL library to compute such products at compile time. It looks like this:

#include <boost/mpl/pair.hpp>
#include <boost/mpl/vector.hpp>
#include <boost/mpl/push_front.hpp>
#include <boost/mpl/fold.hpp>
#include <boost/mpl/reverse.hpp>
#include <boost/mpl/placeholders.hpp>

using namespace boost::mpl::placeholders;

template <class L, class R>
struct product
{
  struct g {
    template <class AccT, class A>
    struct apply {
       typedef typename boost::mpl::fold <
         R, AccT
        , boost::mpl::push_front<_1, boost::mpl::pair<A, _2> > 
         >::type type;
    };
  };

  typedef typename boost::mpl::reverse <
    typename boost::mpl::fold <
                  L, boost::mpl::vector<>, g>::type>::type type;
};
The translation proceeds almost mechanically! Does it work though? You betcha! Here's a test we can convince ourselves with:
#include <boost/mpl/equal.hpp>
#include <boost/mpl/for_each.hpp>
#include <boost/mpl/int.hpp>

#include <iostream>

using namespace boost::mpl;

typedef vector<int_<1>, int_<2> > A;
typedef vector<int_<3>, int_<4>, int_<5> > B;
typedef product <A, B>::type result;

BOOST_MPL_ASSERT((
  equal<
    result
  , vector<
         pair<int_<1>, int_<3> >
       , pair<int_<1>, int_<4> >
       , pair<int_<1>, int_<5> >
       , pair<int_<2>, int_<3> >
       , pair<int_<2>, int_<4> >
       , pair<int_<2>, int_<5> >
      > 
  > ));

struct print_class_name {
    template <typename T>
    void operator()( T t ) const {
       std::cout << typeid(t).name() << " ";
    }
};

int main ()
{
  boost::mpl::for_each<result>(print_class_name());

  return 0;
}
The takeaway is, learning yourself some functional programming is a great way to improve your template meta-programming fu! (That of course should come as no surprise... )

Friday, July 18, 2014

Merge sort

Merge sort

Here's another divide and conquer algorithm. This one is for sorting a sequence.

Conceptually, a merge sort works like this (see http://en.wikipedia.org/wiki/Merge_sort):

  • Divide the unsorted list into n sub-lists, each containing one element (a list of one element is trivially sorted);
  • Repeatedly merge sublists to produce new sorted sub-lists until there is only one sub-list remaining : this will be the sorted list.

In the following OCaml, we are drawing on inspiration from the the Haskell standard prelude for the part of the algorithm concerned with dividing the unsorted list : functions take, drop and split.

(**{b Merge sort}, an {i O(n log n)} comparison based sorting
   algorithm *)
module type MERGESORT = sig

  (**[take n] applied to a list [xs], returns the prefix of [xs] of
     length [n] or [xs] itself if [n > List.length xs] e.g. [take 2
     [1; 2; 3]] {i = } [[1; 2]]*)
  val take : int -> 'a list -> 'a list

  (**[drop n] applied to a list [xs], returns the suffix of [xs]
     after the first [n] elements or, [[]] if [n > List.length xs]
     e.g. [drop 2 [1; 2; 3]] {i = } [[3]]*)
  val drop : int -> 'a list -> 'a list

  (**[split n xs] is equivalent to [take n xs, drop n xs] e.g.
     [split 2 [1; 2; 3]] {i = } [([1; 2], [3])]*)
  val split : int -> 'a list -> 'a list * 'a list

  (**[merge] given two {b sorted} sequences [xs] and [ys] returns a
     single sorted sequence of the elements of [xs] and [ys]
     e.g. [merge [1; 3] [2; 4]] {i = } [[1; 2; 3; 4]]*)
  val merge : 'a list -> 'a list -> 'a list

  (**[merge_sort] works by splitting a sequence into two parts,
     recursively sorting the two parts and merging the results into
     a single sorted sequence e.g. [merge_sort [1; 2; -1; 0; 3]] 
     {i = } [[-1; 0; 1; 2; 3]]*)
  val merge_sort : 'a list -> 'a list
end

module Merge_sort : MERGESORT = struct

  let rec take k l =
    match (k, l) with
    | n, _ when n <= 0 -> []
    | _, [] -> []
    | n, (x :: xs) -> x :: take (n - 1) xs

  let rec drop k l =
    match (k, l) with
    | n, xs when n <= 0 -> xs
    | _, [] -> []
    | n, (_ :: xs) -> drop (n - 1) xs

  let rec split k l = take k l, drop k l

  let rec merge l m =
    match (l, m) with
    | [], ys -> ys
    | xs, [] -> xs
    | ((x :: xs) as t), ((y :: ys) as s) -> 
      if x <= y then x :: (merge xs s) else y :: (merge t ys)
        
  let rec merge_sort l =
    let i = (List.length l) / 2 in
    if i = 0 then l
    else
      let u, v = split i l in
      let xs, ys = merge_sort u, merge_sort v in
      merge xs ys

end

We can test our little program in the top-level like this:

# let l = Merge_sort.merge_sort [-1; 2; 0; 4; 3; 5];;
val l : int list = [-1; 0; 2; 3; 4; 5]

Here are the functions in C++ phrased as range algorithms.

//"c:/users/fletch/desktop/setenv.cmd"
//cl /EHsc /Femerge.exe /I c:/project/boost_1_55_0 merge.cpp 

#include <boost/next_prior.hpp>
#include <boost/range.hpp>
#include <boost/range/algorithm/copy.hpp>

#include <vector>
#include <cstdlib>

namespace algo
{
  //take

  template <class S, class D>
  D take (std::size_t n, S src, D dst)
  {
    typedef boost::range_iterator<S>::type it_t;
    it_t curr = boost::begin (src), last = boost::end (src);
    if (n <= 0) return dst;
    if (boost::empty (src))  return dst;
    take (n-1, S (boost::next (curr), last), *dst++ = *curr);
    return dst;
  }

  //drop

  template <class S, class D>
  D drop (std::size_t n, S src, D dst)
  {
    typedef boost::range_iterator<S>::type it_t;
    it_t curr = boost::begin (src), last = boost::end (src);
    if (n <= 0) return boost::range::copy (src, dst);
    if (boost::empty (src))return dst;
    drop (n-1, S (boost::next (curr), last), dst);
    return dst;
  }

  //split

  template <class S, class D1, class D2>
  void split (int n, S src, D1 dst1, D2 dst2)
  { take (n, src, dst1); drop (n, src, dst2); }

  //merge

  template <class S1, class S2, class D>
  D merge (S1 src1, S2 src2, D dst)
  {
    typedef boost::range_iterator<S1>::type it1_t;
    typedef boost::range_iterator<S2>::type it2_t;
    typedef std::iterator_traits<it1_t>::value_type value1_t;
    typedef std::iterator_traits<it2_t>::value_type value2_t;
    if (boost::empty (src1)) return boost::copy (src2, dst);
    if (boost::empty (src2)) return boost::copy (src1, dst);
    it1_t curr1 = boost::begin (src1), last1 = boost::end (src1);
    it2_t curr2 = boost::begin (src2), last2 = boost::end (src2);
    value1_t x = *curr1;
    value2_t y = *curr2;
    if (x <= y)
      merge (S1 (boost::next (curr1), last1), src2, *dst++ = x);
    else
      merge (src1, S2 (boost::next (curr2), last2), *dst++ = y);
    return dst;
  }

  //merge_sort

  template <class S, class D>
  D merge_sort (S src, D dst)
  {
    typedef boost::range_iterator<S>::type it_t;
    typedef std::iterator_traits<it_t>::value_type value_t;
    std::size_t i = boost::size (src)/2;
    if (i == 0) return boost::range::copy (src, dst);
    std::vector<value_t> u, v;
    split (i, src, std::back_inserter (u), std::back_inserter (v));
    std::vector<value_t> xs, ys;
    merge_sort (u, std::back_inserter (xs));
    merge_sort (v, std::back_inserter (ys));
    merge (xs, ys, dst);
    return dst;
  }
  
}//namespace algo

The following usage example provides a lightweight test.

#include <boost/assign/list_of.hpp>

#include <utility>
#include <iterator>
#include <iostream>

int main ()
{
  using std::pair;
  using std::make_pair;
  using std::ostream_iterator;
  using boost::assign::list_of;

  int ord[] = {1, 2, 3, 4};

  auto src=make_pair(ord, ord + 4);
  auto dst=ostream_iterator<int>(std::cout, ", ");

  std::cout << "\ntake ():\n";

  algo::take (0u, src, dst); std::cout << std::endl;
  algo::take (1u, src, dst); std::cout << std::endl;
  algo::take (2u, src, dst); std::cout << std::endl;
  algo::take (3u, src, dst); std::cout << std::endl;
  algo::take (4u, src, dst); std::cout << std::endl;
  algo::take (5u, src, dst); std::cout << std::endl;

  std::cout << "\ndrop ():\n";

  algo::drop (5u, src, dst); std::cout << std::endl;
  algo::drop (4u, src, dst); std::cout << std::endl;
  algo::drop (3u, src, dst); std::cout << std::endl;
  algo::drop (2u, src, dst); std::cout << std::endl;
  algo::drop (1u, src, dst); std::cout << std::endl;
  algo::drop (0u, src, dst); std::cout << std::endl;

  std::cout << "\nsplit ():\n\n";

  algo::split (0u, src, dst, dst); std::cout << std::endl;
  algo::split (1u, src, dst, dst); std::cout << std::endl;
  algo::split (2u, src, dst, dst); std::cout << std::endl;
  algo::split (3u, src, dst, dst); std::cout << std::endl;
  algo::split (4u, src, dst, dst); std::cout << std::endl;
  algo::split (5u, src, dst, dst); std::cout << std::endl;

  std::cout << "\nmerge ():\n\n";

  std::vector <int> l=list_of(-1)(2), r=list_of(0)(1);
  algo::merge (l, r, dst); std::cout << std::endl;

  std::cout << "\nmerge_sort ():\n\n";

  int unord[] = {-1, 2, 0, 4, 3, 5};
  algo::merge_sort (make_pair (unord, unord + 6), dst);

  return 0;
}
The above program produces the following output.
take ():

1,
1, 2,
1, 2, 3,
1, 2, 3, 4,
1, 2, 3, 4,

drop ():


4,
3, 4,
2, 3, 4,
1, 2, 3, 4,

split ():

1, 2, 3, 4,
1, 2, 3, 4,
1, 2, 3, 4,
1, 2, 3, 4,
1, 2, 3, 4,
1, 2, 3, 4,

merge ():

-1, 0, 1, 2,

merge_sort ():

-1, 0, 2, 3, 4, 5,

Saturday, June 21, 2014

Permutations

Permutations

Combinations are selections that disregard order. The powerset P(S) of a set S is the set of all possible combinations of the n elements of S. There are 2n of these. Permutations are arrangements of sequences into orders. The number of permutations of a set is denoted n! For example, the set {x, y, z} has 6 permutations : {x, y, z}, {x, z, y}, {y, x, z}, {y, z, x}, {z, y, x} and {z, x, y}. Suppose f exists that for given k provides all permutations of S that start with sk. All permutations could then be found by f with k ranging over [0, N - 1]. But f trivially exists as it can be computed by prepending sk to the permutations of the smaller set S - { sk }. Here are two programs to compute the permutations of a set based on this idea.

Thursday, June 19, 2014

Powerset

Powerset

If S is a set, then P(S), the 'powerset' of S is the set of all subsets of S including the empty set and S itself. If S has cardinality N then the cardinality of P(S) is 2N (why?). That is, there are 2N subsets associated with S.

Here's a function to compute P(S) in OCaml. It's an instance of the 'divide and conquer' strategy of problem solving.

  let rec sets l =
    match l with
    | [] -> [[]]
    | x :: xs -> let l = sets xs in 
                   l @ (List.map (fun y -> x :: y) l)

This program translates to C++ naturally and with brevity thanks to C++ 11 lambdas.

#include <boost/utility.hpp>

#include <set>
#include <iterator>
#include <algorithm>

template <class I, class D>
D sets (I begin, I end, D dst)
{
  typedef typename std::iterator_traits<I>::value_type value_type;
  typedef std::set<value_type> set_t;

  if (begin == end)
    {
      *dst++ = set_t (); //the empty set
    }
  else
    {
      std::set<set_t> l;
      std::set<set_t>::iterator back=l.end ();
      sets (boost::next (begin), end, std::inserter (l, back));

      std::transform (l.begin (), l.end (), dst, 
        [](set_t const& s) -> set_t const& { return s; });
      std::transform (l.begin (), l.end (), dst, 
        [=](set_t s) -> set_t { s.insert (*begin); return s; });
    }

  return dst;
}

Thursday, June 12, 2014

Cartesian product

Cartesian product

For two sets $A$ and $B$, the Cartesian product denoted $A \times B$ is defined as the set of all ordered pairs $(a, b)$ where $a \in A$ and $b \in B$. The obvious algorithm to compute a Cartesian product in OCaml can is simple!
let prod l r =
  let g acc a =
    let f acc x =
      (a, x) :: acc
    in List.fold_left f acc r
  in List.fold_left g [] l |> List.rev

A straight-forward translation of the above program into C++ yields this.

#include <boost/range.hpp>
#include <numeric>
#include <iterator>

template <class T>
struct _inner
{
  T a;
  _inner (T a) : a (a) {}
  template <class ItT>
    ItT operator ()(ItT acc, T const& x) const {
    return *acc++ = std::make_pair (a, x);
  }
};
template <class T> 
  _inner<T> inner (T a) { return _inner<T> (a); }

template <class RangeT>
struct _outer
{
  RangeT r;
  _outer (RangeT r) : r (r){}
  template <class T, class ItT>
    ItT operator ()(ItT acc, T const& a) const {
      return std::accumulate (
        boost::begin (r), boost::end(r), acc, inner (a));
  }
};
template <class RangeT>
  _outer<RangeT> outer (RangeT r) { return _outer<RangeT>(r); }

template <class R1, class R2, class ItT>  
ItT prod (R1 A, R2 B, ItT dst) {
  return std::accumulate (
    boost::begin (A), boost::end (A), dst, outer (B));
}

That's a lot more code than in the original OCaml program. But wait... C++11 lambda expressions to the rescue! We can eliminate most of that 'extra' code recovering the elegance of the original.

template <class R1, class R2, class ItT>  
ItT prod (R1 A, R2 B, ItT dst) {

  typedef ItT iterator;
  typedef boost::range_value<R1>::type alpha;
  typedef boost::range_value<R2>::type  beta;

  return std::accumulate (
   boost::begin (A), boost::end (A), dst, 
    [=] (iterator acc, alpha const& a) { 
      return std::accumulate (
        boost::begin (B), boost::end (B), acc, 
        [=] (ItT acc, beta const& x) {
          return *acc++ = std::make_pair (a, x); });
     });
}

Now, my buddy Juan Alday tells me that we can expect more improvements in C++14 relating to lambdas. I hope to persuade him to write more about that here in the next couple of weeks. Stay tuned!

Update: Juan advises that with C++ 14 'generic' lambdas, we can further get it down to this.

auto prod = [](auto const& A, auto const& B, auto dst) {
   return std::accumulate(std::begin(A), std::end(A), dst,
     [&B](auto& output, auto valA) {
      return std::accumulate(std::begin(B), std::end(B), output,
       [&valA](auto& output, auto valB) { 
         std::get<0>(*output) = std::move(valA); 
         std::get<1>(*output) = std::move(valB)); 
         return ++output;});
     });
 };
That's kind of amazing... There's not even one occurrence of the template keyword in that code!

Thursday, May 15, 2014

Depth first search

Depth first search

Depth first search is an 'elementary graph algorithm'. This is a purely functional formulation.

(*depth-first-search

 "Introduction to Algorithms" - Cormen et. al., 1994

*)
module Char_map = Map.Make (Char)

type graph = (char list) Char_map.t

module type S = sig
  type state
  val string_of_state : state -> string
  val depth_first_search : graph -> state
end

module Dfs : S = struct

  type colors = White|Gray|Black

  type state = {
    d : int Char_map.t ; (*discovery time*)
    f : int Char_map.t ; (*finishing time*)
    pred : char Char_map.t ; (*predecessor*)
    color : colors Char_map.t ; (*vertex colors*)
  }

  let string_of_state {d; f; pred; color} =
    let open Printf in
    let bindings m fmt =
      let b = Char_map.bindings m in
      String.concat ", " (List.map (fun (x,y) -> sprintf fmt x y) b) in
    sprintf " d = {%s}\n f = {%s}\n pred = {%s}\n"
      (bindings d "'%c':'%d'") (bindings f "'%c':'%d'")
      (bindings pred "'%c':'%c'")

  let depth_first_search g =
    let node u (t, {d; f; pred; color}) =
      let rec dfs_visit t u {d; f; pred; color} =
        let edge (t, {d; f; pred; color}) v =
          if Char_map.find v color = White then
            dfs_visit t v {d; f; pred=(Char_map.add v u pred); color}
          else  (t, {d; f; pred; color})
        in
        let t, {d; f; pred; color} =
          let t = t + 1 in
          List.fold_left edge
            (t, {d=Char_map.add u t d; f;
                 pred; color=Char_map.add u Gray color})
            (Char_map.find u g)
        in
        let t = t + 1 in
        t , {d; f=(Char_map.add u t f); pred; color=Char_map.add u Black color}
      in
      if Char_map.find u color = White then dfs_visit t u {d; f; pred; color}
      else (t, {d; f; pred; color})
    in
    let v = List.fold_left (fun acc (x, _) -> x::acc) [] (Char_map.bindings g) in
    let initial_state= 
       {d=Char_map.empty;
        f=Char_map.empty;
        pred=Char_map.empty;
        color=List.fold_right (fun x->Char_map.add x White) v Char_map.empty}
    in
    snd (List.fold_right node v (0, initial_state))

end

(* Test *)

let () =
  let g =
       List.fold_right
          (fun (x, y) -> Char_map.add x y)
          ['u', ['v'; 'x'] ;
           'v',      ['y'] ;
           'w', ['z'; 'y'] ;
           'x',      ['v'] ;
           'y',      ['x'] ;
           'z',      ['z'] ;
          ]
          Char_map.empty
  in
  let s = Dfs.depth_first_search g in
  Printf.printf "%s\n" (Dfs.string_of_state s)

Sunday, April 27, 2014

Being normal

Being normal

Given the sophistication of today's financial markets, it's hard to imagine that until 1973, despite active trading since at least the 17th century, there was no known way of estimating the price of European-style options!

Black, Scholes and Merton changed all that with the development of a mathematical framework of a financial market in their seminal paper "The Pricing of Options and Corporate Liabilities". From that model, one can deduce the celebrated Black-Scholes formula : $p = \theta \cdot \left[ F \cdot \Phi\left( \theta \cdot \left[ \frac{\ln \left( F/K \right)}{\sigma} + \frac{\sigma}{2}\right] \right) - K \cdot \Phi\left( \theta \cdot \left[ \frac{\ln \left( F/K \right)}{\sigma} - \frac{\sigma}{2} \right] \right) \right]$ where $\theta = 1$ for call options and $\theta = -1$ for put options, $F := Se^{\left( r-d \right)T}$, $S$ is the current spot, $K$ is the strike, $r$ is a flat continuously compounded interest rate to maturity, $d$ is a flat continuously compounded dividend rate, $\sigma = \hat{\sigma} \cdot \sqrt{T}$, $\hat{\sigma}$ is the root-mean-square lognormal volatility, $T$ is the time to expiry and $\Phi(\cdot)$ is the standard cumulative normal distribution function.

Now, the only real problem with evaluating the above equation is in estimating the the values of the cumulative distribution function of the normal distribution ($\Phi$). This is a function that is defined by an integral that cannot be expressed in terms of elementary functions and is thus said to be one of a family of special functions.

One approach to obtaining a high quality, cross platform implementation of $\Phi$ in OCaml is to leverage the C++ Boost Math Toolkit library. First the C wrapper (let's write this in a file 'special_functions_c.cpp' (say)).

#include <boost/math/distributions/normal.hpp>

#include <caml/mlvalues.h>
#include <caml/alloc.h>

extern "C" value caml_norm_cdf (value v)
{
  using boost::math::cdf;
  using boost::math::normal;

  return caml_copy_double (cdf (normal (0.0, 1.0), Double_val (v)));
}
Next, the forwarding OCaml function ('special_functions.ml').
external caml_norm_cdf : float -> float = "caml_norm_cdf"

let norm_cdf x = caml_norm_cdf x
Lastly, the module interface ('special_functions_sig.mli').
module type S = sig
    val norm_cdf : float -> float
end

Et voilĂ . With this in hand, the Black-Scholes equation follows easily!

let black_scholes ~spot ~strike ~r ~vol ~t ~cp =
  let open Special_functions in
  let d1 = (log (spot/.strike) +. 
              (r +. 0.5*.(vol*.vol))*.t)/.(vol*.(sqrt t)) in
  let d2 = d1 -. vol*.(sqrt t) 
  in
    cp*.spot*.(norm_cdf (cp*.d1)) -. 
       cp*.strike*.(exp ((~-.r)*.t))*.(norm_cdf (cp*.d2))
Building the OCaml special functions library above follows the same procedure as covered in this blog entry. To wrap things up (and marvel at our handiwork,) here's a little test.
let _ = 
  Printf.printf "%.8f\n" 
    (black_scholes ~spot:42. ~strike:40. ~r:0.1 ~vol:0.2 ~t:0.5 ~cp:1.0) ;
  Printf.printf "%.8f\n" 
    (black_scholes ~spot:42. ~strike:40. ~r:0.1 ~vol:0.2 ~t:0.5 ~cp:(~-.1.0))
I observe the values 4.75942239 for the call and, 0.80859937 for the put, respectively.