Monday, August 25, 2014

Terms With Variables (C++)

Terms with Variables (C++)

In this earlier post I showed a nifty OCaml type for modeling terms with variables for problems involving substitutions. I got interested in what it would take to implement the type in C++(03) (doing 'sum' types in C++ elegantly is a perennial problem). It ain't nowhere near as succinct but we got there nonetheless.
#include <list>

#include <boost/variant.hpp>

/*
  type ('a, 'b) term =
    | Term of 'a * ('a, 'b) term list
    | Var of 'b
*/

template <class A, class B> struct term;
template <class B> struct var;

template <class A, class B>
struct make_tree
{
  typedef boost::variant <
    boost::recursive_wrapper<term <A, B> >,
    boost::recursive_wrapper<var<B> > >   type;
};

template <class A, class B>
struct term
{
  typedef typename make_tree <A, B>::type tree;
  A a;
  std::list <tree> children;
  term (A a
    , std::list<tree> const& children)
    :       a (a), children (children)
  {}
};

template <class A, class B> 
inline term <A, B> make_term (
  A a, std::list<typename make_tree<A, B>::type> const& c, B const*)
{
  return term<A, B> (a, c);
}

template <class B> 
struct var
{
  B tag;
  var (B tag) : tag (tag) {}
};

template <class B>
inline var<B> make_var (B tag) { return var<B> (tag); }
For example, this little program builds the term represented by the concrete syntax "a(b(), c)".
int main ()
{
  typedef make_tree<std::string, std::string>::type tree;
  typedef std::list<tree> term_list;
  std::string const* tag_str=(std::string const*)0L;

  // a(b(), c)
  term_list l;
  l.push_back (make_term(std::string("b"), term_list (), tag_str));
  l.push_back (make_var(std::string("c")));
  tree t = make_term(std::string("a"), l, tag_str);
  
  return 0;
}

Saturday, August 9, 2014

Digits

Digits

Counting is a common problem in computer programming. I recently had a need for an algorithm to compute the number of base 10 digits of an integer (for example, there are 3 digits in the number 569). Here's what I worked out.
module type S=sig
    (**Count the number of base 10 digits of an integer*)
    val digits : int -> int 

    (**Prepend with zeroes as necessary to get the 'right'
       number of columns *)
    val digits_10 : int -> int -> string

end

module A : S = struct 
  let digits n =
    let rec loop acc n =
      let k = n / 10 in
      if k = 0 then acc else loop (acc + 1) k
    in loop 1 n

  let digits_10 i n =
    let l = digits n in
    let s = string_of_int (abs n) in
    let maybe_sign = if n < 0 then "-" else "" in
    if l >= i then (maybe_sign^s) 
    else
      let lz=string_of_int 0 in
      let rec aux acc j = 
        if j = 0 then acc 
        else 
          aux (lz^acc) (j - 1) in
      maybe_sign ^ aux s (i - l)
end

module Test_io = struct

  let test_0 () = 
    let print_line i = 
      Printf.printf "There are %d digits in %d\n" (A.digits i) i
    in
    let rec loop i n =
      if i = n then () 
      else
        let x  =
          int_of_float (10. ** (float_of_int i)) in
        print_line x; loop (i + 1) n
    in (*Range over 0 to to 10^19 *) loop 0 19

  let test_1 () =
    let print_line i = 
      Printf.printf "%s\n" i
    in
    let rec loop i n =
      if i = n then () 
      else
        let x = A.digits_10 3 i in
        print_line x; loop (i + 1) n
    in (*Range over 0 to 100, 3 columns*)loop 0 100

  let run () = test_0 (); test_1 ()

end

let () = Test_io.run ()
The program outputs
There are 1 digits in 1
There are 2 digits in 10
There are 3 digits in 100
There are 4 digits in 1000
There are 5 digits in 10000
There are 6 digits in 100000
There are 7 digits in 1000000
There are 8 digits in 10000000
There are 9 digits in 100000000
There are 10 digits in 1000000000
There are 11 digits in 10000000000
There are 12 digits in 100000000000
There are 13 digits in 1000000000000
There are 14 digits in 10000000000000
There are 15 digits in 100000000000000
There are 16 digits in 1000000000000000
There are 17 digits in 10000000000000000
There are 18 digits in 100000000000000000
There are 19 digits in 1000000000000000000
000
001
002
003
004
005
006
 .
 .
 .
097
098
099

Fun, fun, fun!

Friday, July 25, 2014

Cartesian product (redux)

Cartesian product (redux)

In this blog post we looked at programs to compute Cartesian Products. One algorithm (given here in OCaml) if you recall is
let prod l r =
  let g acc a =
    let f acc x =
      (a, x) :: acc
    in List.fold_left f acc r
  in List.fold_left g [] l |> List.rev

In that earlier post I translated the above program into C++. This time I'm doing the same straightforward translation but using the Boost.MPL library to compute such products at compile time. It looks like this:

#include <boost/mpl/pair.hpp>
#include <boost/mpl/vector.hpp>
#include <boost/mpl/push_front.hpp>
#include <boost/mpl/fold.hpp>
#include <boost/mpl/reverse.hpp>
#include <boost/mpl/placeholders.hpp>

using namespace boost::mpl::placeholders;

template <class L, class R>
struct product
{
  struct g {
    template <class AccT, class A>
    struct apply {
       typedef typename boost::mpl::fold <
         R, AccT
        , boost::mpl::push_front<_1, boost::mpl::pair<A, _2> > 
         >::type type;
    };
  };

  typedef typename boost::mpl::reverse <
    typename boost::mpl::fold <
                  L, boost::mpl::vector<>, g>::type>::type type;
};
The translation proceeds almost mechanically! Does it work though? You betcha! Here's a test we can convince ourselves with:
#include <boost/mpl/equal.hpp>
#include <boost/mpl/for_each.hpp>
#include <boost/mpl/int.hpp>

#include <iostream>

using namespace boost::mpl;

typedef vector<int_<1>, int_<2> > A;
typedef vector<int_<3>, int_<4>, int_<5> > B;
typedef product <A, B>::type result;

BOOST_MPL_ASSERT((
  equal<
    result
  , vector<
         pair<int_<1>, int_<3> >
       , pair<int_<1>, int_<4> >
       , pair<int_<1>, int_<5> >
       , pair<int_<2>, int_<3> >
       , pair<int_<2>, int_<4> >
       , pair<int_<2>, int_<5> >
      > 
  > ));

struct print_class_name {
    template <typename T>
    void operator()( T t ) const {
       std::cout << typeid(t).name() << " ";
    }
};

int main ()
{
  boost::mpl::for_each<result>(print_class_name());

  return 0;
}
The takeaway is, learning yourself some functional programming is a great way to improve your template meta-programming fu! (That of course should come as no surprise... )

Friday, July 18, 2014

Merge sort

Merge sort

Here's another divide and conquer algorithm. This one is for sorting a sequence.

Conceptually, a merge sort works like this (see http://en.wikipedia.org/wiki/Merge_sort):

  • Divide the unsorted list into n sub-lists, each containing one element (a list of one element is trivially sorted);
  • Repeatedly merge sublists to produce new sorted sub-lists until there is only one sub-list remaining : this will be the sorted list.

In the following OCaml, we are drawing on inspiration from the the Haskell standard prelude for the part of the algorithm concerned with dividing the unsorted list : functions take, drop and split.

(**{b Merge sort}, an {i O(n log n)} comparison based sorting
   algorithm *)
module type MERGESORT = sig

  (**[take n] applied to a list [xs], returns the prefix of [xs] of
     length [n] or [xs] itself if [n > List.length xs] e.g. [take 2
     [1; 2; 3]] {i = } [[1; 2]]*)
  val take : int -> 'a list -> 'a list

  (**[drop n] applied to a list [xs], returns the suffix of [xs]
     after the first [n] elements or, [[]] if [n > List.length xs]
     e.g. [drop 2 [1; 2; 3]] {i = } [[3]]*)
  val drop : int -> 'a list -> 'a list

  (**[split n xs] is equivalent to [take n xs, drop n xs] e.g.
     [split 2 [1; 2; 3]] {i = } [([1; 2], [3])]*)
  val split : int -> 'a list -> 'a list * 'a list

  (**[merge] given two {b sorted} sequences [xs] and [ys] returns a
     single sorted sequence of the elements of [xs] and [ys]
     e.g. [merge [1; 3] [2; 4]] {i = } [[1; 2; 3; 4]]*)
  val merge : 'a list -> 'a list -> 'a list

  (**[merge_sort] works by splitting a sequence into two parts,
     recursively sorting the two parts and merging the results into
     a single sorted sequence e.g. [merge_sort [1; 2; -1; 0; 3]] 
     {i = } [[-1; 0; 1; 2; 3]]*)
  val merge_sort : 'a list -> 'a list
end

module Merge_sort : MERGESORT = struct

  let rec take k l =
    match (k, l) with
    | n, _ when n <= 0 -> []
    | _, [] -> []
    | n, (x :: xs) -> x :: take (n - 1) xs

  let rec drop k l =
    match (k, l) with
    | n, xs when n <= 0 -> xs
    | _, [] -> []
    | n, (_ :: xs) -> drop (n - 1) xs

  let rec split k l = take k l, drop k l

  let rec merge l m =
    match (l, m) with
    | [], ys -> ys
    | xs, [] -> xs
    | ((x :: xs) as t), ((y :: ys) as s) -> 
      if x <= y then x :: (merge xs s) else y :: (merge t ys)
        
  let rec merge_sort l =
    let i = (List.length l) / 2 in
    if i = 0 then l
    else
      let u, v = split i l in
      let xs, ys = merge_sort u, merge_sort v in
      merge xs ys

end

We can test our little program in the top-level like this:

# let l = Merge_sort.merge_sort [-1; 2; 0; 4; 3; 5];;
val l : int list = [-1; 0; 2; 3; 4; 5]

Here are the functions in C++ phrased as range algorithms.

//"c:/users/fletch/desktop/setenv.cmd"
//cl /EHsc /Femerge.exe /I c:/project/boost_1_55_0 merge.cpp 

#include <boost/next_prior.hpp>
#include <boost/range.hpp>
#include <boost/range/algorithm/copy.hpp>

#include <vector>
#include <cstdlib>

namespace algo
{
  //take

  template <class S, class D>
  D take (std::size_t n, S src, D dst)
  {
    typedef boost::range_iterator<S>::type it_t;
    it_t curr = boost::begin (src), last = boost::end (src);
    if (n <= 0) return dst;
    if (boost::empty (src))  return dst;
    take (n-1, S (boost::next (curr), last), *dst++ = *curr);
    return dst;
  }

  //drop

  template <class S, class D>
  D drop (std::size_t n, S src, D dst)
  {
    typedef boost::range_iterator<S>::type it_t;
    it_t curr = boost::begin (src), last = boost::end (src);
    if (n <= 0) return boost::range::copy (src, dst);
    if (boost::empty (src))return dst;
    drop (n-1, S (boost::next (curr), last), dst);
    return dst;
  }

  //split

  template <class S, class D1, class D2>
  void split (int n, S src, D1 dst1, D2 dst2)
  { take (n, src, dst1); drop (n, src, dst2); }

  //merge

  template <class S1, class S2, class D>
  D merge (S1 src1, S2 src2, D dst)
  {
    typedef boost::range_iterator<S1>::type it1_t;
    typedef boost::range_iterator<S2>::type it2_t;
    typedef std::iterator_traits<it1_t>::value_type value1_t;
    typedef std::iterator_traits<it2_t>::value_type value2_t;
    if (boost::empty (src1)) return boost::copy (src2, dst);
    if (boost::empty (src2)) return boost::copy (src1, dst);
    it1_t curr1 = boost::begin (src1), last1 = boost::end (src1);
    it2_t curr2 = boost::begin (src2), last2 = boost::end (src2);
    value1_t x = *curr1;
    value2_t y = *curr2;
    if (x <= y)
      merge (S1 (boost::next (curr1), last1), src2, *dst++ = x);
    else
      merge (src1, S2 (boost::next (curr2), last2), *dst++ = y);
    return dst;
  }

  //merge_sort

  template <class S, class D>
  D merge_sort (S src, D dst)
  {
    typedef boost::range_iterator<S>::type it_t;
    typedef std::iterator_traits<it_t>::value_type value_t;
    std::size_t i = boost::size (src)/2;
    if (i == 0) return boost::range::copy (src, dst);
    std::vector<value_t> u, v;
    split (i, src, std::back_inserter (u), std::back_inserter (v));
    std::vector<value_t> xs, ys;
    merge_sort (u, std::back_inserter (xs));
    merge_sort (v, std::back_inserter (ys));
    merge (xs, ys, dst);
    return dst;
  }
  
}//namespace algo

The following usage example provides a lightweight test.

#include <boost/assign/list_of.hpp>

#include <utility>
#include <iterator>
#include <iostream>

int main ()
{
  using std::pair;
  using std::make_pair;
  using std::ostream_iterator;
  using boost::assign::list_of;

  int ord[] = {1, 2, 3, 4};

  auto src=make_pair(ord, ord + 4);
  auto dst=ostream_iterator<int>(std::cout, ", ");

  std::cout << "\ntake ():\n";

  algo::take (0u, src, dst); std::cout << std::endl;
  algo::take (1u, src, dst); std::cout << std::endl;
  algo::take (2u, src, dst); std::cout << std::endl;
  algo::take (3u, src, dst); std::cout << std::endl;
  algo::take (4u, src, dst); std::cout << std::endl;
  algo::take (5u, src, dst); std::cout << std::endl;

  std::cout << "\ndrop ():\n";

  algo::drop (5u, src, dst); std::cout << std::endl;
  algo::drop (4u, src, dst); std::cout << std::endl;
  algo::drop (3u, src, dst); std::cout << std::endl;
  algo::drop (2u, src, dst); std::cout << std::endl;
  algo::drop (1u, src, dst); std::cout << std::endl;
  algo::drop (0u, src, dst); std::cout << std::endl;

  std::cout << "\nsplit ():\n\n";

  algo::split (0u, src, dst, dst); std::cout << std::endl;
  algo::split (1u, src, dst, dst); std::cout << std::endl;
  algo::split (2u, src, dst, dst); std::cout << std::endl;
  algo::split (3u, src, dst, dst); std::cout << std::endl;
  algo::split (4u, src, dst, dst); std::cout << std::endl;
  algo::split (5u, src, dst, dst); std::cout << std::endl;

  std::cout << "\nmerge ():\n\n";

  std::vector <int> l=list_of(-1)(2), r=list_of(0)(1);
  algo::merge (l, r, dst); std::cout << std::endl;

  std::cout << "\nmerge_sort ():\n\n";

  int unord[] = {-1, 2, 0, 4, 3, 5};
  algo::merge_sort (make_pair (unord, unord + 6), dst);

  return 0;
}
The above program produces the following output.
take ():

1,
1, 2,
1, 2, 3,
1, 2, 3, 4,
1, 2, 3, 4,

drop ():


4,
3, 4,
2, 3, 4,
1, 2, 3, 4,

split ():

1, 2, 3, 4,
1, 2, 3, 4,
1, 2, 3, 4,
1, 2, 3, 4,
1, 2, 3, 4,
1, 2, 3, 4,

merge ():

-1, 0, 1, 2,

merge_sort ():

-1, 0, 2, 3, 4, 5,

Saturday, June 21, 2014

Permutations

Permutations

Combinations are selections that disregard order. The powerset P(S) of a set S is the set of all possible combinations of the n elements of S. There are 2n of these. Permutations are arrangements of sequences into orders. The number of permutations of a set is denoted n! For example, the set {x, y, z} has 6 permutations : {x, y, z}, {x, z, y}, {y, x, z}, {y, z, x}, {z, y, x} and {z, x, y}. Suppose f exists that for given k provides all permutations of S that start with sk. All permutations could then be found by f with k ranging over [0, N - 1]. But f trivially exists as it can be computed by prepending sk to the permutations of the smaller set S - { sk }. Here are two programs to compute the permutations of a set based on this idea.

Thursday, June 19, 2014

Powerset

Powerset

If S is a set, then P(S), the 'powerset' of S is the set of all subsets of S including the empty set and S itself. If S has cardinality N then the cardinality of P(S) is 2N (why?). That is, there are 2N subsets associated with S.

Here's a function to compute P(S) in OCaml. It's an instance of the 'divide and conquer' strategy of problem solving.

  let rec sets l =
    match l with
    | [] -> [[]]
    | x :: xs -> let l = sets xs in 
                   l @ (List.map (fun y -> x :: y) l)

This program translates to C++ naturally and with brevity thanks to C++ 11 lambdas.

#include <boost/utility.hpp>

#include <set>
#include <iterator>
#include <algorithm>

template <class I, class D>
D sets (I begin, I end, D dst)
{
  typedef typename std::iterator_traits<I>::value_type value_type;
  typedef std::set<value_type> set_t;

  if (begin == end)
    {
      *dst++ = set_t (); //the empty set
    }
  else
    {
      std::set<set_t> l;
      std::set<set_t>::iterator back=l.end ();
      sets (boost::next (begin), end, std::inserter (l, back));

      std::transform (l.begin (), l.end (), dst, 
        [](set_t const& s) -> set_t const& { return s; });
      std::transform (l.begin (), l.end (), dst, 
        [=](set_t s) -> set_t { s.insert (*begin); return s; });
    }

  return dst;
}

Thursday, June 12, 2014

Cartesian product

Cartesian product

For two sets $A$ and $B$, the Cartesian product denoted $A \times B$ is defined as the set of all ordered pairs $(a, b)$ where $a \in A$ and $b \in B$. The obvious algorithm to compute a Cartesian product in OCaml can is simple!
let prod l r =
  let g acc a =
    let f acc x =
      (a, x) :: acc
    in List.fold_left f acc r
  in List.fold_left g [] l |> List.rev

A straight-forward translation of the above program into C++ yields this.

#include <boost/range.hpp>
#include <numeric>
#include <iterator>

template <class T>
struct _inner
{
  T a;
  _inner (T a) : a (a) {}
  template <class ItT>
    ItT operator ()(ItT acc, T const& x) const {
    return *acc++ = std::make_pair (a, x);
  }
};
template <class T> 
  _inner<T> inner (T a) { return _inner<T> (a); }

template <class RangeT>
struct _outer
{
  RangeT r;
  _outer (RangeT r) : r (r){}
  template <class T, class ItT>
    ItT operator ()(ItT acc, T const& a) const {
      return std::accumulate (
        boost::begin (r), boost::end(r), acc, inner (a));
  }
};
template <class RangeT>
  _outer<RangeT> outer (RangeT r) { return _outer<RangeT>(r); }

template <class R1, class R2, class ItT>  
ItT prod (R1 A, R2 B, ItT dst) {
  return std::accumulate (
    boost::begin (A), boost::end (A), dst, outer (B));
}

That's a lot more code than in the original OCaml program. But wait... C++11 lambda expressions to the rescue! We can eliminate most of that 'extra' code recovering the elegance of the original.

template <class R1, class R2, class ItT>  
ItT prod (R1 A, R2 B, ItT dst) {

  typedef ItT iterator;
  typedef boost::range_value<R1>::type alpha;
  typedef boost::range_value<R2>::type  beta;

  return std::accumulate (
   boost::begin (A), boost::end (A), dst, 
    [=] (iterator acc, alpha const& a) { 
      return std::accumulate (
        boost::begin (B), boost::end (B), acc, 
        [=] (ItT acc, beta const& x) {
          return *acc++ = std::make_pair (a, x); });
     });
}

Now, my buddy Juan Alday tells me that we can expect more improvements in C++14 relating to lambdas. I hope to persuade him to write more about that here in the next couple of weeks. Stay tuned!

Update: Juan advises that with C++ 14 'generic' lambdas, we can further get it down to this.

auto prod = [](auto const& A, auto const& B, auto dst) {
   return std::accumulate(std::begin(A), std::end(A), dst,
     [&B](auto& output, auto valA) {
      return std::accumulate(std::begin(B), std::end(B), output,
       [&valA](auto& output, auto valB) { 
         std::get<0>(*output) = std::move(valA); 
         std::get<1>(*output) = std::move(valB)); 
         return ++output;});
     });
 };
That's kind of amazing... There's not even one occurrence of the template keyword in that code!