Monday, March 24, 2014

Why we need compile-time reflection in C++1y

Programs need data. That's a no-brainer. Programs are only as good as the data you provide them. Based on the kind of data they consume, programs can be divided into two broad categories: (1) those that operate on regular data (a file, say), and (2) those that operate on other programs. Programs of the first kind are abundant. Your browser, for instance, is showing you this page--its data. Programs of the second kind are more interesting, and they are called meta-programs.

Meta-programs need data too. As with other programs, meta-programs are only as good as the data you provide them. So what do we feed them? ... Well, in C++, more important than 'what' is 'when'. (Remember Morpheus?) To the compiler, a C++ program is just a sequence of bits it is trying to understand. While the compiler is making sense of your program, most of it gets translated (to assembly) but some of it gets executed. Quite intriguing! We're talking about compile-time meta-programming.

Coming back to the 'what'. We want to be able to feed whatever is available at compile-time: types, members, functions, arguments, namespaces, line numbers, file names, all are a fair game. Less obvious things are relationships among types: convertibility, parent/child, base/derived, container/iterator, friends, and more.

A C++ compiler already has this information, but not in a form a meta-program can use. So we're in a soup: we can run programs (at compile-time) but there is no data! So the next question is 'how': how do we make the data available to our meta-programs? And that brings me to what I like to call the Curiously Recurring Template Meta-Programming (CRTMP) pattern.

Curiously Recurring Template Meta-Programming Pattern

The idea is rather general and many have done it successfully before: Make data available to meta-programs without offending the compiler and do something interesting with it.

Let's look at the players in this pattern: (1) the compiler, (2) the meta-program, and last but not least (3) the programmer, because machines haven't taken over yet and humans still write most programs as of today.

The compile-time data must make sense to all three above. Today, C++ programmers, because we don't mind pain, create that data in a form that is understood by the first two. The prime examples are the traits idiom, the type_traits library, and sometimes code generators that parse C++ files and spit out relationships between classes. For example, LEESA's gen-meta.py script generates typelists (Boost MPL vectors) for classes that contain other classes (think XML data-binding). Effectively it builds a compile-time tree of the XML node types.
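To make the traits idea concrete, here is a minimal sketch using the standard type_traits library, followed by a hand-written trait in the same style. Base, Derived, and is_serializable are hypothetical names made up for this illustration; they do not come from LEESA or any other library mentioned here.

```cpp
#include <string>
#include <type_traits>

// Hypothetical types used only for illustration.
struct Base {};
struct Derived : Base {};

// Relationships among types that the compiler already knows,
// exposed to meta-programs by the standard library.
static_assert(std::is_base_of<Base, Derived>::value,
              "Derived derives from Base");
static_assert(std::is_convertible<Derived*, Base*>::value,
              "pointer upcast is implicit");
static_assert(!std::is_convertible<Base*, Derived*>::value,
              "pointer downcast is not implicit");
static_assert(std::is_convertible<const char*, std::string>::value,
              "std::string has a converting constructor");

// The traits idiom: data the compiler cannot guess is supplied by hand.
// is_serializable is an assumed, made-up trait; it defaults to false.
template <class T>
struct is_serializable : std::false_type {};

// The programmer opts a type in with an explicit specialization.
template <>
struct is_serializable<Derived> : std::true_type {};
```

The first group of facts comes for free from the compiler; the last one exists only because somebody typed it in. That hand-typed part is what compile-time reflection would make unnecessary.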

When things are not auto-generated, we make them palatable to fellow programmers using macros. To many, macros are as obnoxious as the data they hide/generate, but let's move on. There are many examples of super-charged macros too: Boost SIMD, pre-variadic Boost MPL, smart enumerations, and many more. When macros are used in a clever way (abused!) they really do look like magic. I got first-hand experience of that while developing the RefleX library.

RefleX is a compile-time reflection-based type modeling library in C++ for DDS Topics. It is open-source but you need RTI Connext DDS to play with it. It essentially transforms your native C/C++ type into a serializable type representation called a TypeObject and marshals your data into what is called a DynamicData object. Note that both the type and the data are serialized. There are systems--perhaps many we owe our modern life to--that need to distribute types and data over the network for discovery, interoperability, compatibility, and for other reasons.

Here's an example:


The RTI_ADAPT_STRUCT macro expands to about 120 lines of C++ code, which is primarily reflection information about ShapeType and it can be used at compile-time. It is based on the BOOST_FUSION_ADAPT_STRUCT macro. The macro opens the guts of the specified type to the RefleX library. The meta-programs in RefleX use this "data" to do their business. The reflection information includes member types, member names, enumerations, and other ornaments such as a "key". The point is that the same CRTMP pattern is used to "export" information about a native C++ type.
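As a rough idea of what such exported "data" can look like, here is a hand-written sketch that uses only the standard library. It is not the output of RTI_ADAPT_STRUCT or BOOST_FUSION_ADAPT_STRUCT; the Point type, the reflect template, and sum_members are all hypothetical names invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

// Hypothetical native type (not the ShapeType discussed above).
struct Point {
  int x;
  int y;
};

// Hand-written "reflection data". The primary template is left undefined,
// so using reflect<T> for an unadapted type is a compile-time error.
template <class T>
struct reflect;

template <>
struct reflect<Point> {
  static constexpr std::size_t member_count() { return 2; }
  static constexpr const char* member_name(std::size_t i) {
    return i == 0 ? "x" : "y";
  }
  // Pointers to members let a meta-program reach each field generically.
  static constexpr auto members() {
    return std::make_tuple(&Point::x, &Point::y);
  }
};

// A tiny meta-program that consumes the data: it adds up every member.
template <class T, std::size_t... I>
int sum_impl(const T& obj, std::index_sequence<I...>) {
  const auto ms = reflect<T>::members();
  int total = 0;
  // Expand the pack once per member, in order.
  int unused[] = {0, (total += obj.*std::get<I>(ms), 0)...};
  (void)unused;
  return total;
}

template <class T>
int sum_members(const T& obj) {
  return sum_impl(obj,
                  std::make_index_sequence<reflect<T>::member_count()>{});
}

// Usage, wrapped in a function so the snippet drops into any program.
inline void reflect_demo() {
  Point p{3, 4};
  assert(sum_members(p) == 7);
}
```

Everything inside reflect<Point> repeats what the compiler already knows about Point. A macro can hide that repetition, but it cannot remove it.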

So, the last two open-source C++ libraries I wrote use the CRTMP pattern: In one, "data" is generated using a Python script and in the other using a macro. CRTMP makes C++ libraries remarkably powerful. The reality is there is nothing novel about it. It is seen everywhere.

The natural step in the evolution of an idiom/pattern is first-class language support. If something is so prevalent, the language itself should absorb it and eliminate the crud involved in developing and writing CRTMP-based libraries.

That brings us to the main point of this post: Compile-time Reflection. We need it. Period. It's a natural step of evolution from where C++ is now. When available, it will make a vast amount of compile-time data available to C++ meta-programs. They will run faster, look nicer, and they will knock your socks off! It is mind-boggling what has been achieved using template and preprocessor meta-programming. Compile-time reflection will push it two notches up. So stay tuned for C++1y.

Wednesday, March 12, 2014

Fun with Lambdas: C++14 Style (part 1)

It's common knowledge that Functional Programming is spreading like wildfire in mainstream languages. Latest promoted languages: Java 8 and C++, both of which now support lambdas. So, let the lambdas begin! And may the fun be ever on your side. The same text is available in slides form on Slideshare. This blog post and the talk/slides are inspired by JSON inventor Douglas Crockford.

Write an Identity function that takes an argument and returns the same argument.
auto Identity = [](auto x) {
  return x;
};
Identity(3); // 3
Write 3 functions add, sub, and mul that take 2 parameters each and return their sum, difference, and product respectively.
auto add = [](auto x, auto y) {
  return x + y;
}; 
auto sub = [](auto x, auto y) {
  return x - y;
};
int mul(int x, int y) {
  return x * y;
}

Write a function, identityf, that takes an argument and returns an inner class object that returns that argument.
auto identityf = [](auto x) {
  class Inner {
    int x;
    public: Inner(int i): x(i) {}
    int operator() () { return x; }
  };
  return Inner(x);
};
identityf(5)(); // 5

Write a function, identityf, that takes an argument and returns a function that returns that argument.
auto identityf = [](auto x) {
  return [=]() { return x; };
};
identityf(5)(); // 5

Lambda != Closure
  • A lambda is just syntactic sugar for defining anonymous functions and function objects.
  • A closure in C++ is a function object which closes over the environment in which it was created. The line #2 above defines a closure that closes over x.
  • A lambda is a syntactic construct (expression), and a closure is a run-time object, an instance of a closure type. 
  • C++ closures do not extend the lifetime of their context. (If you need this, use shared_ptr.)
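A small sketch of the last two points, with hypothetical names (make_counter_by_copy, make_shared_counter) that are not part of the exercises. The first closure owns a private copy of its state. The second captures a shared_ptr by copy, so the state lives as long as any closure holding it and is shared between copies.

```cpp
#include <cassert>
#include <memory>

// Capture by copy: every closure object carries its own 'start'.
auto make_counter_by_copy = [](int start) {
  return [start]() mutable { return start++; };
};

// Capture a shared_ptr by copy: the int outlives this call and is
// shared by all copies of the returned closure.
auto make_shared_counter = [](int start) {
  auto state = std::make_shared<int>(start);
  return [state]() { return (*state)++; };
};

// Usage, wrapped in a function so the snippet drops into any program.
inline void closure_demo() {
  auto a = make_counter_by_copy(0);
  auto b = a;        // copies the captured int
  assert(a() == 0);
  assert(a() == 1);
  assert(b() == 0);  // b never saw a's increments

  auto c = make_shared_counter(0);
  auto d = c;        // copies the pointer, not the int
  assert(c() == 0);
  assert(d() == 1);  // d sees c's increment
}
```

Capturing a local variable by reference ([&]) and returning the closure would instead leave it holding a dangling reference, which is what the last bullet warns about.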
Write a function that produces a function that returns values in a range.
// needs #include <stdexcept>
auto fromto = [](auto start, auto finish) {
  return [=]() mutable {
    if(start < finish)
      return start++;
    else
      throw std::runtime_error("Complete");
  };
};
auto range = fromto(0, 10); 
range(); // 0
range(); // 1
Write a function that adds from two invocations.
auto addf = [](auto x) {
  return [=](auto y) { 
    return x+y; 
  };
};
addf(5)(4); // 9
Write a function swap that swaps the arguments of a binary function.
auto swap =[](auto binary) {
  return [=](auto x, auto y) {
    return binary(y, x);
  };
};
swap(sub)(3, 2); // -1
Write a function twice that takes a binary function and returns a unary function that passes its argument to the binary function twice.
auto twice =[](auto binary) {
  return [=](auto x) {
    return binary(x, x);
  };
};
twice(add)(11); // 22
Write a function that takes a binary function and makes it callable with two invocations.
auto applyf = [](auto binary) {
  return [=](auto x) { 
    return [=](auto y) {
      return binary(x, y); 
    };
  };
};
applyf(mul)(3)(4); // 12
Write a function that takes a function and an argument and returns a function that takes the second argument and applies the function.
auto curry = [](auto binary, auto x) {
  return [=](auto y) { 
    return binary(x, y);
  };
};
curry(mul, 3)(4); // 12
Currying (schönfinkeling)
  • Currying is the technique of transforming a function that takes multiple arguments in such a way that it can be called as a chain of functions, each with a single argument.
  • In lambda calculus functions take a single argument only.
  • Must know Currying to understand Haskell.
  • Currying != Partial function application
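To make the difference visible, here is a sketch that applies both techniques to the same three-argument function. The names curry3, partial1, and volume are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>

// A plain three-argument function object.
auto volume = [](int l, int w, int h) { return l * w * h; };

// Currying: turn f(a, b, c) into a chain f(a)(b)(c),
// where every link takes exactly one argument.
auto curry3 = [](auto f) {
  return [=](auto a) {
    return [=](auto b) {
      return [=](auto c) { return f(a, b, c); };
    };
  };
};

// Partial application: fix the first argument now and
// supply all of the remaining arguments in one later call.
auto partial1 = [](auto f, auto a) {
  return [=](auto... rest) { return f(a, rest...); };
};

// Usage, wrapped in a function so the snippet drops into any program.
inline void currying_demo() {
  assert(curry3(volume)(2)(3)(4) == 24);   // three one-argument calls
  assert(partial1(volume, 2)(3, 4) == 24); // one call with what is left
}
```

Currying changes the shape of the function without fixing any argument. Partial application fixes some arguments and leaves a function of smaller arity.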
Partial Function Application
auto addFour = [](auto a, auto b, 
                  auto c, auto d) {
  return a+b+c+d;
};
auto partial = [](auto func, auto a, auto b) {
  return [=](auto c, auto d) { 
    return func(a, b, c, d);
  };
};
partial(addFour,1,2)(3,4); //10
Without creating a new function show 3 ways to create the inc function.
// Three alternatives; a real program would keep only one definition of inc.
auto inc = curry(add, 1);
auto inc = addf(1);
auto inc = applyf(add)(1);
Write a function composeu that takes two unary functions and returns a unary function that calls them both.
auto composeu =[](auto f1, auto f2) {
  return [=](auto x) {
    return f2(f1(x));
  };
};
composeu(inc, curry(mul, 5))(3); // 20
Write a function that returns a function that allows a binary function to be called exactly once.
auto once = [](auto binary) {    
  bool done = false;    
  return [=](auto x, auto y) mutable {
    if(!done) {        
      done = true;        
      return binary(x, y);      
    }      
    else        
      throw std::runtime_error("once!");     
  };  
};
once(add)(3,4); // 7
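The one-liner above only exercises the first call. The sketch below repeats the add and once definitions so it stands alone, then shows the second call throwing. The helpers second_call_throws and copy_has_own_flag are hypothetical names added for this illustration.

```cpp
#include <cassert>
#include <stdexcept>

auto add = [](auto x, auto y) { return x + y; };

auto once = [](auto binary) {
  bool done = false;
  return [=](auto x, auto y) mutable {
    if (!done) {
      done = true;
      return binary(x, y);
    }
    throw std::runtime_error("once!");
  };
};

// Returns true when the first call succeeds and the second one throws.
inline bool second_call_throws() {
  auto add_once = once(add);
  int first = add_once(3, 4);
  try {
    add_once(3, 4);
  } catch (const std::runtime_error&) {
    return first == 7;
  }
  return false;
}

// 'done' is captured by copy, so a copy of the closure made before
// the first call carries its own, still unset, flag.
inline bool copy_has_own_flag() {
  auto f = once(add);
  auto g = f;
  f(1, 2);
  return g(1, 2) == 3;
}
```

If every copy must share a single "called" flag, the flag would have to live in shared state, for example behind a shared_ptr.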
Write a function that takes a binary function and returns a function that takes two arguments and a callback.
auto binaryc = [](auto binary) {    
  return [=](auto x, auto y, auto callbk) {
   return callbk(binary(x,y));    
  };  
};
binaryc(mul)(5, 6, inc); // 31
binaryc(mul)(5, 6, [](int a) { return a+1; }); // 31
Write 3 functions:
  • unit – same as identityf
  • stringify – that stringifies its argument and applies unit to it
  • bind – that takes a result of unit and returns a function that takes a callback and returns the result of callback applied to the result of unit.
auto unit = [](auto x) { 
  return [=]() { return x; };
};
// needs #include <sstream>
auto stringify = [](auto x) {
  std::stringstream ss;
  ss << x;
  return unit(ss.str());
};

auto bind = [](auto u) {    
  return [=](auto callback) {
   return callback(u());    
  };  
}; 
Then verify.
std::cout << "Left Identity "  
          << stringify(15)() 
          << "==" 
          << bind(unit(15))(stringify)()
          << std::endl;

std::cout << "Right Identity " 
          << stringify(5)() 
          << "=="
          << bind(stringify(5))(unit)()
          << std::endl;
Why are unit and bind special? Read more about them here.

Sunday, March 09, 2014

Fun with Lambdas: C++14 Style

I am presenting at the SF Bay Area Association of C/C++ Users (ACCU) meetup on Wed, Mar 12th. Topic: Fun with Lambdas: C++14 Style. Slides and the blog will be available here so stay tuned.