Sunday, June 03, 2012

Perfect Forwarding of Parameter Groups in C++11

C++11 offers a great new feature called Perfect Forwarding, which is enabled via the use of rvalue references. Perfect forwarding allows a function template to pass its arguments through to another function while retaining the original lvalue/rvalue nature of the function arguments. It avoids unnecessary copying and avoids the programmer having to write multiple overloads for different combinations of lvalue and rvalue references. A class can use perfect forwarding with variadic templates to "export" all possible constructors of a member object at the parent's level.
class Blob
{
  std::vector<std::string> _v;
public:

  template<typename... Args>
  Blob(Args&&... args) 
   : _v(std::forward<Args>(args)...)
  {  }

};

int main(void)
{
  const char * shapes[3] = { "Circle", "Triangle", "Square" };

  Blob b1(5, "C++ Truths");  
  Blob b2(shapes, shapes+3);
}
The Blob class above contains a std::vector, which has many different constructors. Using a perfect forwarding constructor, the Blob class allows its clients to pass variable number of parameters that different constructors of std::vector would use. For example, object b1 uses std::vector's constructor that takes a count and an initial value where as object b2 uses the constructor that takes an iterator pair. Also note that std::forward allows us to perfect-forward the parameters.

[Note: Strictly speaking, this specific example does not require perfect forwarding because Blob constructor could take a std::vector by value and move it into the member vector object. But lets carry on!]

In practice, however, you would encounter classes that would have several members. Naturally, you may want to "export" their different constructors at its parent's level. This is particularly true if object construction is heavy and does not support efficient move. The motivation behind this is no different than that of the emplace operations on containers. Just like the standard emplace operations, our perfect forwarding constructor allows us to construct an object in-place. We don't even pay the cost of a move (which is often negligible for types such as std::vector). An interesting problem arises when we want to perfect forward arguments to more than one constructor. How do we decide which parameters are for which object? For example, the following Blob class has a vector of strings and a list of doubles. How do we group the parameters to the respective constructors? How do we know the programmer's intent?
class Blob
{
  std::vector<std::string> _v;
  std::list<double> _l;
public:

  template<typename... Args>
  Blob(Args&&... args) 
   : _v(???)
     _l(???)
  {  }
};

int main(void)
{
   Blob b3(5, 10, 10.0); // Three parameters and two constructors. How do we group?
}
Fortunately there is a way out and a very interesting one.

First of all, we've to group the parameters. An obvious candidate comes to mind: std::tuple! We could ask programmers to group the parameters in a sequence of tuples and each parameter grouped in the tuples are passed to the respective constructors.
 
Blob b3(std::make_tuple(5), std::make_tuple(10, 99.99)); 
 
The intent of the programmer is pretty clear in the above groupings. The vector will have 5 empty strings and the list will have 10 doubles each initialized to value = 99.99. However, we've a new problem now. std::vector and std::list do not accept std::tuple as parameters. We've to ungroup them before calling their constructors. This is far from trivial but it also makes the whole thing very interesting.

Note that retrieving values from a std::tuple requires calling std::get<i> where i is a compile-time constant that varies from 0 to the tuple_length-1. Somehow, we've to call std::get on each tuple with increasing values of i. To do that, we've to create a compile-time list of tuple indices. Here is a way to do it. Thanks sigidagi!
template<unsigned...> struct index_tuple{};

template<unsigned I, typename IndexTuple, typename... Types>
struct make_indices_impl;

template<unsigned I, unsigned... Indices, typename T, typename... Types>
struct make_indices_impl<I, index_tuple<Indices...>, T, Types...>
{
  typedef typename 
    make_indices_impl<I + 1, 
                      index_tuple<Indices..., I>, 
                      Types...>::type type;
};

template<unsigned I, unsigned... Indices>
struct make_indices_impl<I, index_tuple<Indices...> >
{
  typedef index_tuple<Indices...> type;
};

template<typename... Types>
struct make_indices 
  : make_indices_impl<0, index_tuple<>, Types...>
{};
make_indices is a collection of recursive meta-fuctions that compute a list of indices given a tuple type. For example, if you have (42, true, 1.2), which is tuple<int, bool, double>, make_indices<int,bool,double>::type gives index_tuple<0, 1, 2>. Here is how to use make_indices in a simple program.
template <unsigned... Indices, class... Args, class Ret>
Ret forward_impl(index_tuple<Indices...>,
                 std::tuple<Args...> tuple,
                 Ret (*fptr) (Args...))
{
  return fptr(std::get<Indices>(tuple)...);
}

template<class... Args, class Ret>
Ret forward(std::tuple<Args...> tuple, Ret (*fptr) (Args...))
{
   typedef typename make_indices<Args...>::type Indices;
   return forward_impl(Indices(), tuple, fptr);
}

int myfunc(int i, bool, double)
{
  return 5 + i;
}

int main()
{
  std::cout << forward(std::make_tuple(42, true, 1.2), myfunc) << std::endl;
}
Line #6 above is the place where the actual function is called where the list of indices and the tuple come together and two parameter packs are expanded in lockstep to yield the complete list of parameters. Note that we're not using perfect forwarding in this case. Moreover, tuple is also copied by value. That's fine for scalar builtin types like int, bool, and double. For large types, however, unnecessary copies may be created. The above program is simplified for the purpose of illustration.

Back to class Blob

The make_indices utility is pretty clever. But we're not done yet. We've to achieve the same effect while calling the member constructors from the initialization list. We've to compute two lists of indices (one for vector and one for list) and expand them with the respective tuples. The question is how do we compute the list of indices before calling member constructors?

Delegated constructors come to rescue!

class Blob
{
  std::vector<std::string> _v;
  std::list<double> _l;

public:

  template <typename... Args1,
            typename... Args2>
  Blob(std::tuple<Args1...> tuple1,
       std::tuple<Args2...> tuple2)
  : Blob(tuple1,
         tuple2,
         typename make_indices<Args1...>::type(),
         typename make_indices<Args2...>::type())
  {}

private:
  template <typename... Args1,
            typename... Args2,
            unsigned... Indices1,
            unsigned... Indices2>
  Blob(std::tuple<Args1...> tuple1,
       std::tuple<Args2...> tuple2,
       index_tuple<Indices1...>,
       index_tuple<Indices2...>)
    : _v(std::forward<Args1>(std::get<Indices1>(tuple1))...),
      _l(std::forward<Args2>(std::get<Indices2>(tuple2))...)
  { }
};
The new Blob class has a public constructor that delegates to a private constructor that expects not only the tuples but also the list of indices. The private constructor is not only templatized on the tuple arguments but also on the list of indices. Once we've the tuples and lists of indices together, passing them to the member constructors is pretty straight forward. We simply expand the two variadic lists in unison.

Astute readers will likely notice that the Blob class is no longer using perfect forwarding because it accepts two tuples by value as opposed to an argument list in a perfect forwarding fashion shown at the beginning. That's is on purpose. And it is not any less efficient either! Even for large types! How's that possible?

Well, who says copying a tuple by value means copying its original parameters by value? C++ standard library provides a very convenient helper called forward_as_tuple, which perfect-forwards its parameters types as tuple members. It constructs a tuple object with rvalue references to the elements in arguments suitable to be forwarded as argument to a function. That is, the tuple member types are T&& and copying references is blazing fast. So we can afford to pass the tuples by value. Alternatively, we could use rvalue references to the tuples because in this case they are always (!) created by calling std::forward_as_tuple, which returns a temporary tuple. Here is how we use our final Blob class.

I compiled this program on g++ 4.8. Clang 3.2 with the latest libcxx did not work for me.
int main(void) 
{
  Blob b3(std::forward_as_tuple(5, "C++ Truths"),
          std::forward_as_tuple(10, 99.99));
  // b3._v has 5 strings initialized to "C++ Truths" and
  // b3._l has 10 doubles initialized to 99.99

  Blob b4(std::forward_as_tuple(5),
          std::forward_as_tuple(10, 99.99));
  // b4._v has 5 empty strings and
  // b4._l has 10 doubles initialized to 99.99

  Blob b5(std::forward_as_tuple(),
          std::forward_as_tuple(10, 99.99));
  // b5._v is an empty vector 
  // b5._l has 10 doubles initialized to 99.99
}

Using std::piecewise_construct

The Blob class looks fine and dandy so far. But the use of forward_as_tuple looks somewhat weird and does not say what it is for: in-place construction of a Blob object from its pieces. So in the final installment of the Blob class, we say what we mean. We just decorate the class with a standard (!) dummy class called std::piecewise_construct_t. The public constructor of the Blob is modified like below. We use std::piecewise_construct, a predefined object of std::piecewise_construct_t, at the call site where we create Blob objects.
#include <utility>
// ...
  template <typename... Args1,
            typename... Args2>
  Blob(std::piecewise_construct_t,
       std::tuple<Args1...> tuple1,
       std::tuple<Args2...> tuple2)
  : Blob(tuple1,
         tuple2,
         typename make_indices<Args1...>::type(),
         typename make_indices<Args2...>::type())
  {}

//...

Blob b3(std::piecewise_construct,
        std::forward_as_tuple(5, "C++ Truths"),
        std::forward_as_tuple(10, 99.99));
Obviously, the C++ library working group anticipated such situations. But are there use cases in the standard library that require the piecewise_construct idiom? As it turns out, the std::map and std::unordered_map face a very similar issue while using emplace operations. Note that std::map and std::underorder_map use a std::pair as their value_type. The pair's .first and .second members must be constructed from a list of perfectly forwarded parameters. Obviously, the question arises what parameters go where. To solve this issue, the parameters of each member's constructor must be wrapped as a tuple and indicate so using std::piecewise_construct so that the right pair constructor (#6) can be invoked.

So on an ending note, perfect forwarding is, well, not perfect; but gets the job done with some help from the programmers!

Comments? Feedback? Post them below in the comments section or check out the previous comments on Reddit.