Monday, September 16, 2013

21 Ways of Passing Parameters ... Plus One!

You probably already know that there are 21 ways of passing parameters to a C++11 function! In case you lost the count somewhere, lets count one more time: 4 variations of pass-by pointer (i.e., const, volatile, const volatile, and none i.e., no cv-qualifiers), 4 variations of pass-by reference (ditto here), 4 variations of pass-by value (ditto here too), 4 ways of passing an argument as an rvalue reference (ditto here too! But seriously, you don't want to be around people who write cv-qualified pass-by rvalue reference arguments), std::initializer_list (ditto here too! and at this point it is ok to browse away from this page), and finally, pass-by universal reference (more formally, perfect forwarding ... T&& ... I think universal references deserve their own spot in this overcrowded VIP lounge because they are really special and they don't have 4 variations). So that makes it 21.

So you are still with me, hmm... You seem to tolerate me. So I'm going to submit that we need one more! Yes, 22nd way of passing parameters. And if you think the 22nd way should be cv-qualified too ... shame on you!

The Goal

Very simple: Move elements from a std::initializer_list. std::initializer_list provides a very convenient syntax to pass statically fixed number of arguments of the same type to a constructor/function. They are light-weight and easy to use. So just like regular arguments, I would like to steal the guts of the original argument when it is an rvalue.

However, there's catch! The std::initialization_list iterators always point to const elements. Why? Because the compiler could use the static read-only store to construct the elements of the initializer_list. However, I think, most objects will be an exception to that because std::initializer_list could be created using runtime variables and not just literals. For example,  { i, j, k} cannot be stored in a read-only store because i, j, and k are not known till run-time. In fact, the standard says, and I quote: "[ Note: The implementation is free to allocate the array in read-only memory if an explicit array with the same initializer could be so allocated. —end note ]". I think the note says it all. Also, according to cppreference: "The storage for std::initializer_list is unspecified (i.e. it could be automatic, temporary, or static read-only memory, depending on the situation)."

All right, so std::initializer_list iterators do not allow us to move objects out even though we could say with high confidence that the underlying array of the std::initializer_list has automatic storage lifespan. That's the problem.

Now, not all objects that are part of an initializer_list are rvalues, they could also be lvalues and const lvalues. So we need a way to carry the lvalue/rvalue-ness of the original object though the std::initializer_list. Let me give an example of what I mean.
void attempt_move(std::initializer_list<std::string> list) {
  // Copy lvalues and move rvalues in the list
}
int main(int argc, char *argv[])
{
  std::string s(argv[0]);
  if(argc >= 4)  
    attempt_move({ s, argv[1], argv[2], argv[3] }); 

  {
    // The underlying array initialization is roughly equivalent to
    std::string __list[4] = { s, 
                              std::string{argv[1]}, 
                              std::string{argv[2]}, 
                              std::string{argv[3]} };
    attempt_move(std::initializer_list<std::string>(__list, __list + 4));
    // __list[0] is an lvalue and others are rvalues
  }
}
The initializer_list of the attempt_move function contains both lvalues and rvalues. Therefore, we can't simply const_cast and move initializer_list iterators to get to the underlying std::string objects. It is dangerous because you might move an lvalue. In general, it is impossible to know which element is rvalue and which is not.

Or is it?

I think we turn out to be a bit lucky here. I've seen just the type we need here. So here I am shamelessly reproducing the original code.

The in<T> idiom
template <typename T> class in;
template <typename T> struct is_in { 
  static const bool value = false;
};
template <typename T> struct is_in<in<T>> { 
  static const bool value = true;
};
template <typename T> struct is_in<const in<T>> { 
  static const bool value = true;
};

template <typename T> 
class in
{
public:
  in (const T& l): v_ (l), rv_ (false) {}
  in (T&& r): v_ (r), rv_ (true) {}

  // Support for implicit conversion via perfect forwarding.
  //
  struct storage
  {
    storage (): created (false) {}
    ~storage () {
      if (created) reinterpret_cast<T*> (&data)->~T ();
    }

    bool created;
    typename std::aligned_storage<sizeof(T), alignof(T)>::type data;
  };

  template <typename T1,
            typename std::enable_if<
              std::is_convertible<T1, T>::value &&
              !is_in<typename std::remove_reference<T1>::type>::value,
              int>::type = 0>
  in (T1&& x, storage&& s = storage ())
      : v_ (*new (&s.data) T (std::forward<T1>(x))), 
        rv_ (true) 
  {s.created = true;}

  in (T& l): v_ (l), rv_ (false) {} // For T1&& becoming T1&.

  // Accessors.
  //
  bool lvalue () const {return !rv_;}
  bool rvalue () const {return rv_;}

  operator const T& () const {return v_;}
  const T& get () const {return v_;}
  T&& rget () const {return std::move (const_cast<T&> (v_));}

  // Return a copy if lvalue.
  //
  T move () const
  {
    if (rv_)
      return rget ();
    else
      return v_;
  }

private:
  const T& v_;
  bool rv_;
};

Except for the middle section of the in<T> class above, rest of the in class is pretty straight forward. It maintains state regarding how it was initialized. Three constructors in(const T&), in(T&&), and in(T &) create a reference to the original object and also maintain a boolean value to remember lvalue/rvalue-ness. in(T&&) is called only when the parameter is an rvalue and other constructors are called depending upon whether the passed argument is a const lvalue or non-const lvalue. We don't distinguish between the two because both cases are lvalues.

Lets see the in<T> idiom in action!
std::vector<std::string> 
attempt_move(std::initializer_list<in<std::string>> list) 
{
  std::vector<std::string> result;
  result.reserve(list.size());
  for(const in<std::string> & s : list)
  {
     if(s.rvalue()) {
        // rvalue string. Just move. 
        // in<T>.rget() returns T&&
        result.push_back(s.rget());
     }
     else {
        // lvalue string. Make a copy
        // in<T>.get() returns const T &.
        result.emplace_back(s.get());
     }
  }
  return result;
}

int main(int argc, char *argv[])
{
  std::string s(argv[0]);
  if(argc >= 4)  
    attempt_move({ s, argv[1], argv[2], argv[3] }); 
}

Using in<T> is simple. You simply ask if the object it is referring to is an lvalue or rvalue and take the appropriate action. That's what attempt_move is doing in a loop. When the referred object is an rvalue, in.rget() returns an rvalue reference, which can be readily passed to any function that is move-aware. For instance, vector.push_back.

Note that the loop variable s is a reference to const in<string>. It is required to be that way. The range-based for loop in attempt_move function is basically using the intializer_list iterators. They always point to const T. The reason is that compilers may use static read-only memory for storing the underlying array of an initializer_list.

In my view, const iterators of initializer_list are too limiting. The underlying in<std::string> array cannot possibly live in a read-only store because, AFAIK, read-only store must be initialized before main and the argv parameters are not known at that time. In fact, the C++11 standard says: "[ Note: The implementation is free to allocate the array in read-only memory if an explicit array with the same initializer could be so allocated. —end note ]". Seems like this is one case where the "explicit array with the same initializer" cannot be allocated read-only. So it probably has automatic storage.

It seems like because of the const iterators of std::initializer_list there are lost opportunities to move the std::string objects.

However, in<T> saves the day! We never modify the in<T> objects. We only query them. All member functions of in<T> are const.

Details

The middle section of in<T> is somewhat nifty. In fact, the earlier program would not compile without it. My discussion about it is going to be brief.

in<T> allows seamless conversion of c-strings and string literals to in<std::string>. It allows such a conversion only if the underlying type--std::string in this case--allows it. It creates an aligned storage and uses placement new to create the rvalue of T. Note that when an in<T> object is created while calling a function (that is the only use of in<T>), the in<T>::storage rvalue created in the constructor lives as long as the in<T> object lives. To be precise, "a temporary bound to a reference parameter in a function call persists until the completion of the full-expression containing the call." Ordinarily, if the in<T> object is created outside a function calling context, the temporary object (i.e., the in<T>::storage object) will be destroyed right after the constructor returns. That will leave the in<T> object in a dangling state, which will most likely lead to undefined behavior or program crash sometime later. Extending lifetimes in function-call context is a safety feature of C++ object lifetimes because the function almost always makes use of the temporary object. In fact, the lifetime is extended till end of the entire expression that contains the function call. That's important because nested function calls may pass a reference to the original temporary instance as return value to other function.

More Uses

The in<T> has more uses beyond just the std::initializer_list. It could be used directly to efficiently pass lvalue/rvalue arguments to functions and query them at run-time from within the function body. You don't have tilt your head to figure the most optimal strategy to pass arguments to a function: rvalue references, const lvalue references, or pass-by value. This a rather surprising side effect. Lets see some code.
struct matrix {
  matrix () {}
  matrix (matrix const&) {cout << "copy-ctor, ";}
  matrix (matrix&&) {cout << "move-ctor, ";}
  matrix & operator = (matrix const &) { cout << "copy-assign, "; return *this; }
  matrix & operator = (matrix &&) { cout << "move-assign, "; return *this; }

  matrix& operator+= (const matrix &m) {
    return *this;
  }
  std::vector<int> data;
};

#ifdef IN
inline matrix operator+ (in<matrix> x, in<matrix> y) {
  return std::move (
    x.rvalue () ? x.rget () += y :
    y.rvalue () ? y.rget () += x :
    matrix (x) += y);
}
#else
inline matrix operator+ (matrix result, matrix const& y) {
  result += y;
  return result;
}
#endif

int main()
{
    matrix s1;
    matrix s2;
    matrix m1 (matrix() + s1);        cout << endl;// (1)
    matrix m2 (matrix() + matrix());  cout << endl;// (2)
    matrix m3 (s1 + s2);              cout << endl;// (3) 
    matrix m4 (s1 + matrix());        cout << endl;// (4)
}
The code snippet above shows two implementations of operator+(). Both are extremely optimized implementations in their own right. The in<matrix> version simply uses pass-by-value and determines whether the parameter is rvalue or not at runtime. The second version is pretty nifty too. It takes the first matrix parameter by value because it has to create one copy anyhow. I found that in<matrix> version is about 1-2% faster than the one below. Hardly any difference! That's because in<T> is better in some respects but not all. To be precise, case #4 in main is optimized better using in<matrix> because it detects the right hand side rvalue. In other cases it is either equivalent or only slightly inferior. After all, the in<matrix> version has two conditionals. Not to mention the overall implementation looks more complex.

I think the opreator+() is not the best example for in<T>. It shines best when there are more parameters. With increasing number of parameters you have to resort to either using the universal references or exponential number of versions of your function--each optimized for a unique combination of arguments. In such cases in<T> may be a quick way out instead of worrying about using the 21 other ways of passing arguments. Having said that, I will encourage every self-respecting programmer to read the guidelines given by the original author of the in<T> idiom. I brought this idiom up once again because I thought it solves the problem of moving rvalues through a std::initializer_list quite elegantly. So that makes it the 22nd alternative!

8 comments:

Jared Grubb said...

There's one way to pass by value, not four. The 'const' and 'volatile' qualifiers dont mean anything on a plain value. If you dont believe me: "struct Foo { void bar(int); }; void Foo::bar(const volatile int) {}" compiles just fine! The cv-qualifier applies to how the local variable inside the function behaves, but has nothing to do with how the value gets passed *TO* the function. So, I dont think these count as four ways to pass-by-value... maybe four ways to receive-by-value :)

Sumant Tambe said...

interesting!!

Marek Knápek said...

inline matrix operator+(/*blah*/ x, /*blah*/ y) {
return std::move(/*blah*/);
}


seems wrong to me, I think return by value is sufficient, no need to "return by move".

Anonymous said...

Interesting

pal said...

you are wrong on lifetime of constructor args:

[23:33:17 pal@localhost ~/tmp/1]$ cat a.cpp
#include
class A {
public:
A() {
std :: cout << "A()\n";
}
~A(){
std :: cout << "~A()\n";
}
};
class B {
public:
B(A&&a=A()) {
std :: cout << "B()\n";
}
~B(){
std :: cout << "~B()\n";
}
};
int main ( ) {
B b;
std :: cout << "main\n";
}

[23:33:33 pal@localhost ~/tmp/1]$ g++ -std=c++11 a.cpp -o a && ./a
A()
B()
~A()
main
~B()

Sumant Tambe said...

@pal I agree with your comment. My description regarding lifetime extension was misleading. I've updated it. Please take a look.

Lifetime extension of temporary objects happen only in function call context. If you create an ordinary object on stack like you do in main, the rule does not apply and the temporary is destroyed right away. However, in the following example the lifetime of the temporary is extended till the end of function zoo.

Modify the B constructor like this:
B(int i, A&&a=A()) { ... }

int foo(B) { return 10; }
void zoo(int) { }
int main() { zoo(foo(1);) }

Output:
A()
B()
~B()
~A()

pal said...

i have a different explanation. lifetime extension happens till the end of life of bound reference (a arg) and reference only lives till the end of block, in our case - function body, be it foo(), or B:B(). but it doesn't apply here, because temporary is created outside of function body, at the callsite and it lives like every other till the end of expression. your expression is zoo(foo(1)), my expression is construction of b.
i.e. if you save your initialization_list to variable, and then pass it to function, your code will break, but it is fine as it is

Schlüsseldienst in Köln said...

Thanks a lot :) It was very helpful :)