Skip to main content

Moving elements from STL containers and std::initializer_list


Lot has been said about effectively using move-semantics, such as return-by-value and pass sink arguments by-value to constructors (and move later). Dust seems to be settling on those issues because I see some evidence building up to support that. Not much has been said (I think) about using move-semantics for a collection of objects, however. I mean moving objects from one type of STL container to another and also moving elements from std::initializer_list.

Moving STL Containers of the same type

This one is simple. STL containers provide the necessary move operations: move-constructor and move-assign operator. So moving an entire vector<T> to another is pretty straight forward. But that is only one of several ways to construct a vector<T>. How about move constructing a std::vector from a vector with custom allocators? Or for that matter, from a std::list or a std::set?

Moving objects from one type of STL container to another

The classic way of copying objects from one type of STL container to another has been the iterator-pair idiom. All STL containers provide an iterator-pair constructor where you can pass the begin/end iterators of the source container. That will copy the objects from the source to the destination.

How about moving them?

As it turns out, our dear C++ standardization committee has already thought of this use-case and has put a solution in place: std::move_iterator. A move_iterator takes any iterator and adapts it such that the lvalue it is refereing to turns into a rvalue reference upon dereferencing. It basically applies std::move on each element in the entire range with a small difference I'll talk about later. Using std::move_iterator is very easy. There is also a helper function std::make_move_iterator.
int main(void)
{
  std::list<std::string> l(10, "C++ Truths");
  std::cout << l.size(); // prints 10
  std::vector<std::string> v(std::make_move_iterator(begin(l)), 
                             std::make_move_iterator(end(l)));
  std::cout << l.size(); // prints 10
  std::cout << v.size(); // prints 10
  
  // prints 10 empty strings separated by #. (i.e,  ##########)
  std::copy(begin(l), 
            end(l), 
            std::ostream_iterator<std::string>(std::cout, "#"); 

  // prints "C++ Truths" 10 times.
  std::copy(begin(v), 
            end(v), 
            std::ostream_iterator<std::string>(std::cout, " "); 
}
Not surprisingly, the size of the list remains unchanged after moving all the elements into the vector. After moving each string, the moved-from string objects are empty (strictly speaking, undefined but valid state). Therefore, when the list is printed again, we get 10 #s. The vector on the hand has the real strings and we see "C++ Truths" 10 times. So far so good. With std::move_iterator we can move elements in vectors, lists to other STL containers. How about std::set? Can we use std::move_iterator on std::set? As it turns out the answer is not that straight forward.

Moving objects from std::set to other STL container

The problem with std::set is that std::set has only const_iterators. I.e., set<T>::iterator and set<T>::const_iterator may not be different types at all. If they are different, you cannot modify the objects pointed to by set<T>::iterator. Why? If it were allowed, breaking std::set invariants will be way too easy. Note that std::set is an associative container and the elements are stored in sorted order. If modifications through set<T>::iterator were allowed, one could change the element in ways that breaks the sorted order. Idiomatic way to replace an element with the same key is to erase it and insert another immediately after that.
char alpha='A';
std::vector<std::string> alpha_vector(26);

// fill the vector with A,B,C,....X,Y,Z.
std::generate(begin(alpha_vector), 
              end(alpha_vector), 
              [alpha]() mutable { return alpha++; }); 

// move all the strings from the vector to the string
// no invariants broken here.
std::set<std::string> alpha_set(std::make_move_iterator(begin(alpha_vector)),
                                std::make_move_iterator(end(alpha_vector)));

alpha_vector.clear();
                                
// move each string back to the vector
for(const std::string & s : alpha_set) {
  alpha_vector.push_back(std::move(const_cast<std::string &>(s)));
}

// prints 26 #s. That means the set has 26 empty strings, which 
// by definitions breaks the invariant of uniqueness.
std::copy(begin(alpha_set),
          end(alpha_set),
          std::ostream_iterator<std::string>(std::cout, "#"));
Moving means modification (in most cases). So what happens when you use std::move_iterator with std::set::iterators? ... The program does not compile. I think this restriction is reasonable in general but too limiting in some scenarios. For instance, some class types have "key" members and other "data" members, which are not part of the key. If the "data" members are large and efficiently movable, it makes sense to copy the key members but move the data members. I can think of a couple of ways to achieve that. So here is some code that conditionally compiles to one or the other.
class Foo
{
  std::string key;
  std::vector<int> data;
public:
  Foo(std::string k, std::vector<int> d)
    : key(std::move(k)),
      data(std::move(d))
  {}
  Foo(const Foo &) = default; 
  Foo & operator = (const Foo &) = default;
#ifdef OPTION1
  Foo(Foo && f) 
    : key(f.key),             // copies key
      data(std::move(f.data)) // moves data
   {}
  Foo & operator = (Foo && f) {
    using std::swap;
    this->key = f.key;
    swap(this->data, f.data);
  }
#endif
#ifdef OPTION2
  Foo(Foo &&) = default;
  Foo & operator = (Foo &&) = default;
  Foo move() { // copy key and move data.
    return { key, std::move(data) };
  }
#endif  
};
Option #1 above has a custom implementation of the move constructor. It copies the key and moves data. The move-assignment operator is also does the same. Both of them leave the "key" part of the rvalue untouched. This allows the Foo objects to be moved out of a std::set without breaking set invariants. If you think the Foo move constructor should rather be a "strict" move (i.e., it must scoop up everything it can) Option #2 might work for you.

In Option #2, the move-constructor and move-assignment operator are default but there is an additional member function called move. I could have also called it move_from_set. The member move function moves *this object into an rvalue leaving behind the key. The data part of *this is moved out. Note that simply returning a Foo rvalue reference won't be right. That is because in option#2 the move-ctor and move-assign are default and they will scoop up everything they can. We don't want that. Lets see how to use these two options when moving members out from a std::set.
std::set<Foo> foo_set;
foo_set.insert(...);
std::vector<Foo> foo_vector;
foo_vector.reserve(foo_set.size());

for(auto iter = begin(foo_set); iter != end(foo_set);)
{
#ifdef OPTION1
  foo_vector.push_back(std::move(const_cast<Foo &>(*iter)));
#else
  foo_vector.push_back(const_cast<Foo &>(*iter).move());
#endif

  foo_set.erase(iter++);
}

The above code is still quite wordy. Mostly because we are not able to use the move_iterators. move_itrators don't work with const iterators. That is because moving from const lvalues is conceptually wrong. I was quite tempted to create const_move_iterator, which simply disregard the const qualifier and make a T&& from a const T &. But I prefer somewhat wordy const_cast because it is explicit and it is easier to search that way.

Note that I'm erasing the set elements as soon as I move them into the vector. I've to make sure that set iterators are no invalidated as a result of calling erase. Therefore, I call iter++, which moves the iterator to the next position before the element is erased from the set. This is one of the times when I prefer calling post-fix increment instead of prefix.

Moving elements from std::initializer_list

Moving from std::initializer_list gets even more interesting because std::initiliazer_list provides only const iterators. As it turns out The in<T> idiom can be used to determine lvalue/rvalue at run-time and make a decision whether to move or not based on that. Here is an example of how to use it.
std::vector<std::string> 
attempt_move(std::initializer_list<in<std::string>> list) 
{
  std::vector<std::string> result;
  result.reserve(list.size());
  for(const in<std::string> & s : list)
  {
     if(s.rvalue()) {
        // rvalue string. Just move. 
        // in<T>.rget() returns T&&
        result.push_back(s.rget());
     }
     else {
        // lvalue string. Make a copy
        // in<T>.get() returns const T &.
        result.emplace_back(s.get());
     }
  }
  return result;
}

int main(int argc, char *argv[])
{
  if(argc >= 4) 
  {
     std::string s0(argv[0]);
     std::string s1(argv[1]);
     attempt_move({ s0, std::move(s1), argv[2], argv[3] }); 
  }
}

Comments

Unknown said…
on the last example, shouldn't it be

s = attempt_move({ argv[1], argv[2], argv[3] });


instead?
Unknown said…
Let me simply answer your question "Unknown": No, it isn't s = attempt_move({argv[1], argv[2], argv[3]});
Reason: Variable "s" is a string but what's the return type of "attempt_move" --std::vector
ciao
Anonymous said…
I really appreciated with your content and completely agree with you, here also I was visit a company website Digital zona.
Anonymous said…
Fact!!! Thanks for sharing complete point…one more company has articles about Webdesign Agenter website
golden slot mobile
gclub
Anonymous said…
Good article but don't assume a string is empty after moving it. You got it right with "...undefined but valid state" but that doesn't mean empty (necessarily). "Undefined" is probably a poor choice of words however (given what it normally means in C++). The standard says "valid but unspecified state".

Popular Content

Unit Testing C++ Templates and Mock Injection Using Traits

Unit testing your template code comes up from time to time. (You test your templates, right?) Some templates are easy to test. No others. Sometimes it's not clear how to about injecting mock code into the template code that's under test. I've seen several reasons why code injection becomes challenging. Here I've outlined some examples below with roughly increasing code injection difficulty. Template accepts a type argument and an object of the same type by reference in constructor Template accepts a type argument. Makes a copy of the constructor argument or simply does not take one Template accepts a type argument and instantiates multiple interrelated templates without virtual functions Lets start with the easy ones. Template accepts a type argument and an object of the same type by reference in constructor This one appears straight-forward because the unit test simply instantiates the template under test with a mock type. Some assertion might be tested in

Covariance and Contravariance in C++ Standard Library

Covariance and Contravariance are concepts that come up often as you go deeper into generic programming. While designing a language that supports parametric polymorphism (e.g., templates in C++, generics in Java, C#), the language designer has a choice between Invariance, Covariance, and Contravariance when dealing with generic types. C++'s choice is "invariance". Let's look at an example. struct Vehicle {}; struct Car : Vehicle {}; std::vector<Vehicle *> vehicles; std::vector<Car *> cars; vehicles = cars; // Does not compile The above program does not compile because C++ templates are invariant. Of course, each time a C++ template is instantiated, the compiler creates a brand new type that uniquely represents that instantiation. Any other type to the same template creates another unique type that has nothing to do with the earlier one. Any two unrelated user-defined types in C++ can't be assigned to each-other by default. You have to provide a

Multi-dimensional arrays in C++11

What new can be said about multi-dimensional arrays in C++? As it turns out, quite a bit! With the advent of C++11, we get new standard library class std::array. We also get new language features, such as template aliases and variadic templates. So I'll talk about interesting ways in which they come together. It all started with a simple question of how to define a multi-dimensional std::array. It is a great example of deceptively simple things. Are the following the two arrays identical except that one is native and the other one is std::array? int native[3][4]; std::array<std::array<int, 3>, 4> arr; No! They are not. In fact, arr is more like an int[4][3]. Note the difference in the array subscripts. The native array is an array of 3 elements where every element is itself an array of 4 integers. 3 rows and 4 columns. If you want a std::array with the same layout, what you really need is: std::array<std::array<int, 4>, 3> arr; That's quite annoying for