Skip to main content

Array-like access and Iterators for Homogeneous Tuples

Question often comes up whether tuples can have traditional iterators? In general, the answer is: No! They cannot. That's because tuples typically contain different types and traditional iterators, which are modeled after pointers, can not dereference to objects of multiple types. However, homogeneous tuples can have iterators. So I thought it would be a fun exercise to write one. I wonder why one would use a homogeneous tuples instead of just plain arrays. But lets do it anyways because we can.

Although this exercise sounds rather naive and unnecessary, I stumbled upon two very interesting topics along the way.

  • A need for new iterator concepts to separate the notions of element access from iterator traversal. Yes! iterators for homogeneous tuples can't be modeled accurately using conventional iterator categories. Don't believe me? Please read on...

  • How inherited constructors may be simulated on compilers that don't support them today.


From this point onward a tuple is assumed to be a homogeneous tuple. First thing we need is a way of accessing tuple's contents using a run-time integer instead of a compile-time integer. We need an adapter that uses a run-time integer index to return the n-th element in a homogeneous tuple. If n is larger than the bounds of the tuple, the adapter will throw a std::out_of_range exception.

template <typename Tuple,
size_t I = std::tuple_size<Tuple>::value-1>
class TupleAt
{
typedef typename std::tuple_element<0, Tuple>::type T;

public:
static T & get(Tuple & tuple, size_t index)
{
return (index == I)? std::get<I>(tuple) : TupleAt<Tuple, I-1>::get(tuple, index);
}
};

template <typename Tuple>
class TupleAt<Tuple, 0>
{
typedef typename std::tuple_element<0, Tuple>::type T;
public:

static T & get(Tuple & tuple, size_t index)
{
if(index == 0)
return std::get<0>(tuple);
else
throw std::out_of_range("Tuple iterator dereferenced out of valid range.");
}
};

The TupleAt template takes the type of the tuple and its size as template parameters. It assumes that the type of the first element is also the type of the rest of the elements in the tuple (i.e. a homogeneous tuple). TupleAt::get function returns a reference to the n-th element in the tuple. It does that using repeated comparisons from the size of the tuple (std::tuple_size<Tuple>::value) down to zero. TupleAt is specialized for zero so that the recursion ends. If the index does not fall in the expected range, std::out_of_range exception is thrown.

Note that this access pattern is linear in complexity. For a tuple of N elements, it may take up to N comparisons to return the right element.

Array-like access to Homogeneous Tuples



I created a tuple_array class to provide array-like access to the elements of the tuple. It uses TupleAt internally.

template <typename... T>
class tuple_array : public std::tuple<T...>
{
typedef std::tuple<T...> Tuple;
typedef typename std::tuple_element<0, Tuple>::type HeadType;
enum { TUPLE_SIZE = std::tuple_size<Tuple>::value };

HeadType * ref_;
size_t last_;

public:
USING(tuple_array, Tuple)
{
ref_ = & TupleAt<Tuple>::get(*this, TUPLE_SIZE-1);
last_ = TUPLE_SIZE-1;
}

HeadType & operator [] (size_t index)
{
if(last_ != index)
{
ref_ = & TupleAt<Tuple>::get(*this, index);
last_ = index;
}
return *ref_;
}
};


Class tuple_array inherits from std::tuple and just provides operator [] function. It always returns a reference to HeadType typedef, which is the type of the first element in the tuple. To improve efficiency, the tuple_array class caches a pointer to the last dereferenced index in the tuple. std::tuple has a zillion constructors to create a tuple. To avoid repeating the constructors in the derived tuple_array class, I wanted to use inherited constructors. However, g++ 4.7 does not support it at this moment. So I'm using a variadic template constructor to mimic the behavior of inherited constructors.

#define USING(Dervied, Base)                                       \
template<typename ...Args, \
typename = typename std::enable_if \
< \
std::is_constructible<Base, Args...>::value \
>::type> \
Dervied(Args &&...args) \
: Base(std::forward<Args>(args)...) \


The inherited constructor trick is captured in a macro, which I stole shamelessly from here. The USING macro defines a variadic template constructor and forwards all the arguments to the underlying base constructor. To avoid being overly greedy, it enables instantiation only if the base is constructible from the given parameters. std::is_constructible<Base, Args...>::value provides a neat way of checking that at compile-time.

Finally, we need a simple function to create the tuple_array. Function make_tuple_array is a factory function to create tuple_arrays from a list of arguments. Note how it uses the uniform initialization syntax without specifying the actual type. Using make_tuple_array is just like an array. However, note that element access is linear and not constant-time.

template <typename... T>
tuple_array<T...> make_tuple_array(T... args)
{
return { std::forward< T&& >(args)... };
}

int main(void)
{
auto ta = make_tuple_array(20, 30, 40);
printf("%d %d %d", ta[0], ta[1], ta[2]); // prints 20 30 40
}


Iterators for Homogeneous Tuples



Now lets turn our attention to iterators.

What category would an iterator for homogeneous tuple belong? Random access? Bidirectional? It appears to me that the homogeneous tuple iterator could simply use an internal index to remember what position it is at and use the TupleAt::get to return the element when dereferenced. Arbitrary arithmetic can be performed in constant-time on the internal index to move the iterator. This indicates that the iterator is a random access iterator.

However, the dereference function is not constant-time as discussed earlier. As a result, it is not a random access iterator. Clearly, traversal is random access but element access is not. Existing iterator categories do not support this distinction. The standard random access iterator [5] requires all operations to be amortized constant time.

What we really need is a way to distinguish between the categories of element access and the categories of traversal. This is precisely the point of new iterator concepts in boost.

For now, we'll just consider that the iterator for homogeneous tuple is a random access iterator. Here is how it looks like with a lot of boilerplate overloaded operators.

template <typename Tuple>
class tuple_iterator
: public std::iterator<std::random_access_iterator_tag,
typename std::tuple_element<0, Tuple>::type>
{
typedef typename std::tuple_element<0, Tuple>::type T;
enum { TUPLE_SIZE = std::tuple_size<Tuple>::value };

Tuple * tuple;
int current_;
int last_;
T * ref_;

T * update_ref()
{
if(current_ != last_)
{
ref_ = & TupleAt<Tuple>::get(*tuple, current_);
last_ = current_;
}
return ref_;
}

public:

typedef int difference_type;

explicit tuple_iterator(Tuple & t, size_t i = TUPLE_SIZE)
: tuple(&t),
current_(i),
last_(-9999),
ref_(nullptr)
{}
T & operator *() {
return *update_ref();
}
T * operator ->() {
return update_ref();
}
T & operator [] (int offset) {
return TupleAt<Tuple>::get(*tuple, current_+offset);
}
tuple_iterator & operator ++ () {
if(current_ < TUPLE_SIZE)
++current_;
return *this;
}
tuple_iterator operator ++ (int) {
tuple_iterator temp(*this);
++(*this);
return temp;
}
tuple_iterator & operator -- () {
if(current_ >= 0)
--current_;
return *this;
}
tuple_iterator operator -- (int) {
tuple_iterator temp(*this);
--(*this);
return temp;
}
tuple_iterator operator - (int i) const {
tuple_iterator temp(*tuple, current_-i);
return temp;
}
tuple_iterator & operator -= (int i) {
current_-=i;
return *this;
}
tuple_iterator operator + (int i) const {
tuple_iterator temp(*tuple, current_+i);
return temp;
}
tuple_iterator & operator += (int i) {
current_+=i;
return *this;
}
difference_type operator - (const tuple_iterator & ti) const {
return current_ - ti.current_;
}
bool operator < (const tuple_iterator &ti) const {
return index < ti.index;
}
bool operator > (const tuple_iterator &ti) const {
return index > ti.index;
}
bool operator <= (const tuple_iterator &ti) const {
return index <= ti.index;
}
bool operator >= (const tuple_iterator &ti) const {
return index >= ti.index;
}
bool operator == (tuple_iterator const & ti) const {
return (tuple == ti.tuple) && (index == ti.index);
}
bool operator != (tuple_iterator const & ti) const {
return !(*this == ti);
}
};

template <>
class tuple_iterator <std::tuple<>>
{
public:
tuple_iterator(std::tuple<>, size_t i = 0) {}
};


The tuple_iterator class provides the usual typedefs (e.g., difference_type, value_type, pointer, reference, and iterator_category) and overloaded operators (e.g., *, ->, [], +, -, +=, -=, -, +, <, >, <=, >=, ==, !=) to support the requirements of random access iterator. Just like tuple_array class it caches a pointer to the last element that was dereferenced. It goes through O(N) comparisons only if the tuple iterator is dereferenced at a different index than what is cached. A specialization of tuple_iterator for empty tuple is also provided. It has no members other than a constructor because there is nothing to dereference to!

Finally, we need a way to create the begin and end iterator from a non-empty tuple. We add the corresponding functions.

template <typename... Args>
tuple_iterator <std::tuple<Args...>> begin(std::tuple <Args...> &t)
{
return tuple_iterator <std::tuple<Args...>>(t, 0);
}

template <typename... Args>
tuple_iterator <std::tuple<Args...>> end(std::tuple <Args...> &t)
{
return tuple_iterator <std::tuple<Args...>>(t);
}


If no index is passed to the iterator constructor, it points to the end of the tuple. The internal index of such an iterator is same as the size of the tuple. An iterator at the beginning has index = 0 -- the first element. Using the iterators is now straightforward. I do not discuss constant and reverse iterators here.

int main(void)
{
auto tuple = std::make_tuple(10, 20, 30, 40);
auto ta = make_tuple_array(4, 2, 1);

std::copy(begin(tuple), end(tuple), std::ostream_iterator<int>(std::cout, " "));
std::sort(begin(ta), end(ta));

for(int i : ta)
{
std::cout << i << std::endl;
}

return 0;
}


I think, this rather naive exercise turned out to be quite interesting. Hopefully, you enjoyed as much as I did. Live code.

Comments

Smernaz said…
I sure would like some kind of iterator concept where data type is not part of it. I've made some crazy 'iterator' to traverse a big data model. Of curse this is different, the traversal need to no the data to find its way. But maybe, if there was some concept the work building it would be simpler.
Rather than using TupleAt for every dereference, I would store a pointer in the iterator, and use TupleAt to retrieve the pointer when incrementing or decrementing the iterator. That way iterator dereference is cheap --- algorithms typically dereference iterators more often than they increment them, and this only costs more in total if they increment the iterator without dereferencing it.
Sumant said…
@Anthony: I think that's a good idea. If I understand you right, the tuple iterator caches a pointer to the last dereferenced element in the tuple. If the iterator is dereferenced at the same position again, we're in luck. We simply return the object pointed by the pointer. In other cases, the internal index is updated as needed without going through O(N) comparisons. I've updated the code to reflect that.
Unknown said…
Really very good explanations.
Unknown said…
In the tuple_iterator constructor, last_ is set to -1 if i == 0, so in that case I think ref_ should be set to nullptr instead of &TupleAt::get(*tuple, last_), else it will crash (tested). With this change, the crash is avoided.
Sumant said…
@Andy You are right. Thanks. I added a fix. Disclaimer: This code is minimally tested.

Popular Content

Unit Testing C++ Templates and Mock Injection Using Traits

Unit testing your template code comes up from time to time. (You test your templates, right?) Some templates are easy to test. No others. Sometimes it's not clear how to about injecting mock code into the template code that's under test. I've seen several reasons why code injection becomes challenging. Here I've outlined some examples below with roughly increasing code injection difficulty. Template accepts a type argument and an object of the same type by reference in constructor Template accepts a type argument. Makes a copy of the constructor argument or simply does not take one Template accepts a type argument and instantiates multiple interrelated templates without virtual functions Lets start with the easy ones. Template accepts a type argument and an object of the same type by reference in constructor This one appears straight-forward because the unit test simply instantiates the template under test with a mock type. Some assertion might be tested in

Covariance and Contravariance in C++ Standard Library

Covariance and Contravariance are concepts that come up often as you go deeper into generic programming. While designing a language that supports parametric polymorphism (e.g., templates in C++, generics in Java, C#), the language designer has a choice between Invariance, Covariance, and Contravariance when dealing with generic types. C++'s choice is "invariance". Let's look at an example. struct Vehicle {}; struct Car : Vehicle {}; std::vector<Vehicle *> vehicles; std::vector<Car *> cars; vehicles = cars; // Does not compile The above program does not compile because C++ templates are invariant. Of course, each time a C++ template is instantiated, the compiler creates a brand new type that uniquely represents that instantiation. Any other type to the same template creates another unique type that has nothing to do with the earlier one. Any two unrelated user-defined types in C++ can't be assigned to each-other by default. You have to provide a

Multi-dimensional arrays in C++11

What new can be said about multi-dimensional arrays in C++? As it turns out, quite a bit! With the advent of C++11, we get new standard library class std::array. We also get new language features, such as template aliases and variadic templates. So I'll talk about interesting ways in which they come together. It all started with a simple question of how to define a multi-dimensional std::array. It is a great example of deceptively simple things. Are the following the two arrays identical except that one is native and the other one is std::array? int native[3][4]; std::array<std::array<int, 3>, 4> arr; No! They are not. In fact, arr is more like an int[4][3]. Note the difference in the array subscripts. The native array is an array of 3 elements where every element is itself an array of 4 integers. 3 rows and 4 columns. If you want a std::array with the same layout, what you really need is: std::array<std::array<int, 4>, 3> arr; That's quite annoying for