C++ Truths

Posts

Perfect Forwarding of Parameter Groups in C++11

C++11 offers a great new feature called Perfect Forwarding , which is enabled via the use of rvalue references. Perfect forwarding allows a function template to pass its arguments through to another function while retaining the original lvalue/rvalue nature of the function arguments. It avoids unnecessary copying and avoids the programmer having to write multiple overloads for different combinations of lvalue and rvalue references. A class can use perfect forwarding with variadic templates to "export" all possible constructors of a member object at the parent's level. class Blob { std::vector<std::string> _v; public: template<typename... Args> Blob(Args&&... args) : _v(std::forward<Args>(args)...) { } }; int main(void) { const char * shapes[3] = { "Circle", "Triangle", "Square" }; Blob b1(5, "C++ Truths"); Blob b2(shapes, shapes+3); } The Blob class above contains a std::vecto...

Rvalue references in constructor: when less is more

I've seen a recurring mistake made by well-versed C++03 programmers when they set out to use rvalue references for the first time. In fact, as it turns out, better you are at C++03, easier it is to fall in the trap of rvalue reference anti-pattern I'm gonna talk about. Consider the following C++03 class: class Book { public: Book(const std::string & title, const std::vector<std::string> & authors, const std::string & pub, size_t pub_day const std::string & pub_month, size_t pub_year) : _title(title), _authors(authors), _publisher(pub), _pub_day(pub_day), _pub_month(pub_month), _pub_year(pub_year) {} // .... // .... private: std::string _title; std::vector<std::string> _authors; std::string _publisher; size_t _pub_day; std::string _pub_month; size_t _pub_year; }; The Book class above is as dull as it can be. Now lets C++11'fy it!...

Array-like access and Iterators for Homogeneous Tuples

Question often comes up whether tuples can have traditional iterators? In general, the answer is: No! They cannot. That's because tuples typically contain different types and traditional iterators, which are modeled after pointers, can not dereference to objects of multiple types. However, homogeneous tuples can have iterators. So I thought it would be a fun exercise to write one. I wonder why one would use a homogeneous tuples instead of just plain arrays. But lets do it anyways because we can. Although this exercise sounds rather naive and unnecessary, I stumbled upon two very interesting topics along the way. A need for new iterator concepts to separate the notions of element access from iterator traversal. Yes! iterators for homogeneous tuples can't be modeled accurately using conventional iterator categories. Don't believe me? Please read on... How inherited constructors may be simulated on compilers that don't support them today. From this point onward a tuple i...

General-purpose Automatic Memoization for Recursive Functions in C++11

Memoization is a widely known optimization technique used primarily to speed up computer programs by having function calls avoid repeating the calculation of results for previously processed inputs. Repeated calculations are avoided by reusing previously computed results, which must be cached such that look-up is faster than recomputing. Consider a simple fibonacci program unsigned long fibonacci(unsigned n) { return (n < 2) ? n : fibonacci(n - 1) + fibonacci(n - 2); } This algorithm is a frustratingly slow way of computing the Nth fibonacci number (N starting at 0). It does a lot of redundant recomputations. But the beauty of this program is that it is really simple. To speed it up without changing its logic significantly, we could use memoization. Using some clever C++11 techniques, it is possible to memoize this function, which looks like below. unsigned long fibonacci(unsigned n) { return (n < 2) ? n : memoized_recursion(fibonacci)(n - 1) + memoized_recurs...

Multi-dimensional arrays in C++11

What new can be said about multi-dimensional arrays in C++? As it turns out, quite a bit! With the advent of C++11, we get new standard library class std::array. We also get new language features, such as template aliases and variadic templates. So I'll talk about interesting ways in which they come together. It all started with a simple question of how to define a multi-dimensional std::array. It is a great example of deceptively simple things. Are the following the two arrays identical except that one is native and the other one is std::array? int native[3][4]; std::array<std::array<int, 3>, 4> arr; No! They are not. In fact, arr is more like an int[4][3]. Note the difference in the array subscripts. The native array is an array of 3 elements where every element is itself an array of 4 integers. 3 rows and 4 columns. If you want a std::array with the same layout, what you really need is: std::array<std::array<int, 4>, 3> arr; That's quite annoying for...

Compile-time regex matcher using constexpr

With my growing constexpr fascination, I thought of using it for something that would be really hard using template meta-programs. How about implementing a compile-time regular expression matcher? Fortunately, a simple regular expression matcher has already been written by Rob Pike . I just rewrote it using constexpr: single return statement in functions, no modifications to the parameters, abundant ternery operators, and recursion. Here we go... constexpr int match_c(const char *regexp, const char *text); constexpr int matchhere_c(const char *regexp, const char *text); constexpr int matchstar_c(int c, const char *regexp, const char *text); constexpr int matchend_c(const char * regexp, const char * text); constexpr int matchend_c(const char * regexp, const char * text) { return matchhere_c(regexp, text) ? 1 : (*text == '\0') ? 0 : matchend_c(regexp, text+1); } constexpr int match_c(const char *regexp, const char *text) { return (regexp[0] == '^') ? matchher...

Want speed? Use constexpr meta-programming!

It's official: C++11 has two meta-programming languages embedded in it! One is based on templates and other one using constexpr . Templates have been extensively used for meta-programming in C++03. C++11 now gives you one more option of writing compile-time meta-programs using constexpr . The capabilities differ, however. The meta-programming language that uses templates was discovered accidently and since then countless techniques have been developed. It is a pure functional language which allows you to manipulate compile-time integral literals and types but not floating point literals. Most people find the syntax of template meta-programming quite abominable because meta-functions must be implemented as structures and nested typedefs. Compile-time performance is also a pain point for this language feature. The generalized constant expressions (constexpr for short) feature allows C++11 compiler to peek into the implementation of a function (even classes) and perform optimization...

BoostCon'11 video on LEESA: Language for Embedded Query and Traversal

BoostCon'11 , held in Aspen, Colorado, was a fantastic conference this year. Not only because I got a chance to present my work on LEESA but also because of the breadth and depth of the topics covered . LEESA, as you may recall, is an embedded language in C++ to simplify XML programming. LEESA's programming model sits on top of the APIs generated by modern XML data binding tools . LEESA gives you XPath-like syntax (wildcards, child-axis, descendant-axis, tuples) to simplify data extraction from an XML object model. I had a privilege to talk at length about LEESA in BoostCon'11. In the 1hr 41 minutes long video, I'm talking everything from why you need LEESA, how its implemented using cool C++ techniques, such as templates, meta-programming, compile-time and run-time performance, and what direction it may take in future. Here are the slides of the presentation.

Generic Transpose using boost tuples

Recently, while working on LEESA I faced an interesting problem of creating a transpose using generic programming. The problem is pretty straight forward. Suppose you are given a tuple of 2 vectors i.e. tuple<vector<int>, vector<long> >. Assuming both the vectors are of the same size, how would create a vector <tuple<int, long> > containing the data from the earlier two vectors? Of course, the size of the tuple may vary. Lets call the function transpose. A brute force solution would be to overload transpose function N times such that each one takes different size of tuple. Here's one that takes a three parameter tuple. template <class C1, class C2, class C3> // Assuming all Cs are STL containers std::vector<tuple<C1::value_type, C2::value_type, C3::value_type> > transpose (tuple<C1, C2, C3> const &) { ... } } The implementation is fairly straight forward for those who have seen tuples in the past. We basically need a loop wi...

Faster meta-programs using gcc 4.5 and C++0x

One of the practical issues with C++ meta-programming is its speed. C++ programs that use heavy meta-programming can be notoriously slow to compile on contemporary compilers. Things are changing, however. Check the following comparison of gcc 4.5 against gcc 4.4.3. The first graph is obtained from a program that creates a binary tree of template instantiations. The x-axis shows the number of instantiations when value of N goes from 8 to 17. I could not build up patience for gcc 4.4.3 beyond 16363 instantiations (N=13). On the other hand, gcc 4.5 does pretty good and its increase in compilation time is indeed linear as mentioned here . Here is the program that creates a binary tree of template instantiations. template <int Depth, int A, typename B> struct Binary { enum { value = 1 + Binary<depth-1, 0, Binary>::value + Binary<depth-1, 1, Binary>::value }; }; template<int a, typename B> struct Binary<0, A, B> { enum { value = 1 }; ...

Modifying temporaries

Temporary objects are created and destroyed all the time in a C++ program. A simple example would be a function that returns by value. A temporary is as good as a const object because it makes little sense (usually) to change a temporary object, which is unnamed and has a very short time span. (Note: A temporary can be bound to a const reference in which case the scope of the temporary is the same as that of the reference.) However, as it turns out, in C++ you can change temporaries ... if they are of class type! You can call non-const member functions on a temporary. This is quite similar to binding a temporary to a non-const reference and changing it. Section 3.10.10 in C++ ISO/IEC 14882:1998 standard clearly mentions this exception. There are at least two practical use of such an exception. One is the " Named Parameter " idiom and the other one is the Move Constructor idiom . In case of the named parameter idiom, the member functions might prefer to return the object by no...

LEESA: A new way of typed XML programming in C++

Some of my recent research work has focused on developing a highly generic and reusable library for complex object structure traversal, which is best exemplified by schema driven XML programming. I'm glad to present a research paper called LEESA: Embedding Strategic and XPath-like Object Structure Traversals in C++ , which will be published in the proceedings of IFIP Working Conference on Domain Specific Languages (DSL WC) , 2009 at Oxford, UK. LEESA stands for L anguage for E mbedded qu E ry and traver SA l. LEESA has advanced the state-of-the-art of the typed XML programming in standard C++ to a level where many benefits of static type analysis can be maintained while enjoying a succinct syntax similar to that of XPath. Below, a quick motivating example of LEESA that sorts and prints the names of the authors in a XML book catalog is shown. Catalog() >> Book() >> Author() >> Sort(Author(), LastNameComparator) >> ForEach(Author(), print); The key thing to ...

Int-to-type idiom and infinite regress

A new writeup on Int-to-type idiom has been posted on More C++ Idioms wikibook. It is used to achieve static dispatching based on constant integral values. I'll finish the writeup of the idiom here with special attention to how int-to-type idiom can lead to infinite series of type instantiations and why. The idiomatic form of the Int-to-type idiom is given below. template <int I> struct Int2Type { enum { value = I }; typedef Int2Type<I> type; typedef Int2Type<I+1> next; }; The type typedef defined inside the template is the type itself. I.e, Int2Type ::type is same as Int2Type . The next typedef gives the following type in order. However, compiler is not required to instantiate the next type unless and until two things happen: (1) an instance of the type is created or (2) an associated type is accessed at compile-time. For example, Int2Type will be instantiated if one the following two things are written. Int2Type<10> a; // a variable declaration...

copy elision and copy-and-swap idiom

An updated writeup of the copy-and-swap idiom is now available on the More C++ Idioms wikibook. A comparison of two different styles of assignment operator is shown. First version accepts the parameter as pass-by-const-reference whereas the second version accepts it as pass-by-value . For some classes pass-by-value turns out to be more efficient as a copy of the object is elided when the right hand side is a rvalue.

linked list using std::pair (infinite regression)

Defining a node of a linked-list using std::pair sounds as simple as drinking a Starbucks's white chocolate mocha. But it really isn't. Give it a try! The constraint is to use std::pair's first or second as a pointer to the structure itself, like in any linked-list's node. As far as I know, it is impossible unless you resort to ugly casts from void pointers. The problem is actually quite well known and gives rise to something known as infinite regress , where the problem you want to solve reappears in the solution to the problem. typedef std::pair<int, /* Pointer to this pair!! */ > Node; The closest thing I could come up with is something like the one below. struct Node : std::pair <int, Node *> {}; Node n; n.second = &n; // A cyclic linked-list.

return void

I thought it would be interesting to discuss a subtle C/C++ interview question I learned recently. Question is deceptively simple: "Can you write a return statement in a function that returns void?" The answer is "Yes! You can return void!" The following program seems to be a valid C/C++ program. static void foo (void) { } static void bar (void) { return foo(); // Note this return statement. } int main (void) { bar(); return 0; } I tested it using gcc4 and VS9. With -ansi and -pedantic compiler options for gcc, it throws just a warning pointing at line #5. return_void.c:5: warning: return with a value, in function returning void Although use of such a feature is not clear in a C program, it is particularly useful while using templates. Consider, template <class T> T FOO (void) { return T(); // Default construction } template <class T> T BAR (void) { return FOO<T>(); // Syntactic consistency. Same for int, void and everything else. } int m...

Non-Virtual Interface and the Fragile Base Class Interface Problem

A new writeup on Non-Virtual Interface (NVI) idiom has been written on More C++ Idioms wikibook. In the writeup, I discuss the relationship of NVI with the Fragile Base Class (FBC) problem, which can silently break perfectly good derived classes because of some innocuous changes in the base class.

RSS Feed for More C++ Idioms Wikibook

The open-content wikibook "More C++ Idioms" has grown quite a bit from it inception. Whenever I get time, I try to add more contents in the book. So I always wanted to track the changes happening in the book in a convenient manner. Today, I put together a RSS feed for the book, which publishes recent changes made to the book. Thanks to the excellent technical support provided by existing wikibookians (if that is a word). The feed is formatted like two column GUI-based diff utilities (kdiff3, Beyond-compare). Frankly, it is not as readable and enjoyable as regular blog entries but at least it will help interested readers to know what new content is being added to the book and how the book making progress towards completion.

Redirecting C++ Truths feed

Dear readers, Thanks for your continued interest in the C++ Truths blog and a steady stream of feedback comments. For quite some time now, I'm interested in learning accurate statistics about the readers of this blog. Recently, I came across, FeedBurner , which seems to be a popular choice for centralized feed processing and maintaining feed analytics such as subscriber count, live hits and many more things. Therefore, I have decided to redirect all the future contents on C++ Truths blog to the CppTruths feed. Existing atom and rss feeds shall be discontinued soon. I encourage you to subscribe to the new feed source and help me find the real subscriber count. In return, I promise to be more regular and frequent in posting more C++ truths, which you love to read!! - Sumant.

Function Template Overload Resolution and Specialization Anomaly

I recently realized that function template overloading and function template specialization can interact with each other in complex ways giving rise to quite surprising C++ programs. Consider, template<class T> // (a) a base template void f( T ) { std::cout << "f(T)\n"; } template<> void f<>(int*) { // (b) an explicit specialization std::cout << "f(int *) specilization\n"; } template<class T> // (c) another, overloads (a) void f( T* ) { std::cout << "f(T *)\n"; } int main (void) { int *p = 0; f(p); } Output of the above program is "f(T *)" (i.e. (c) is invoked in main). Now simply swap the order in which (b) and (c) are written. The output of the program changes! This time it invokes (b) giving output as "f(int *) specilization". The reason behind it is that when the integer full specilization (b) of f is defined in the order shown above, it is a full specialization of (a). But (c)...