Tuesday, August 11, 2009

Modifying temporaries

Temporary objects are created and destroyed all the time in a C++ program. A simple example is a function that returns by value. A temporary is usually treated as if it were a const object because it rarely makes sense to change an object that is unnamed and has a very short lifetime. (Note: A temporary can be bound to a const reference, in which case the lifetime of the temporary is extended to match that of the reference.) However, as it turns out, in C++ you can change temporaries ... if they are of class type! You can call non-const member functions on a temporary. The effect is quite similar to binding a temporary to a non-const reference and changing it, which the language does not otherwise allow. Section 3.10, paragraph 10 of the C++ standard (ISO/IEC 14882:1998) explicitly mentions this exception. There are at least two practical uses of this exception. One is the "Named Parameter" idiom and the other is the Move Constructor idiom. In the named parameter idiom, the member functions might prefer to return the object by non-const reference instead of by value. Here is an example:

#include <iostream>

class X
{
public:
  int a;
  char b;
  X() : a(0), b(0) {}
  X setA(int i) { a = i; return *this; } // non-const function
  X setB(char c) { b = c; return *this; } // non-const function
};

std::ostream & operator << (std::ostream & o, X const & x)
{
  o << x.a << " " << x.b;
  return o;
}

X createX() // returns X by value.
{
  return X();
}

int main (void)
{
  // The following code uses the named parameter idiom.
  // Both setters are non-const member functions called on temporaries.
  std::cout << createX().setA(10).setB('Z') << std::endl;
}
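As mentioned above, the setters might prefer to return a non-const reference. Here is a minimal sketch of that variant. The names Y, createY, render and demo are hypothetical and used only for illustration. Every call in the chain then modifies the one temporary returned by createY() instead of producing a fresh copy.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical variant of the class above: setters return a non-const reference.
class Y
{
public:
  int a;
  char b;
  Y() : a(0), b(0) {}
  Y & setA(int i) { a = i; return *this; } // returns the same object
  Y & setB(char c) { b = c; return *this; } // returns the same object
};

Y createY() // returns Y by value, so the call yields a temporary.
{
  return Y();
}

// Hypothetical helper that formats a Y the same way operator << does above.
std::string render(Y const & y)
{
  std::ostringstream out;
  out << y.a << " " << y.b;
  return out.str();
}

// Usage sketch: call demo() from main().
void demo()
{
  // Both setters modify the single temporary returned by createY(),
  // which lives until the end of the full expression.
  std::string text = render(createY().setA(10).setB('Z'));
  assert(text == "10 Z");
  std::cout << text << std::endl;
}
```

The reference returned by the setters must not be stored beyond the full expression, because the temporary it refers to is destroyed at that point.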

Saturday, March 21, 2009

LEESA: A new way of typed XML programming in C++

Some of my recent research work has focused on developing a highly generic and reusable library for complex object structure traversal, which is best exemplified by schema-driven XML programming. I'm glad to present a research paper called LEESA: Embedding Strategic and XPath-like Object Structure Traversals in C++, which will be published in the proceedings of the IFIP Working Conference on Domain Specific Languages (DSL WC), 2009 at Oxford, UK. LEESA stands for Language for Embedded quEry and traverSAl. LEESA has advanced the state of the art of typed XML programming in standard C++ to a level where many benefits of static type analysis can be maintained while enjoying a succinct syntax similar to that of XPath. Below is a quick motivating example of LEESA that sorts and prints the names of the authors in an XML book catalog.

Catalog() >> Book() >> Author() >> Sort(Author(), LastNameComparator) >> ForEach(Author(), print);

The key thing to note here is that this is not a string-encoded query. In fact, the C++ compiler checks the compatibility of this expression against the book catalog XML schema at compile-time! LEESA uses the Expression Templates idiom to achieve this highly intuitive, XPath-like syntax. Overall, LEESA's implementation is an exciting combination of generic programming, operator overloading, expression templates, C++ metaprogramming, Boost MPL, C++0x Concepts, and a heck of a lot of template hackery to make all the things work together! Interesting details are presented in the paper mentioned above.

The source code of LEESA is available. However, LEESA's current implementation is based on the Universal Data Model (UDM 3.2.1) -- a full-fledged code generator for model-driven development that can be used as an XML schema compiler. Other code generators could be used provided they are extended to produce the necessary layers of abstraction described in the paper.

In upcoming posts, I plan to document some of my experiences developing LEESA.

Sunday, September 21, 2008

Int-to-type idiom and infinite regress

A new writeup on the Int-to-type idiom has been posted on the More C++ Idioms wikibook. The idiom is used to achieve static dispatching based on constant integral values. I'll finish the writeup of the idiom here with special attention to how the int-to-type idiom can lead to an infinite series of template instantiations and why. The idiomatic form of the Int-to-type idiom is given below.

template <int I>
struct Int2Type
{
  enum { value = I };
  typedef Int2Type<I> type;
  typedef Int2Type<I+1> next;
};

The type typedef defined inside the template is the type itself. I.e., Int2Type<10>::type is the same as Int2Type<10>. The next typedef gives the following type in order. However, the compiler is not required to instantiate the next type unless one of two things happens: (1) an instance of the type is created or (2) an associated type is accessed at compile-time. For example, Int2Type<10> will be instantiated if either of the following two things is written.

Int2Type<10> a; // a variable declaration.
typedef Int2Type<10>::type INT10; // accessing an associated type.

For kicks, let's see what happens if we add "::type" in the typedef of next.

template <int I>
struct Int2Type
{
  enum { value = I };
  typedef Int2Type<I> type;
  typedef typename Int2Type<I+1>::type next; // Note change here.
};

int main (void)
{
  Int2Type<10> i; // This instantiation will trigger an infinite chain.
}

For each instantiation of the Int2Type template, its "next" type is also instantiated because we are accessing the associated type defined inside the template. This leads to an infinite series of instantiations of the template. In practice, the C++ compiler stops with an error message after a predefined number of recursive instantiations. More formally, this problem is known as infinite regress, where the original problem reappears in the solution to the problem.
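One way to break such a chain is an explicit specialization that does not refer to its successor. The sketch below assumes an arbitrary upper bound of 13 and uses the hypothetical names Counted and demo. It is an illustration of the general technique, not part of the wikibook writeup.

```cpp
#include <cassert>

// Primary template: same shape as the self-referential version above.
template <int I>
struct Counted
{
  enum { value = I };
  typedef Counted<I> type;
  typedef typename Counted<I + 1>::type next;
};

// Hypothetical upper bound of 13: this explicit specialization does not
// refer to Counted<14>, so the chain of instantiations stops here.
template <>
struct Counted<13>
{
  enum { value = 13 };
  typedef Counted<13> type;
  typedef Counted<13> next;
};

// Usage sketch: call demo() from main().
void demo()
{
  Counted<10> c; // instantiates Counted<10>, <11> and <12>, then finds <13>
  (void) c;
  assert(Counted<10>::next::value == 11);
  assert(Counted<12>::next::value == 13);
  assert(Counted<13>::next::value == 13);
}
```

The specialization has to be declared before the first use that would otherwise instantiate the primary template for that value.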

Tuesday, September 16, 2008

copy elision and copy-and-swap idiom

An updated writeup of the copy-and-swap idiom is now available on the More C++ Idioms wikibook. It compares two different styles of assignment operator. The first version takes its parameter by const reference whereas the second version takes it by value. For some classes, pass-by-value turns out to be more efficient because a copy of the object is elided when the right-hand side is an rvalue.
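Here is a minimal sketch of the pass-by-value style, assuming a hypothetical resource-owning class named Buffer (it is not taken from the wikibook). The copy is made while the parameter is initialized, so the operator body only needs to swap.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical resource-owning class used only for illustration.
class Buffer
{
public:
  explicit Buffer(std::size_t n) : size_(n), data_(new char[n]()) {}
  Buffer(Buffer const & other) : size_(other.size_), data_(new char[other.size_])
  {
    std::memcpy(data_, other.data_, size_);
  }
  ~Buffer() { delete [] data_; }

  // Pass-by-value: the copy is made while initializing the parameter,
  // and the compiler may elide it when the argument is an rvalue.
  Buffer & operator=(Buffer other)
  {
    swap(other);  // take over the contents of the copy
    return *this; // the old contents are destroyed along with 'other'
  }

  void swap(Buffer & other)
  {
    std::swap(size_, other.size_);
    std::swap(data_, other.data_);
  }

  std::size_t size() const { return size_; }
  char & operator[](std::size_t i) { assert(i < size_); return data_[i]; }

private:
  std::size_t size_;
  char * data_;
};

// Usage sketch: call demo() from main().
void demo()
{
  Buffer a(3);
  a[0] = 'x';
  Buffer b(1);
  b = a;         // lvalue argument: the parameter is copy-constructed from a
  assert(b.size() == 3 && b[0] == 'x');
  b = Buffer(5); // rvalue argument: the copy can be elided
  assert(b.size() == 5 && b[0] == '\0');
}
```

A const-reference version would have to create the same copy inside the function body, where the compiler has no opportunity to elide it for rvalue arguments.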

Sunday, August 31, 2008

linked list using std::pair (infinite regression)

Defining a node of a linked list using std::pair sounds as simple as drinking a Starbucks white chocolate mocha. But it really isn't. Give it a try! The constraint is to use std::pair's first or second as a pointer to the structure itself, as in any linked-list node. As far as I know, it is impossible unless you resort to ugly casts from void pointers. The problem is actually quite well known and gives rise to something known as infinite regress, where the problem you want to solve reappears in the solution to the problem.

typedef std::pair<int, /* Pointer to this pair!! */ > Node;

The closest thing I could come up with is something like the one below.

#include <utility>

struct Node : std::pair <int, Node *>
{};

int main (void)
{
  Node n;
  n.second = &n; // A cyclic linked-list.
}
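Building on the derived-struct workaround, here is a small sketch of a NULL-terminated list that can be traversed. The constructor and the sum and demo helpers are hypothetical additions for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>

struct Node : std::pair<int, Node *>
{
  // Hypothetical convenience constructor forwarding to std::pair.
  Node(int value, Node * next) : std::pair<int, Node *>(value, next) {}
};

// Hypothetical helper: sums the values of a NULL-terminated list.
int sum(Node const * head)
{
  int total = 0;
  for (Node const * p = head; p != NULL; p = p->second)
    total += p->first;
  return total;
}

// Usage sketch: call demo() from main().
void demo()
{
  Node n3(3, NULL);
  Node n2(2, &n3);
  Node n1(1, &n2);
  assert(sum(&n1) == 6);
}
```

This works because Node only needs to be a declared (not complete) type for Node * to be used as a template argument in its own base class.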

Wednesday, July 09, 2008

return void

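I thought it would be interesting to discuss a subtle C/C++ interview question I learned recently. The question is deceptively simple: "Can you write a return statement in a function that returns void?" The answer is "Yes! You can return void!" The following program is valid C++. ISO C, strictly speaking, does not allow a return statement with an expression in a function returning void, but compilers such as gcc accept it with at most a warning.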

static void foo (void) { }
static void bar (void) {
  return foo(); // Note this return statement.
}
int main (void) {
  bar();
  return 0;
}

I tested it using gcc4 and VS9. Compiled as C with gcc's -ansi and -pedantic options, it produces only a warning, which points at the return statement in bar (reported as line 5 of my source file).

return_void.c:5: warning: return with a value, in function returning void

Although the usefulness of such a feature is not obvious in a C program, it is particularly handy when using templates. Consider,

template <class T>
T FOO (void) {
  return T(); // Default construction
}

template <class T>
T BAR (void) {
  return FOO<T>(); // Syntactic consistency. Same for int, void and everything else.
}

int main (void) {
  BAR<void>();
}

It suddenly makes sense when templates are in the picture. Take-home point: Syntactic consistency is of paramount importance for supporting generic programming and writing generic libraries.
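The same consistency lets a single generic wrapper forward the result of any nullary function, whether or not it returns a value. The sketch below uses hypothetical names (forward_call, bump, answer, demo) purely for illustration.

```cpp
#include <cassert>

static int calls = 0;

void bump() { ++calls; }     // returns void
int answer() { return 42; }  // returns a value

// Hypothetical generic wrapper: one definition covers void and non-void R,
// because "return fn();" is valid C++ even when fn() has type void.
template <class R>
R forward_call(R (*fn)())
{
  return fn();
}

// Usage sketch: call demo() from main().
void demo()
{
  forward_call(bump);                  // R is deduced as void
  assert(calls == 1);
  assert(forward_call(answer) == 42);  // R is deduced as int
}
```

Without this rule, the wrapper would need a separate specialization or overload just for void.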

Saturday, June 14, 2008

Non-Virtual Interface and the Fragile Base Class Interface Problem

A new writeup on the Non-Virtual Interface (NVI) idiom has been posted on the More C++ Idioms wikibook. In the writeup, I discuss the relationship of NVI with the Fragile Base Class (FBC) problem, in which seemingly innocuous changes in a base class can silently break perfectly good derived classes.
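For readers unfamiliar with NVI, here is a minimal sketch of its general shape. The classes Shape and Circle and the function demo are hypothetical and not taken from the wikibook writeup. The public interface is non-virtual, and derived classes customize behavior only through a private virtual function.

```cpp
#include <cassert>
#include <string>

class Shape
{
public:
  virtual ~Shape() {}

  // Non-virtual public interface: the base class keeps control over
  // what happens before and after the customizable step.
  std::string describe() const
  {
    return "[" + do_name() + "]";
  }

private:
  // Customization point: derived classes override it but cannot call it.
  virtual std::string do_name() const = 0;
};

class Circle : public Shape
{
private:
  virtual std::string do_name() const { return "circle"; }
};

// Usage sketch: call demo() from main().
void demo()
{
  Circle c;
  Shape const & s = c;
  assert(s.describe() == "[circle]");
}
```

Because clients only ever call describe(), the base class can later add checks or logging around do_name() without touching any derived class.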