Saturday, October 22, 2005

const overloaded arrow operator

I think it is a good idea to have const-overloaded arrow operators in the counted pointer idiom, though Coplien's book does not mention them. They are required to "carry forward" the const-ness from the handle object to the body pointer held inside the handle. This idiom is useful when you do not want to add a corresponding (mirror) function to the handle class every time you add a function to the body class. The handle class can even be a template. (CORBA _var classes?) The arrow operator takes care of the "automatic" forwarding.

class Stringrep; // body, defined below

class String // this is handle
{
public:
    ...
    Stringrep *operator -> () const { return b_; }
private:
    Stringrep *b_;
};

class Stringrep // this is body
{
public:
    void func (); // a non-const function.
};

int main() {
    const String s (new Stringrep);
    s->func (); // compiles: invokes a non-const function of Stringrep (body) even though the handle object is const.
}

To prevent this undetected mishap, declare const-overloaded arrow operators.

class String
{
public:
    ...
    const Stringrep *operator -> () const { return b_; } // chosen for const handles
    Stringrep *operator -> () { return b_; }             // chosen for non-const handles
private:
    Stringrep *b_;
};

Depending on the const-ness of the handle object, the compiler selects the matching const-overloaded arrow operator, so the const-ness of the handle is carried over to the body pointer. As a result, the compiler rejects a call to a non-const member function of the body class made through a const handle object. This is important because handle and body are logically the same entity for the client of the abstraction.
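
As a rough, self-contained sketch of the point above: the member functions text() and append(), the use of std::string inside the body, and the helper demo_arrow() are assumptions made for illustration, and reference counting is left out to keep it short. The static_asserts check which pointer type each overload hands back.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Body: one const and one non-const member function (both hypothetical).
class Stringrep {
public:
    explicit Stringrep(std::string s) : text_(std::move(s)) {}
    const std::string &text() const { return text_; }   // callable through a const handle
    void append(const std::string &s) { text_ += s; }   // needs a non-const handle
private:
    std::string text_;
};

// Handle: the const-overloaded arrow operators carry the handle's const-ness to the body.
// Copying is disabled because this sketch does no reference counting.
class String {
public:
    explicit String(const std::string &s) : b_(new Stringrep(s)) {}
    ~String() { delete b_; }
    String(const String &) = delete;
    String &operator=(const String &) = delete;

    const Stringrep *operator->() const { return b_; }
    Stringrep *operator->() { return b_; }
private:
    Stringrep *b_;
};

// A const handle yields a pointer to a const body; a non-const handle does not.
static_assert(std::is_same<decltype(std::declval<const String &>().operator->()),
                           const Stringrep *>::value, "const handle -> const body");
static_assert(std::is_same<decltype(std::declval<String &>().operator->()),
                           Stringrep *>::value, "non-const handle -> mutable body");

void demo_arrow() {
    String s("abc");
    s->append("def");            // fine: s is not const
    const String cs("xyz");
    // cs->append("!");          // would not compile: the body pointer is const here
    assert(s->text() == "abcdef");
    assert(cs->text() == "xyz");
}
```

Uncommenting the cs->append line should produce a compile error, which is exactly the protection the single const arrow operator fails to give.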

Subtle function overloading

Quite often I rediscover my own old posts and learn new things from them. This time I am revisiting the very first post: http://cpptruths.blogspot.com/2005_06_19_cpptruths_archive.html
I came up with a puzzle to "entertain" you guys! Predict the output of the following program. You know where to look for the explanation.

#include <iostream>
using namespace std;

const int FIRST_TIME = 1;
template <typename T>
void func (T &)
{
    static int var; // zero-initialized; one copy per instantiation of func
    ++var;
    if (FIRST_TIME == var)
        cout << "Printed once." << endl;
    else
        cout << "Printed more than once." << endl;
}
int main(void)
{
    int a1[4];
    int a2[5];
    func (a1);
    func (a2);
}

OUTPUT:
Printed once.
Printed once.
!!
I would rather have a static checker to guard me against such subtle things.
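
For readers who want the explanation in code form, here is a small sketch of what seems to be going on. Because the parameter is a reference, the arrays do not decay to pointers, so T is deduced as int[4] for one call and int[5] for the other, and each instantiation owns its own static counter. The names call_count, call_count_by_value and demo_counts are made up for this illustration.

```cpp
#include <cassert>
#include <type_traits>

// By reference: the array type (including its size) is deduced as T.
template <typename T>
int call_count(T &)
{
    static int var = 0;   // one counter per distinct T
    return ++var;
}

// By value: arrays decay to int*, so both calls share one instantiation.
template <typename T>
int call_count_by_value(T)
{
    static int var = 0;
    return ++var;
}

// Arrays of different sizes are different types.
static_assert(!std::is_same<int[4], int[5]>::value, "int[4] and int[5] differ");

void demo_counts()
{
    int a1[4] = {};
    int a2[5] = {};
    assert(call_count(a1) == 1);            // T = int[4]
    assert(call_count(a2) == 1);            // T = int[5]: a fresh counter
    assert(call_count(a1) == 2);            // same instantiation as the first call
    assert(call_count_by_value(a1) == 1);   // T = int*
    assert(call_count_by_value(a2) == 2);   // T = int* again: shared counter
}
```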

Tuesday, October 18, 2005

What's wrong with C++?

I wholeheartedly agree with this article!!

What's wrong with C++?
by Bartosz Milewski

src: http://www.relisoft.com/tools/CppCritic.html
--------------------------------------------------
Some time ago NWCPP (Northwest C++ Users Group in Seattle) organized a public panel on the future of C++, with Scott Meyers, Herb Sutter, and Andrei Alexandrescu. I started thinking about C++ and realized that I wasn't that sure any more if C++ was the answer to all my problems. I wanted to ask the panelists some tough questions. But I was there for a big surprise--before I had the opportunity to say anything, they started the criticism of C++ in the earnest--especially Scott.

One of the big issues was the extreme difficulty of parsing C++. Java and C#, both much younger languages, have a multitude of programming tools because it's so easy to parse them. C++ has virtually nothing! The best tool one can get is Microsoft Visual Studio, which is really pathetic in that department (I haven't tried Eclipse). Apparently, VS uses a different (incomplete) parser for its browser than it does for its compiler, and that's probably why it can't deal with namespaces or nested classes. When you search for a definition of a function, you get a long list of possible matches that don't take into account any of the qualifications of the original query. Finding all callers of a method is so unreliable that it's better not to use it. And these are the most basic requirements for an IDE. By the way, the Help engine seems to be using yet another parser.

I talked to John Lykos at one of the conferences, and he told me that he would pay big bucks for a tool that would be able to tell what header files must be included in a given file. That was many years ago and to my knowledge there still isn't such a tool. On the other hand, there are languages in which the programmer doesn't even have to specify include files, so clearly this is not an unsurmountable problem and it's only C++ that makes it virtually impossible.

Complex programming problems require complex languages. An expressive language helps you deal better with the complexity of a problem. I believe there is some proportionality between the complexity of the problem and the complexity of the language. But if a language is disproportionately complex, it starts adding complexity to the problem you're trying to solve instead of reducing it. There are endless examples of unnecessary complexity in C++.

Accidentally, the parsing difficulties of C++ might be the biggest roadblock in its evolution. In principle, changing the syntax of the language shouldn't be difficult, as long as you can provide good translation tools. You can look at syntax as a matter of display, rather than its inherent part. Just like you have pretty printers that format your code, you could have a pretty viewer that shows C++ declarations using Pascal-like syntax. You could then switch between programming using the rationalized syntax and the traditional syntax.

As long as C++ gurus live in the clouds of the Olympus, they won't see the need for this kind of evolution. That's why C++ becomes more and more elitist. In the future, people who do NYT crossword puzzles and the ones who program in C++ will be in the same category.

Very smart people keep writing books with titles that read like "Esoteric Nooks and Crannies of C++", or "More Bizarre Pitfalls of C++". Or puzzles that start with "What's wrong with this code fragment that to all normal people looks perfectly OK?". You don't see such publications in the Java or C# community. C++ is becoming a freak language that's parading its disfigurements in front of mildly disgusted but curiously fascinated audience.
"So you have to put a space between angle brackets? How bizarre!"
"Are you telling me that you can't pass an instance of a locally defined class to an STL algorithm? How curious!"
"Talk to me dirty! Tell me more about name resolution!"
"Pardon my French, Is this specialization or overloading?"
--------------------------
Also see: http://www.alledegodenavnevaroptaget.dk/interview.html

Saturday, October 15, 2005

Memory management idioms

This time let's briefly look at three structural idioms discussed in Coplien's book, Advanced C++ Programming Styles and Idioms.

Handle and Body: Handle and body are logically one entity but physically two. This separation allows the handle to be much smaller than the body. Both are classes. Because they are logically the same entity, there are several consequences. A handle can be passed wherever a body is required, which is efficient. To be logically the same, handle and body need to have exactly the same interface, so any addition to the body class interface requires a matching change in the handle class (a maintenance cost). The handle class delegates all function calls to the body (an extra function call). The handle manages memory and the body manages the abstraction. This type of structure is typically used for reference counting. Though hidden from the client, the body class pollutes the global namespace. The important thing to note is that although both classes are in the global namespace, an instance of the body class is accessible only from within the handle class: the body class is all private and the handle is a friend of the body.

Then why put the body class in the global namespace? Let's put it inside the handle class. Call this type of structure the Envelope/Letter class idiom.

e.g.
class String_Representation { char str[5000]; long count; }; // Body
class String { String_Representation *rep; }; // Handle

The problem of mirroring interfaces in the handle and body classes mentioned above can be solved using a cool C++ feature: operator ->. Define an overloaded dereference operator (arrow operator) in the handle class which returns a pointer to body.

String_Representation * String::operator -> ();

Note that most new string operations can be implemented as String_Representation member functions: the handle class String gets these operations automatically through the overloaded arrow operator. Add reference counting to it for more flavor. Call it the Counted Pointer idiom! Also note that the String_Representation interface can't be private.
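
A minimal sketch of how the pieces could fit together, not taken from Coplien's book: the body keeps a public interface plus a count that only the handle touches, and the handle forwards everything through the arrow operator. Storing the text in a std::string and the names length(), append(), use_count() and demo_counted_pointer() are assumptions for illustration. Copies share one body with no copy-on-write, so a change made through one handle is visible through the others.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Body: public operations, plus a reference count managed by the handle.
class String_Representation {
public:
    explicit String_Representation(const std::string &s) : count_(1), str_(s) {}
    std::size_t length() const { return str_.size(); }
    void append(const std::string &s) { str_ += s; }
    long use_count() const { return count_; }
private:
    friend class String;   // only the handle adjusts the count
    long count_;
    std::string str_;
};

// Handle: shares one body among copies and deletes it with the last handle.
class String {
public:
    explicit String(const std::string &s) : rep_(new String_Representation(s)) {}
    String(const String &other) : rep_(other.rep_) { ++rep_->count_; }
    String &operator=(const String &other) {
        ++other.rep_->count_;   // increment first so self-assignment is safe
        release();
        rep_ = other.rep_;
        return *this;
    }
    ~String() { release(); }

    String_Representation *operator->() { return rep_; }
    const String_Representation *operator->() const { return rep_; }
private:
    void release() { if (--rep_->count_ == 0) delete rep_; }
    String_Representation *rep_;
};

void demo_counted_pointer() {
    String a("abc");
    {
        String b = a;                  // b shares a's body
        assert(a->use_count() == 2);
        b->append("de");               // visible through a as well
    }                                  // b goes away; the body survives
    assert(a->use_count() == 1);
    assert(a->length() == 5);
}
```

Adding a new operation now means adding one member function to String_Representation; the handle needs no change.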

More info: http://users.rcn.com/jcoplien/Patterns/C++Idioms/EuroPLoP98.html

Saturday, October 08, 2005

const/volatile integrity violation

This time I am going to point you to two short, interesting articles on const integrity violation, which applies to the volatile modifier as well.

Basically, they discuss the following feature of C++:

GIVEN
int *i;

const int *p = i; // is allowed
BUT
const int** p = &i; // is not allowed !!
AND
const int*& p = i; // is also not allowed !!

How to fix it?

GIVEN
int *i;

const int *p = i; // is allowed
AND
const int* const * p = &i; // is allowed !!
AND
const int* const & p = i; // is also allowed !!
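
The rules above can be checked at compile time. The following is only a sketch: the static_asserts restate the pointer cases from the listings, the commented-out fragment shows the usual argument for why the int** to const int** conversion has to be rejected, and the helper names read_twice_indirect and demo_const_levels are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

// The pointer conversions from the listings, checked by the compiler.
static_assert( std::is_convertible<int *,  const int *>::value,
              "int* -> const int* is allowed");
static_assert(!std::is_convertible<int **, const int **>::value,
              "int** -> const int** is rejected");
static_assert( std::is_convertible<int **, const int *const *>::value,
              "int** -> const int* const* is allowed");

// Why the middle conversion must be rejected. If it compiled, a const
// object could be modified without any cast:
//
//   const int c = 42;
//   int *i = 0;
//   const int **p = &i;   // (hypothetically allowed)
//   *p = &c;              // legal: *p is a const int*, and now i points at c
//   *i = 7;               // writes to the const object c through a plain int*

// With the extra const, the inner pointer cannot be redirected through p.
int read_twice_indirect(int *i)
{
    const int *const *p = &i;
    return **p;
}

void demo_const_levels()
{
    int value = 5;
    assert(read_twice_indirect(&value) == 5);
}
```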


FAQ:
http://www.parashift.com/c++-faq-lite/const-correctness.html#faq-18.17
AND
http://www.gimpel.com/html/bugs/bug1561.htm

const-correctness

Constness can be considered an additional level of type information, and therefore we can overload methods in C++ based only on their const properties. The const-ness of a member function should reflect the abstract state of the object and not its physical bit state. The following class has 2 overloaded methods that differ only in const-ness. Remember: with subscript operators, if you need one you usually need the other.

class Fred { ... };

class MyFredList {
public:
const Fred& operator[] (unsigned index) const; // first
Fred& operator[] (unsigned index); // second
...
};

A const object invokes the first method, so the returned reference to the internal data structure cannot be used to modify it, as it is const. A non-const object invokes the second member function, through which you can indeed modify the returned Fred object. When returning references to an internal data structure, either return a const reference or return by value if you don't want it to be modified.
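
Here is one way the class could be filled in so that the behavior can be tried out. The Fred payload, the std::vector storage, the constructor and the helpers first_value and demo_subscript are assumptions added for the sketch; only the two operator[] signatures come from the listing above.

```cpp
#include <cassert>
#include <vector>

struct Fred { int value; };   // hypothetical payload

class MyFredList {
public:
    explicit MyFredList(unsigned n) : items_(n) {}   // n value-initialized Freds
    const Fred &operator[](unsigned index) const { return items_[index]; }   // first
    Fred &operator[](unsigned index) { return items_[index]; }               // second
private:
    std::vector<Fred> items_;
};

// Takes the list by const reference, so only the first overload is usable here.
int first_value(const MyFredList &list)
{
    // list[0].value = 1;   // would not compile: the first overload returns const Fred&
    return list[0].value;
}

void demo_subscript()
{
    MyFredList list(3);
    list[0].value = 42;                 // non-const object: second overload
    assert(first_value(list) == 42);    // const access: first overload
}
```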

Exhaustive information on const-correctness is here:
http://www.parashift.com/c++-faq-lite/const-correctness.html

Tuesday, October 04, 2005

Always define virtual non-pure methods

The ISO C++ Standard specifies that all virtual methods of a class that are not pure-virtual must be defined, but compilers are not bound (by the standard) to warn you if you don't follow this rule. Based on this assumption, GCC emits the implicitly defined constructors, the assignment operator, the destructor and the virtual table of a class only in the translation unit that defines its first such non-inline method.

Therefore, if you fail to define this particular method, the linker complains. In the case of gcc and ld (the linker on Linux), the error message reads "undefined reference to `vtable for class_name' ". This message is quite misleading because it names the class rather than the missing method. The solution is to ensure that all virtual methods that are not pure are defined. Note that a pure-virtual destructor is a special case among pure-virtual methods: it must be defined (empty body) in any case. Ch. 12, [class.dtor]/7.
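
A small sketch of a class that follows both rules: every non-pure virtual method has a definition, and the pure-virtual destructor has one too. The class names Shape and Triangle, the counter destroyed_count and the helper demo_virtual are invented for the example.

```cpp
#include <cassert>

int destroyed_count = 0;   // counts destructor bodies that have run

class Shape {
public:
    virtual ~Shape() = 0;         // pure virtual: Shape is abstract
    virtual int sides() const;    // non-pure virtual: must be defined somewhere
};

// A pure-virtual destructor still needs a definition, because every
// derived-class destructor calls it.
Shape::~Shape() { ++destroyed_count; }

// Leaving this definition out is the kind of omission that typically
// shows up as the "vtable for Shape" link error with GCC.
int Shape::sides() const { return 0; }

class Triangle : public Shape {
public:
    ~Triangle() override { ++destroyed_count; }
    int sides() const override { return 3; }
};

void demo_virtual()
{
    destroyed_count = 0;
    Shape *s = new Triangle;
    assert(s->sides() == 3);
    delete s;                        // runs ~Triangle, then ~Shape
    assert(destroyed_count == 2);
}
```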

Saturday, October 01, 2005

The big three and exception safety

Let's see how we can relate "the big three" of C++ (copy constructor, copy assignment operator and destructor) to the levels of exception safety.

1. Constructors (default, copy) should provide the basic exception guarantee (no memory leaks).
2. The copy assignment operator should provide the strong exception guarantee (commit-or-rollback).
3. The destructor should provide the no-throw guarantee (should not fail).
4. Container templates should provide all of the above + exception neutrality (pass along the exceptions thrown by parameter types).

See some earlier posts on this blog for more info on exception safety levels. Also see http://www.boost.org/more/generic_exception_safety.html
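
One common way to get the strong guarantee for copy assignment is the copy-and-swap technique, which the post does not name; the sketch below uses it as an illustration of points 1 to 3. The Buffer class and demo_big_three are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>

class Buffer {
public:
    // Basic guarantee: if new[] throws, nothing has been acquired yet, so nothing leaks.
    explicit Buffer(std::size_t n) : size_(n), data_(new int[n]()) {}
    Buffer(const Buffer &other) : size_(other.size_), data_(new int[other.size_]) {
        std::copy(other.data_, other.data_ + size_, data_);   // copying ints cannot throw
    }

    // No-throw guarantee.
    ~Buffer() { delete[] data_; }

    void swap(Buffer &other) noexcept {
        std::swap(size_, other.size_);
        std::swap(data_, other.data_);
    }

    // Strong guarantee: all work that may throw happens on a temporary.
    Buffer &operator=(const Buffer &other) {
        Buffer tmp(other);   // may throw; *this is still untouched
        swap(tmp);           // cannot throw: this is the commit
        return *this;        // tmp's destructor releases the old storage
    }

    std::size_t size() const { return size_; }
    int &operator[](std::size_t i) { return data_[i]; }
    const int &operator[](std::size_t i) const { return data_[i]; }
private:
    std::size_t size_;
    int *data_;
};

void demo_big_three()
{
    Buffer a(3);
    a[0] = 7;
    Buffer b(1);
    b = a;                               // commit-or-rollback assignment
    assert(b.size() == 3 && b[0] == 7);
}
```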

Why does std::stack::pop() return void?

I have at least 2 good explanations for this apparently counterintuitive way of defining the interface.

1. SGI explanation: http://www.sgi.com/tech/stl/stack.html
One might wonder why pop() returns void, instead of value_type. That is, why must one use top() and pop() to examine and remove the top element, instead of combining the two in a single member function? In fact, there is a good reason for this design. If pop() returned the top element, it would have to return by value rather than by reference: return by reference would create a dangling pointer. Return by value, however, is inefficient: it involves at least one redundant copy constructor call. Since it is impossible for pop() to return a value in such a way as to be both efficient and correct, it is more sensible for it to return no value at all and to require clients to use top() to inspect the value at the top of the stack.

2. std::stack<T> is a template. If pop() returned the top element, it would have to return by value rather than by reference, as per the above explanation. That means the caller must copy it into another object of type T, which involves a copy constructor or copy assignment operator call. What if this type T is sophisticated enough to throw an exception during copy construction or copy assignment? In that case, the rvalue, i.e. the stack top (returned by value), is simply lost, and there is no way to retrieve it from the stack because the stack's pop operation has already completed successfully!
----
src: EXCEPTION HANDLING: A FALSE SENSE OF SECURITY by Tom Cargill
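
In practice the two-step interface is used as in the sketch below. The helper pop_into is hypothetical (it is not part of the standard library) and assumes a non-empty stack; it copies the top element into a caller-supplied object first and removes it only after that copy has succeeded, so an exception during the copy leaves the stack unchanged.

```cpp
#include <cassert>
#include <stack>
#include <string>

// Hypothetical helper. Precondition: the stack is not empty.
template <typename T>
void pop_into(std::stack<T> &s, T &out)
{
    out = s.top();   // may throw; the element is still on the stack if it does
    s.pop();         // removes the element without copying it
}

void demo_stack()
{
    std::stack<std::string> s;
    s.push("first");
    s.push("second");

    std::string out;
    pop_into(s, out);
    assert(out == "second");
    assert(s.size() == 1);
    assert(s.top() == "first");
}
```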

Operator new

In C++, if you want to mimic malloc-style behavior in a
pure C++ way, then write

Box *b = (Box *) operator new (sizeof (Box)); // statement 1

By this I mean the constructor of Box will not be invoked,
just as you would expect with malloc. Note that this is NOT equivalent to

Box * b = new Box; // Statement 2

because doing that invokes the constructor.

Statement 1 calls what is known as "operator new"!!
AND
Statement 2 uses what is known as the "new operator"!!

You have to match statement 1 by
operator delete(b); // does not invoke destructor
and statement 2 by
delete b; // invokes destructor
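
The difference can be observed by counting constructor calls. This is a sketch only: the Box class with its live counter and the helper demo_operator_new are made up, and the middle part goes one step beyond the post by using placement new and an explicit destructor call to construct an object in the raw storage.

```cpp
#include <cassert>
#include <new>

struct Box {
    static int live;      // number of currently constructed Box objects
    Box() { ++live; }
    ~Box() { --live; }
};
int Box::live = 0;

void demo_operator_new()
{
    // Statement 1: raw storage only; no constructor runs.
    void *raw = operator new(sizeof(Box));
    assert(Box::live == 0);

    // Beyond the post: construct in that storage with placement new,
    // then destroy explicitly before releasing the storage.
    Box *b = new (raw) Box;
    assert(Box::live == 1);
    b->~Box();
    assert(Box::live == 0);
    operator delete(raw);     // does not invoke the destructor

    // Statement 2: allocation and construction in one step.
    Box *c = new Box;
    assert(Box::live == 1);
    delete c;                 // invokes the destructor, then frees the storage
    assert(Box::live == 0);
}
```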

Return value optimization

In C++, writing a function with a compound return statement like this
const Rational function (void)
{
....
return Rational (a,b); // statement 1
}

can be more efficient than

const Rational function (void)
{
....
Rational r(a,b);
return r; // statement 2
}

when used in the surrounding context such as

int main()
{
Rational c = function (); // initializing c.
}

because compilers can avoid the "invisible" creation and
destruction of temporaries when a function returns an object
by value. This is known as "return value optimization".
In the optimized assembly code, object c is directly
initialized by statement 1. You save up to 2 temporaries (and
their creation/destruction). One is the local object
r and the other is the temporary created and destroyed when the
function returns.
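
One way to see the effect is to count copy-constructor calls for both forms. The sketch below is an illustration under assumptions: the Rational class with a copies counter and the helper names are invented, and the actual counts depend on the compiler and language version. With C++17 the unnamed form is required to make no copy at all; for the named form, eliding the copy is still up to the compiler.

```cpp
#include <cassert>

struct Rational {
    static int copies;   // number of copy-constructor calls so far
    int num, den;
    Rational(int n, int d) : num(n), den(d) {}
    Rational(const Rational &o) : num(o.num), den(o.den) { ++copies; }
};
int Rational::copies = 0;

// Statement 1 form: returns an unnamed temporary.
Rational make_unnamed(int a, int b) { return Rational(a, b); }

// Statement 2 form: returns a named local object.
Rational make_named(int a, int b)
{
    Rational r(a, b);
    return r;
}

// Each helper reports how many copies one initialization needed.
int copies_for_unnamed()
{
    Rational::copies = 0;
    Rational c = make_unnamed(1, 2);
    assert(c.num == 1 && c.den == 2);
    return Rational::copies;
}

int copies_for_named()
{
    Rational::copies = 0;
    Rational c = make_named(3, 4);
    assert(c.num == 3 && c.den == 4);
    return Rational::copies;
}
```

Printing the two counts with and without optimization (or with a flag such as GCC's -fno-elide-constructors under an older language standard) is a quick way to check what a given compiler does.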