It's official: C++11 has two meta-programming languages embedded in it! One is based on templates and other one using constexpr. Templates have been extensively used for meta-programming in C++03. C++11 now gives you one more option of writing compile-time meta-programs using constexpr. The capabilities differ, however.
The meta-programming language that uses templates was discovered accidently and since then countless techniques have been developed. It is a pure functional language which allows you to manipulate compile-time integral literals and types but not floating point literals. Most people find the syntax of template meta-programming quite abominable because meta-functions must be implemented as structures and nested typedefs. Compile-time performance is also a pain point for this language feature.
The generalized constant expressions (constexpr for short) feature allows C++11 compiler to peek into the implementation of a function (even classes) and perform optimizations if the function uses constants (literals) only. Constants can be integral, floating point, as well as string literals. The signature of constexpr functions is just like regular C++ functions but the body has several restrictions, such as only one return statement is allowed. Nevertheless, the syntax of constexpr functions is significantly friendlier than that of template-based meta-functions. Contrary to the design of templates, designers of generalized constant expressions are well-aware of its meta-programming capabilities.
In my view, the most interesting aspect of constexpr is its speed. constexpr functions can perform compile-time computations at lightening speed. To compare the performance I implemented an is_prime algorithm in 3 different ways. Here is the algorithm in regular C++:
Here is the a template version of the same algorithm. Obviously, it is implemented as a collection of meta-functions in terms of structures.
The above meta-program is a recursive implementation because that's how pure functional languages work. The is_prime_impl meta-function does all the heavy lifting. To prevent infinite regression, lazy instantiation technique is used. All I can say at this point is you've to stare at the code to make sense out of it.
And that's precisely the point: C++03 meta-programs are often unreadable and hard to comprehend. Not anymore! Here is the constexpr version of the same algorithm.
Well, I agree, although this version uses friendlier syntax, is not quite easy to read. Just like the previous one, it is recursive. I would say just get used to it! You can't live without recursion in C++ meta-programming world. Secondly, every function has a single return statement and the codifying logic for detecting primality requires abundant use of the ternary operator.
As long as parameter number is an integral constant, this constexpr version will compute the result at compile-time (C++11 compilers only). And when the number is a run-time integer, this same function is perfectly capable of computing the result at run-time. So you don't need two different versions of the same program: one for compile-time and another for run-time. One implementation does it all!
Now that we have the implementations, lets talk performance. Take 4256233. This is 30,000th prime number! How long does the template version take to check if it is prime? It cannot! Yes, g++ 4.7 fails to compile is_prime<4256233>::value because the computation exceeds the default (900) limit of the recursive template instantiations. So I used -ftemplate-depth-2100 allowing 2100 recursive instantiations! Does it work? Yes! How long does it take? About 1 second! That's not bad at all, you say. How fast is constexpr? About 0.154 seconds!
Seems like templates are not too far behind. Wrong! How about fifth million prime number! It is 86028121. I could not get it to work with g++ 4.7. Somewhere in the range of 9300 recursive instantiations g++ seg-faults. That's brutal! isn't it? 9000+ recursive instantiations? There has to be a better way.
So how long does the constexpr take for our fifth million prime number? 0.220 seconds! That's right! This large prime number makes almost no impact on constexpr compilation time. What could be the reason? There are no template instantiations. C++ compilers always did compile-time computations whenever it was possible. constexpr now allows user-defined abstractions to hide those computations behind functions and yet allow compilers to see through them.
By the way, I had to increase to depth of default allowed constexpr recursion using -fconstexpr-depth=9300. The compiler bails out after this limit and resorts to run-time computation. Remember, the function is perfectly good for run-time computation as well (unlike templates version).
I hope this little exercise will convince you that constexpr is the way to go for static meta-programming in C++ as long as you are dealing with literals (integral, floating point numbers, and strings). It is not clear whether constexpr functions are suitable for manipulating complex meta-programmable types, such as mpl::vector and related meta-functions, such as mpl::contains, mpl::push_back, and so on. If you are interested in playing with the above example more, here is the code: is_prime.cpp and prime-test.sh
The meta-programming language that uses templates was discovered accidently and since then countless techniques have been developed. It is a pure functional language which allows you to manipulate compile-time integral literals and types but not floating point literals. Most people find the syntax of template meta-programming quite abominable because meta-functions must be implemented as structures and nested typedefs. Compile-time performance is also a pain point for this language feature.
The generalized constant expressions (constexpr for short) feature allows C++11 compiler to peek into the implementation of a function (even classes) and perform optimizations if the function uses constants (literals) only. Constants can be integral, floating point, as well as string literals. The signature of constexpr functions is just like regular C++ functions but the body has several restrictions, such as only one return statement is allowed. Nevertheless, the syntax of constexpr functions is significantly friendlier than that of template-based meta-functions. Contrary to the design of templates, designers of generalized constant expressions are well-aware of its meta-programming capabilities.
In my view, the most interesting aspect of constexpr is its speed. constexpr functions can perform compile-time computations at lightening speed. To compare the performance I implemented an is_prime algorithm in 3 different ways. Here is the algorithm in regular C++:
static bool IsPrime(size_t number)
{
if (number <= 1)
return false;
for (size_t i = 2; i*i <= number; ++i)
if (number % i == 0)
return false;
return true;
}
Here is the a template version of the same algorithm. Obviously, it is implemented as a collection of meta-functions in terms of structures.
struct false_type
{
typedef false_type type;
enum { value = 0 };
};
struct true_type
{
typedef true_type type;
enum { value = 1 };
};
template<bool condition, class T, class U>
struct if_
{
typedef U type;
};
template <class T, class U>
struct if_<true, T, U>
{
typedef T type;
};
template<size_t N, size_t c>
struct is_prime_impl
{
typedef typename if_<(c*c > N),
true_type,
typename if_<(N % c == 0),
false_type,
is_prime_impl<N, c+1> >::type >::type type;
enum { value = type::value };
};
template<size_t N>
struct is_prime
{
enum { value = is_prime_impl<N, 2>::type::value };
};
template <>
struct is_prime<0>
{
enum { value = 0 };
};
template <>
struct is_prime<1>
{
enum { value = 0 };
};
The above meta-program is a recursive implementation because that's how pure functional languages work. The is_prime_impl meta-function does all the heavy lifting. To prevent infinite regression, lazy instantiation technique is used. All I can say at this point is you've to stare at the code to make sense out of it.
And that's precisely the point: C++03 meta-programs are often unreadable and hard to comprehend. Not anymore! Here is the constexpr version of the same algorithm.
constexpr bool is_prime_recursive(size_t number, size_t c)
{
return (c*c > number) ? true :
(number % c == 0) ? false :
is_prime_recursive(number, c+1);
}
constexpr bool is_prime_func(size_t number)
{
return (number <= 1) ? false : is_prime_recursive(number, 2);
}
Well, I agree, although this version uses friendlier syntax, is not quite easy to read. Just like the previous one, it is recursive. I would say just get used to it! You can't live without recursion in C++ meta-programming world. Secondly, every function has a single return statement and the codifying logic for detecting primality requires abundant use of the ternary operator.
As long as parameter number is an integral constant, this constexpr version will compute the result at compile-time (C++11 compilers only). And when the number is a run-time integer, this same function is perfectly capable of computing the result at run-time. So you don't need two different versions of the same program: one for compile-time and another for run-time. One implementation does it all!
int main(void)
{
static_assert(is_prime_func(7), "..."); // Computed at compile-time
int i = 11;
int j = is_prime_func(i)); // Computed at run-time
}
Now that we have the implementations, lets talk performance. Take 4256233. This is 30,000th prime number! How long does the template version take to check if it is prime? It cannot! Yes, g++ 4.7 fails to compile is_prime<4256233>::value because the computation exceeds the default (900) limit of the recursive template instantiations. So I used -ftemplate-depth-2100 allowing 2100 recursive instantiations! Does it work? Yes! How long does it take? About 1 second! That's not bad at all, you say. How fast is constexpr? About 0.154 seconds!
Seems like templates are not too far behind. Wrong! How about fifth million prime number! It is 86028121. I could not get it to work with g++ 4.7. Somewhere in the range of 9300 recursive instantiations g++ seg-faults. That's brutal! isn't it? 9000+ recursive instantiations? There has to be a better way.
So how long does the constexpr take for our fifth million prime number? 0.220 seconds! That's right! This large prime number makes almost no impact on constexpr compilation time. What could be the reason? There are no template instantiations. C++ compilers always did compile-time computations whenever it was possible. constexpr now allows user-defined abstractions to hide those computations behind functions and yet allow compilers to see through them.
By the way, I had to increase to depth of default allowed constexpr recursion using -fconstexpr-depth=9300. The compiler bails out after this limit and resorts to run-time computation. Remember, the function is perfectly good for run-time computation as well (unlike templates version).
I hope this little exercise will convince you that constexpr is the way to go for static meta-programming in C++ as long as you are dealing with literals (integral, floating point numbers, and strings). It is not clear whether constexpr functions are suitable for manipulating complex meta-programmable types, such as mpl::vector and related meta-functions, such as mpl::contains, mpl::push_back, and so on. If you are interested in playing with the above example more, here is the code: is_prime.cpp and prime-test.sh
Comments
But in terms of running this at runtime, would it be a hell lot of runtime recursion? Does C++0x have anything to say about tail recursion?
This would much more naturally explain why you get the exact same runtime for both constexpr evaluations: they're both performing the same number of steps before they give up.
Try:
constexpr i = is_prime_func(982451653);
I expect you'll get a compile-time error because the initializer is not a constant expression.
constexpr bool i = is_prime_func(982451653);
Could somebody explain how it differs from C++0x's constexpr?
I am curious, however: may you post timings for comparing runtime execution of the static and constexpr versions?
If the performance of the constexpr is lower at runtime, then we may still need two versions of the same function..
-Michal Mocny
No, it wasn't. Please don't call your blog "c++ truths" if you're going to spread misinformation like this. There is nothing accidental of any form to what Erwin Unruh built.
Stop repeating fictions you learned on Wikipedia.
Jesus, you really don't understand any of this, do you?
The call cost of a compile-time behavior is zero. "At lightning speed?" Is this all metaphor to you?
Check out:
1) http://tinyurl.com/45xkdlq
(http://groups.google.com/group/comp.lang.c++.moderated/browse_thread/thread/d9bddd4105f1441e?hl=en#)
"The above meta-program is a recursive implementation because that's how pure functional languages work."
Nonsense. That's how languages like Haskell, OcaML and Erlang work, but that has absolutely nothing to do with being a functional language. For example, CSS is a declarative functional language, and has no calling at all, let alone recursion; Prolog, F# and Postscript do not encourage recursion any more than does C, which is not functional; recursion is very common in C; et cetera.
"The is_prime_impl meta-function does all the heavy lifting. To prevent infinite regression, lazy instantiation technique is used. All I can say at this point is you've to stare at the code to make sense out of it."
The most obvious reaction here is "Have you even looked at that code? It doesn't instantiate anything. It's pretty obvious that the reason that all you've [sic] to say about it is read it: that's because you obviously have no idea how it works."
"And that's precisely the point: C++03 meta-programs are often unreadable and hard to comprehend"
For you, maybe. The thing you just botched explaining is actually clear as day, to someone who speaks C++."
Then you say you also can't read the constexpr version, which is hilarious, because it's a one-liner loop on modulus, suggesting that you really shouldn't be writing a software blog.
"You can't live without recursion in C++ meta-programming world. "
I use C++ TMP every day, and I very rarely do so recursively. You don't know what you're talking about.
"How fast is constexpr? About 0.154 seconds!"
If you're measuring how long the compiler takes and comparing it to how long the app takes in runtime, then you're making a fantastically inappropriate comparison.
That the two pieces of code aren't actually doing the same thing, hilariously, appears to be beyond you.
"9000+ recursive instantiations? There has to be a better way."
Yes, there is, and the typical college freshman can find it.
"That's right! This large prime number makes almost no impact on constexpr compilation time. What could be the reason?"
Well for one, you seem to have confused compile time with how long it takes to compile the source inside, and you appear to have no idea how benchmarking works.
"C++ compilers always did compile-time computations whenever it was possible. "
Incorrect. C++ truths my ass. You're just guessing and presenting your wrong guesses as facts.
"I hope this little exercise will convince you that constexpr is the way to go for static meta-programming in C++ as long as you are dealing with literals"
Translation: "I don't understand the difference between the two, so I'm going to say a bunch of wrong stuff then tell you which one to use."
"It is not clear whether constexpr functions are suitable for manipulating complex meta-programmable types"
It is entirely clear.
What a dishonest embarrassment.
It's a blog post, people say something that they just found out, it can be OK to you or not. But I don't see any point of posting such an aggressive comment.
It's not shameful at all to get wrong, but it's shameful to be as rude as that.
The primary reason this is a problem is that someone might make the mistake of beleiving him.
I'm sorry you imagine that it's worse to point out a lie than to tell the lie. However, it is not, and that's an aversion which causes very serious unnecessary problems.
@Ralph: IMHO, flattening tail recursion is implementation specific. Compilers may or may not be able to do that and their capabilities may vary. The is_prime_recursive is tail recursion 101 so I guess g++ 4.7 flattened it and that's why the performance of static and the recursive is identical.
@mmocny: Run-time were the same (0.004s) for regular version ("static" in the code examples) and the constexpr for really really large values. Note that for really really large values the answers were not computed at compilet-time due to limit of constexpr depth.
Check Chapter #1, section 4 of the “C++ Template Meta-programming” book by David Abrahams and Aleksey Gurtovoy. They talk about accidental discovery of C++ meta-programming.
Spot on with the examples of functional programming languages!
I've seen my code because I wrote it.
I’m glad someone thinks a C++ metaprogram is clear as day! When was it you last saw your code? You know there is more to programming than holding down the spacebar. Oh right! that’s infinite recursion in your part of the world.
Finally, I’m talking compilation time only. No run-time. Please go back and reread.
Now, read it again! Seriously!
But this holds for any function that does not modify external state. So I don't see why it has to be labeled constexpr in this case.
The aggressive tone of your comment is totally unnecessary. If you notice, Richard Smith up above you was perfectly capable of correcting a factual error without an aggressive tone.
And, in fact, template metaprogramming was discovered, not designed. The fact you could do metaprogramming with templates was accidental. It is true that Erwin Unruh was the first to discover this fact. But he didn't design the feature into C++ in the first place. He noticed it was there after it had already been accidentally added.
http://pastebin.com/P3v3eTxY
$ time g++ -std=c++0x -pedantic -Wall -Wextra -Ofast cefib.cpp
real 0m0.602s
user 0m0.042s
sys 0m0.027s
$ time ./a.out
fib(46) == 2971215073
fib(46) == 2971215073
real 0m9.163s
user 0m0.994s
sys 0m0.142s
http://dev-perspective.blogspot.com/2012/02/recursing-constexpr.html
http://dev-perspective.blogspot.com/2012/03/recursing-constexpr-part-ii.html
john haugeland - IS RIGHT !
Sorry, it might be an off topic, but i notice that you are interested in compilation-time containers.
boost::mpl containers have one critical disadvantage - container must change its name, while adding any element, so, you need to know last name of the container (which makes it almost useless).
Based on googling and some research, i found solution, which doesn't require to change the name of the container. You can find it here:
http://ideone.com/DUCeX
You need gcc-4.7 to compile it.
I've used some macros, but it needed only for readability, you can write the same without any macros.
This code have quite a bad style and almost certainly can be optimized.
Sorry for my English.
Let me start by saying that I can understand that people are willing to share their discoveries via a blog but also we need to be aware of the perspective John Haugeland brings in(at least for me), about the inexactities generated by our lack of knowledge or limited knowledge of the subject.
This is where I have to agree to John Haugeland: as a user coming to this blog to increase my knowledge about the constexpr topic I might get misleaded by the information inside the post, especially when it does not clearly make the distinctions between personal opinion and facts.
The "acid" comments of John Haugeland are probably his style but the message it brings should not be blurred by that.
opinions about whether constexpr
metaprogramming is faster than
template metaprogramming, at least
according to Zach Laine in this post:
http://lists.boost.org/Archives/boost/2014/05/213389.php
Would you care to comment?
-regards,
Larry
Thu May 22, 10:57:00 AM PDT
The discussions in the mailing list correctly point out that the speedup is gained mostly because of avoiding recursive template instantiasions of integral parameters.
Thanks for the response Sumant; however,
since the constexpr is_prime_func calls
a recursive function, is_prime_recursive,
and, the number of recursive function
calls is the same as the number of
recursive is_prime_impl class template
instantiations, I still don't understand why there's a difference
in times.
Could you explain a little more?
TIA.
-regards,
Larry
I also want to point out that C++14 constexpr is more relaxed, and you no longer need recursion.
Crude Oil HNI Free Tips