Just a few days ago I came across an intriguing blog-post about type-safe
Currying
Currying is the technique of transforming a function that takes multiple arguments in such a way that it can be called as a chain of functions, each with a single argument. I've talked about currying from very basics in a previous post. I'll jump straight to an example this time.
Function
Currying printf--dependently
Currying
To codify a string literal into a type, we are going to use the C++ language feature proposed in N3599. This proposal includes an example of dependently-typed printf that accepts all arguments at once. We're going to twist it a little bit to accept one argument at a time.
The magic lies in the
If the beginning of the
Fun does not end here though. We still need to curry printf. All we have at this stage is a sequence of types and that's big leap forward.
Let's assume you have a function
The target really feels within reach now. The
Here's the main driver program. Also on github.
Avoiding Copying Arguments
The previous example makes a massive assumption that all arguments are fundamental types. That they are cheap to copy. The lambda inside the apply function captures the arguments by value and passes them on by value. The arguments are copied O(N*N) times approximately. That's gonna hurt for large types that are expensive to copy.
The Remedy is to
Currying Arbitrary Functions
To work around this problem I'm simply going to use a dynamically allocated tuple that will store the arguments as they come in. As curried function may be copied multiple times, this scheme should work out quite efficiently in such cases.
Conclusion
While currying C++ functions is fun, lifting C++ string literals to type-level opens up a whole new level of meta-programming in C++. constexpr functions can operate on string literals and compute integral results at compile-time. See this for an example. With constexpr function, however, we can't construct new types at compile-time depending upon the argument value. N3599 allows us to cross the string-to-type barrier at compile-time. That's pretty neat. I can already think of some intriguing applications of N3599 in serialization/deserialization of user-defined types.
printf
using dependent typing. The blog-post has since become inaccessible and therefore, I've copied an excerpt here. I want to thank Zesen Qian for publishing this blog-post.I thought it might be possible to achieve the same effect in C++. ..... printf
originated from the C programming language and has been a headache since then because a proper call ofprintf
requires the number and types of arguments to match the format string; otherwise, it may visit a wild memory or a wrong register. In recent versions of GCC this issue is addressed by type checks hard coded into the compiler itself, which is ugly because the compiler should not be caring about the safety of applications of a specific function....
The key issue here is that, considering the curried version, the type of the returned function of applying the format string toprintf
, actually depends on the value of that format string. That is to say,printf "%s" : String -> String
, andprintf "%d%s%d" : Int -> String -> Int -> String
, and so on. This is where dependent type comes in: in dependently typed language, the type of returned values of functions can be dependent on value of the arguments; .... ---- Zesen Qian (ICFP'2106)
Currying
Currying is the technique of transforming a function that takes multiple arguments in such a way that it can be called as a chain of functions, each with a single argument. I've talked about currying from very basics in a previous post. I'll jump straight to an example this time.
int multiply(int i, int j) { return i*j; } auto curry = [](auto binary) { return [=](int i) { return [=](int j) { return binary(i, j); }; }; }; auto mul = curry(multiply); auto a = mul(10); auto b = a(20); std:: cout << b << std::endl; // prints 200
Function
multiple
takes both arguments at the same time. Function mul
, which is a curried version, takes one argument at a time. Intermediate results, such as a
, are themselves functions that take one of the remaining arguments. When all arguments are available, the original function evaluates producing a result.
Currying printf--dependently
Currying
printf
poses an extra challenge because (1) printf accepts a variable number of arguments and (2) the order of the types of the arguments is not fixed (past the first argument). More accurately, the order of the types of the arguments is determined by the format string. The format string of printf is a value---usually, a literal string. We want to make the types of the rest of the arguments dependent on the value of the first argument. That's pretty intriguing, imo. In effect, we need a way to codify the format string literal into a type and that's where the dependent-typing comes into play.
To codify a string literal into a type, we are going to use the C++ language feature proposed in N3599. This proposal includes an example of dependently-typed printf that accepts all arguments at once. We're going to twist it a little bit to accept one argument at a time.
The magic lies in the
operator ""
that converts a string literal into a type. Here's the code without further ado. Both clang and gcc support this extension. Perhaps it will be in C++17 standard soon or it already is.
#include <utility> template <char... chars> using CharSeq = std::integer_sequence<char, chars...>; template <typename T, T... chars> constexpr CharSeq<chars...> operator""_lift() { return { }; }The
CharSeq
type is a synonym for std::integer_sequence<char, ...>
. _lift
is a function that uses C++11 user-defined literals syntax convert a string literal to an equivalent CharSeq
at compile-time. For example, "cpptruths"_lift
returns std::integer_sequence<char,'c','p','p','t','r','u','t','h','s'>
. Check this code out.
#include <boost/core/demangle.hpp> auto cpptruths = "cpptruths"_lift; std::cout << boost::core::demangle(typeid(decltype(cpptruths)).name()) << "\n";Once a string is encoded as a type, a lot of things begin to fall into place using some additional template meta-programming. First, we need to codify the type-level
CharSeq
into a tuple of types that directly specify the types expected by printf. For instance, "%d"
expects an int
and "%s"
expects and const char *
etc.
We implement a meta-function called StringToTuple
.
template <class Head, class Tuple> struct Append; template <class Head, class... Args> struct Append<Head, std::tuple<Args...>> { using type = std::tuple<Head, Args...>; }; template <class CharSeq> struct StringToTuple; template <> struct StringToTuple<CharSeq<>> { using type = std::tuple<>; }; template <char Any, char... chars> struct StringToTuple<CharSeq<Any, chars...>> { using type = typename StringToTuple<CharSeq<chars...>>::type; }; template <char... chars> struct StringToTuple<CharSeq<'%', 's', chars...>> { using tail = typename StringToTuple<CharSeq<chars...>>::type; using type = typename Append<const char *, tail>::type; }; template <char... chars> struct StringToTuple<CharSeq<'%', 'd', chars...>> { using tail = typename StringToTuple<CharSeq<chars...>>::type; using type = typename Append<int, tail>::type; }; template <char... chars> struct StringToTuple<CharSeq<'%', 'f', chars...>> { using tail = typename StringToTuple<CharSeq<chars...>>::type; using type = typename Append<double, tail>::type; }; template <char... chars> struct StringToTuple<CharSeq<'%', 'u', chars...>> { using tail = typename StringToTuple<CharSeq<chars...>>::type; using type = typename Append<unsigned int, tail>::type; }; auto format = "%s%d"_lift; StringToTuple<decltype(format)>::type FormatTuple; // std::tuple<const char *, int>
StringToTuple
meta-function uses a pattern-matching. Consider the %s specialization. When the beginning of the CharSeq
is '%' followed by 's', the specialization matches recursively computes the type of the tail, which is tuple<int> in this case. The Append
meta-function simply concatenates the types in a tuple at the head.
If the beginning of the
CharSeq
is not a '%', the first most generic version with char Any
matches, which simply ignores the leading character.
Fun does not end here though. We still need to curry printf. All we have at this stage is a sequence of types and that's big leap forward.
Let's assume you have a function
curried_printf_impl
that accepts a format string and a CharSeq
as follows.
template <class CharSeq> auto curried_printf_impl(const char * fmt, CharSeq) { using FormatType = typename StringToTuple<CharSeq>::type; std::cout << boost::core::demangle(typeid(FormatType).name()) << "\n"; return curry<FormatType>::apply(fmt); } #define curried_printf(X) curried_printf_impl(X, X##_lift)We've not talked about the
curry
template yet. Of course, it's going to use the FormatType
tuple and turn it into a sequence of curried functions. The curried_printf
macro helps us cleanly separate the string literal from the compile-time character sequence into two separate arguments. ## is token-pasting operator in the C preprocessor.
The target really feels within reach now. The
curry
template is relatively straight forward.
template <class Tuple> struct curry; template <class Head, class... Tail> struct curry<std::tuple<Head, Tail...>> { template<class... Args> static auto apply(Args&&... args) { return [args...](Head h) { return curry<std::tuple<Tail...>>::apply(args..., h); }; } }; template <class Head> struct curry<std::tuple<Head>> { template <class... Args> static auto apply(Args&&... args) { return [args...](Head h) { return printf(args..., h); }; } }; template <> struct curry<std::tuple<>> { static auto apply(const char * fmt) { return printf(fmt); } };The general case of the
curry
template has an apply
function that accepts arbitrary number of arguments and returns a closure that captures all those arguments (from apply) and takes exactly one more argument of Head
type. As soon as it has the Head argument, it forwards it with all previous arguments to the subsequent curry<Tail...>::apply
to accept and retain remaining arguments one by one. The single argument curry
(the one with just Head), terminates the recursion and returns a lambda that upon receiving the last argument calls printf. Note that the format string literal is always at the beginning of args...
as curried_printf_impl
passes it along. If format string is the only argument, curry::apply calls printf right-away in the last no-argument specialization.
Here's the main driver program. Also on github.
int main(void) { curried_printf("C++ Rocks%s %d %f\n")("!!")(10)(20.30); curried_printf("C++ Rocks!!\n"); return 0; }If you mess up with the argument types, the error is short and relatively direct.
Avoiding Copying Arguments
The previous example makes a massive assumption that all arguments are fundamental types. That they are cheap to copy. The lambda inside the apply function captures the arguments by value and passes them on by value. The arguments are copied O(N*N) times approximately. That's gonna hurt for large types that are expensive to copy.
The Remedy is to
std::move
the arguments as much as possible. However, forwarding variadic arguments requires us to take some library help: std::tuple
.
template <class Head, class... Tail> struct curry<std::tuple<Head, Tail...>> { template<class... Args> static auto apply(Args&&... args) { return [t=std::make_tuple(std::move(args)...)](Head h) { // Move each element of t and h to curry<std::tuple<Tail...>>::apply somehow. }; } };It got complicated real fast. For each argument, we'll have to wrap them in a tuple and unwrap them before passing to
curry::apply
. Wrapping is easy. There's the code. Unwrapping is rather complicated because all arguments are not together in a tuple. Head
comes separately. std::apply
and std::invoke
did not appear particularly useful in this case. We perhaps need a direct syntax to expand tuple into function arguments. Secondly, there's at least one copy of each Head
argument anyway because the function should be type-safe and accept only Head
type argument in the lambda. I thought this is more trouble than it's worth.
Currying Arbitrary Functions
To work around this problem I'm simply going to use a dynamically allocated tuple that will store the arguments as they come in. As curried function may be copied multiple times, this scheme should work out quite efficiently in such cases.
// In C++17, std::experimental::apply can replace the following execute function. template <size_t... Indices, class Tuple, class Func> auto execute(std::integer_sequence<size_t, Indices...>, Tuple&& tuple, Func&& func) { return func(std::get<Indices>(std::forward<Tuple>(tuple))...); } template <int I, class AllArgs, class Tuple> struct dyn_curry; template <int I, class AllArgs, class Head, class... Tail> struct dyn_curry<I, AllArgs, std::tuple<Head, Tail...>> { enum { Index = std::tuple_size<AllArgs>::value - I }; template <class Func> static auto apply(std::shared_ptr<AllArgs> shptr, Func&& func) { return [shptr, func=std::move(func)](const Head &h) mutable { std::get<Index>(*shptr) = h; return dyn_curry<I-1, AllArgs, std::tuple<Tail...>>::apply(shptr, std::move(func)); }; } }; template <class AllArgs, class Head> struct dyn_curry<1, AllArgs, std::tuple<Head>> { enum { Index = std::tuple_size<AllArgs>::value - 1 }; using IntSeq = std::make_index_sequence<std::tuple_size<AllArgs>::value>; template <class Func> static auto apply(std::shared_ptr<AllArgs> shptr, Func&& func) { return [shptr, func=std::move(func)](const Head &h) mutable { std::get<Index>(*shptr) = h; return execute(IntSeq(), sd::move(*shptr), std::move(func)); }; } }; template <class Ret, class... Args> auto arb_curry(Ret (&func) (Args...)) { using AllArgs = std::tuple<std::decay_t<Args>...>; std::cout << boost::core::demangle(typeid(AllArgs).name()) << "\n"; std::shared_ptr<AllArgs> shptr(new AllArgs); return dyn_curry<std::tuple_size<AllArgs>::value, AllArgs, AllArgs>::apply(shptr, func); } template <class Ret> Ret arb_curry(Ret (&func) ()) { return func(); } int print_add(std::string &msg, int &j, int k) { std::cout << msg; return j+k; } int identity(int i) { return i; } int foo() { return printf("foo\n"); } int main(void) { arb_curry(foo); std::cout << arb_curry(identity)(99) << std::endl; auto a = arb_curry(print_add); auto b = a("Adding two integers: "); auto c = b(20); auto d = c(30); std::cout << d << std::endl; //prints 60. return 0; }There are three main differences in this more general implementation than the previous example.
- This implementation uses an explicit compile-time index to copy arguments in to the right slot in the tuple of arguments.
- There's more type related noise here because each call to apply passes the
shared_ptr
of the tuple type to the inner lambda. - The final dispatch to the function is implemented in the
execute
function that expands all the arguments in the tuple as function arguments. In C++17,std::experimental::apply
can replace the execute function.
Conclusion
While currying C++ functions is fun, lifting C++ string literals to type-level opens up a whole new level of meta-programming in C++. constexpr functions can operate on string literals and compute integral results at compile-time. See this for an example. With constexpr function, however, we can't construct new types at compile-time depending upon the argument value. N3599 allows us to cross the string-to-type barrier at compile-time. That's pretty neat. I can already think of some intriguing applications of N3599 in serialization/deserialization of user-defined types.
Comments