Saturday, November 05, 2016

Dependently-typed Curried printf in C++

Just a few days ago I came across an intriguing blog-post about type-safe printf using dependent typing. The blog-post has since become inaccessible and therefore, I've copied an excerpt here. I want to thank Zesen Qian for publishing this blog-post.

.... printf originated from the C programming language and has been a headache since then because a proper call of printf requires the number and types of arguments to match the format string; otherwise, it may visit a wild memory or a wrong register. In recent versions of GCC this issue is addressed by type checks hard coded into the compiler itself, which is ugly because the compiler should not be caring about the safety of applications of a specific function....

The key issue here is that, considering the curried version, the type of the returned function of applying the format string to printf, actually depends on the value of that format string. That is to say, printf "%s" : String -> String, and printf "%d%s%d" : Int -> String -> Int -> String, and so on. This is where dependent type comes in: in dependently typed language, the type of returned values of functions can be dependent on value of the arguments; .... ---- Zesen Qian (ICFP'2106)
I thought it might be possible to achieve the same effect in C++. .

Currying

Currying is the technique of transforming a function that takes multiple arguments in such a way that it can be called as a chain of functions, each with a single argument. I've talked about currying from very basics in a previous post. I'll jump straight to an example this time.

int multiply(int i, int j) { return i*j; }

auto curry = [](auto binary) {
  return [=](int i) {
    return [=](int j) { 
      return binary(i, j);
    };
  };
};

auto mul = curry(multiply);
auto a = mul(10);
auto b = a(20);
std:: cout << b << std::endl; // prints 200

Function multiple takes both arguments at the same time. Function mul, which is a curried version, takes one argument at a time. Intermediate results, such as a, are themselves functions that take one of the remaining arguments. When all arguments are available, the original function evaluates producing a result.

Currying printf--dependently

Currying printf poses an extra challenge because (1) printf accepts a variable number of arguments and (2) the order of the types of the arguments is not fixed (past the first argument). More accurately, the order of the types of the arguments is determined by the format string. The format string of printf is a value---usually, a literal string. We want to make the types of the rest of the arguments dependent on the value of the first argument. That's pretty intriguing, imo. In effect, we need a way to codify the format string literal into a type and that's where the dependent-typing comes into play.

To codify a string literal into a type, we are going to use the C++ language feature proposed in N3599. This proposal includes an example of dependently-typed printf that accepts all arguments at once. We're going to twist it a little bit to accept one argument at a time.

The magic lies in the operator "" that converts a string literal into a type. Here's the code without further ado. Both clang and gcc support this extension. Perhaps it will be in C++17 standard soon or it already is.

#include <utility>

template <char... chars>
using CharSeq = std::integer_sequence<char, chars...>;

template <typename T, T... chars>
constexpr CharSeq<chars...> operator""_lift() { return { }; }
The CharSeq type is a synonym for std::integer_sequence<char, ...>. _lift is a function that uses C++11 user-defined literals syntax convert a string literal to an equivalent CharSeq at compile-time. For example, "cpptruths"_lift returns std::integer_sequence<char,'c','p','p','t','r','u','t','h','s'>. Check this code out.
#include <boost/core/demangle.hpp>

auto cpptruths = "cpptruths"_lift;
std::cout << boost::core::demangle(typeid(decltype(cpptruths)).name()) << "\n";
Once a string is encoded as a type, a lot of things begin to fall into place using some additional template meta-programming. First, we need to codify the type-level CharSeq into a tuple of types that directly specify the types expected by printf. For instance, "%d" expects an int and "%s" expects and const char * etc. We implement a meta-function called StringToTuple.
template <class Head, class Tuple>
struct Append;

template <class Head, class... Args>
struct Append<Head, std::tuple<Args...>>
{
  using type = std::tuple<Head, Args...>;
};

template <class CharSeq>
struct StringToTuple;

template <>
struct StringToTuple<CharSeq<>>
{
    using type = std::tuple<>;
};

template <char Any, char... chars>
struct StringToTuple<CharSeq<Any, chars...>>
{
    using type = typename StringToTuple<CharSeq<chars...>>::type;
};

template <char... chars>
struct StringToTuple<CharSeq<'%', 's', chars...>>
{
    using tail = typename StringToTuple<CharSeq<chars...>>::type;
    using type = typename Append<const char *, tail>::type;
};

template <char... chars>
struct StringToTuple<CharSeq<'%', 'd', chars...>>
{
    using tail = typename StringToTuple<CharSeq<chars...>>::type;
    using type = typename Append<int, tail>::type;
};

template <char... chars>
struct StringToTuple<CharSeq<'%', 'f', chars...>>
{
    using tail = typename StringToTuple<CharSeq<chars...>>::type;
    using type = typename Append<double, tail>::type;
};

template <char... chars>
struct StringToTuple<CharSeq<'%', 'u', chars...>>
{
    using tail = typename StringToTuple<CharSeq<chars...>>::type;
    using type = typename Append<unsigned int, tail>::type;
};

auto format = "%s%d"_lift;
StringToTuple<decltype(format)>::type FormatTuple; // std::tuple<const char *, int> 

StringToTuple meta-function uses a pattern-matching. Consider the %s specialization. When the beginning of the CharSeq is '%' followed by 's', the specialization matches recursively computes the type of the tail, which is tuple<int> in this case. The Append meta-function simply concatenates the types in a tuple at the head.

If the beginning of the CharSeq is not a '%', the first most generic version with char Any matches, which simply ignores the leading character.

Fun does not end here though. We still need to curry printf. All we have at this stage is a sequence of types and that's big leap forward.

Let's assume you have a function curried_printf_impl that accepts a format string and a CharSeq as follows.
template <class CharSeq>
auto curried_printf_impl(const char * fmt, CharSeq)
{
  using FormatType = typename StringToTuple<CharSeq>::type;
  std::cout << boost::core::demangle(typeid(FormatType).name()) << "\n";
  return curry<FormatType>::apply(fmt);
}

#define curried_printf(X) curried_printf_impl(X, X##_lift)
We've not talked about the curry template yet. Of course, it's going to use the FormatType tuple and turn it into a sequence of curried functions. The curried_printf macro helps us cleanly separate the string literal from the compile-time character sequence into two separate arguments. ## is token-pasting operator in the C preprocessor.

The target really feels within reach now. The curry template is relatively straight forward.
template <class Tuple>
struct curry;

template <class Head, class... Tail>
struct curry<std::tuple<Head, Tail...>>
{
    template<class... Args>
    static auto apply(Args&&... args) 
    {   
      return [args...](Head h) {
          return curry<std::tuple<Tail...>>::apply(args..., h); 
      };  
    }   
};

template <class Head>
struct curry<std::tuple<Head>>
{
    template <class... Args>
    static auto apply(Args&&... args) {
        return [args...](Head h) { 
            return printf(args..., h); 
        };  
    }   
};

template <>
struct curry<std::tuple<>>
{
    static auto apply(const char * fmt) {
       return printf(fmt); 
    }   
};
The general case of the curry template has an apply function that accepts arbitrary number of arguments and returns a closure that captures all those arguments (from apply) and takes exactly one more argument of Head type. As soon as it has the Head argument, it forwards it with all previous arguments to the subsequent curry<Tail...>::apply to accept and retain remaining arguments one by one. The single argument curry (the one with just Head), terminates the recursion and returns a lambda that upon receiving the last argument calls printf. Note that the format string literal is always at the beginning of args... as curried_printf_impl passes it along. If format string is the only argument, curry::apply calls printf right-away in the last no-argument specialization.

Here's the main driver program. Also on github.
int main(void)
{
  curried_printf("C++ Rocks%s %d %f\n")("!!")(10)(20.30);
  curried_printf("C++ Rocks!!\n");

  return 0;
}
If you mess up with the argument types, the error is short and relatively direct.

Avoiding Copying Arguments

The previous example makes a massive assumption that all arguments are fundamental types. That they are cheap to copy. The lambda inside the apply function captures the arguments by value and passes them on by value. The arguments are copied O(N*N) times approximately. That's gonna hurt for large types that are expensive to copy.

The Remedy is to std::move the arguments as much as possible. However, forwarding variadic arguments requires us to take some library help: std::tuple.

template <class Head, class... Tail>
struct curry<std::tuple<Head, Tail...>>
{
    template<class... Args>
    static auto apply(Args&&... args) 
    {   
      return [t=std::make_tuple(std::move(args)...)](Head h) {
          // Move each element of t and h to curry<std::tuple<Tail...>>::apply somehow.
      };  
    }   
};
It got complicated real fast. For each argument, we'll have to wrap them in a tuple and unwrap them before passing to curry::apply. Wrapping is easy. There's the code. Unwrapping is rather complicated because all arguments are not together in a tuple. Head comes separately. std::apply and std::invoke did not appear particularly useful in this case. We perhaps need a direct syntax to expand tuple into function arguments. Secondly, there's at least one copy of each Head argument anyway because the function should be type-safe and accept only Head type argument in the lambda. I thought this is more trouble than it's worth.

Currying Arbitrary Functions

To work around this problem I'm simply going to use a dynamically allocated tuple that will store the arguments as they come in. As curried function may be copied multiple times, this scheme should work out quite efficiently in such cases.
// In C++17, std::experimental::apply can replace the following execute function.

template <size_t... Indices, class Tuple, class Func>
auto execute(std::integer_sequence<size_t, Indices...>,
             Tuple&& tuple,
             Func&& func)
{
  return func(std::get<Indices>(std::forward<Tuple>(tuple))...);
}

template <int I, class AllArgs, class Tuple>
struct dyn_curry;

template <int I, class AllArgs, class Head, class... Tail>
struct dyn_curry<I, AllArgs, std::tuple<Head, Tail...>>
{
    enum { Index = std::tuple_size<AllArgs>::value - I };

    template <class Func>
    static auto apply(std::shared_ptr<AllArgs> shptr, Func&& func)
    {   
      return [shptr, func=std::move(func)](const Head &h) mutable {
        std::get<Index>(*shptr) = h;
        return dyn_curry<I-1, AllArgs, std::tuple<Tail...>>::apply(shptr, std::move(func));
      };  
    }    
};

template <class AllArgs, class Head>
struct dyn_curry<1, AllArgs, std::tuple<Head>>
{
    enum { Index = std::tuple_size<AllArgs>::value - 1 };
    using IntSeq = std::make_index_sequence<std::tuple_size<AllArgs>::value>;

    template <class Func>
    static auto apply(std::shared_ptr<AllArgs> shptr, Func&& func)
    {   
      return [shptr, func=std::move(func)](const Head &h) mutable {
        std::get<Index>(*shptr) = h;
        return execute(IntSeq(), sd::move(*shptr), std::move(func));
      };  
    }
};

template <class Ret, class... Args>
auto arb_curry(Ret (&func) (Args...))
{
  using AllArgs = std::tuple<std::decay_t<Args>...>;
  std::cout << boost::core::demangle(typeid(AllArgs).name()) << "\n";
  std::shared_ptr<AllArgs> shptr(new AllArgs);

  return dyn_curry<std::tuple_size<AllArgs>::value, AllArgs, AllArgs>::apply(shptr, func);
}

template <class Ret>
Ret arb_curry(Ret (&func) ()) { return func(); }

int print_add(std::string &msg, int &j, int k) { std::cout << msg; return j+k;   }

int identity(int i) { return i; }

int foo() { return printf("foo\n"); }

int main(void)
{
  arb_curry(foo);
  std::cout << arb_curry(identity)(99) << std::endl;
  auto a = arb_curry(print_add);
  auto b = a("Adding two integers: ");
  auto c = b(20);
  auto d = c(30);
  std::cout << d << std::endl; //prints 60.

  return 0;
}
There are three main differences in this more general implementation than the previous example.
  1. This implementation uses an explicit compile-time index to copy arguments in to the right slot in the tuple of arguments. 
  2. There's more type related noise here because each call to apply passes the shared_ptr of the tuple type to the inner lambda. 
  3. The final dispatch to the function is implemented in the execute function that expands all the arguments in the tuple as function arguments. In C++17, std::experimental::apply can replace the execute function.
Here's live code and also on github.

Conclusion

While currying C++ functions is fun, lifting C++ string literals to type-level opens up a whole new level of meta-programming in C++. constexpr functions can operate on string literals and compute integral results at compile-time. See this for an example. With constexpr function, however, we can't construct new types at compile-time depending upon the argument value. N3599 allows us to cross the string-to-type barrier at compile-time. That's pretty neat. I can already think of some intriguing applications of N3599 in serialization/deserialization of user-defined types.