Folding Functions

In the last post we looked at basic usage of C++17 Fold Expressions. I found that many posts on this topic discuss simple types and ignore how folds may be applicable to more complex types as well. [Edit: Please see the comments section for some examples elsewhere in the blogosphere.] In this post I'm going to describe folding over functions.

Composing Functions

Function composition is a powerful way of creating complex functions from simple ones. Functions that accept a single argument and return a value are easily composable. Consider the following example to compose two std::functions.
#include <cstdlib>
#include <functional>
#include <iostream>
#include <string>

template <class A, class B, class C>
std::function<C(A)> compose(std::function<B(A)> f, std::function<C(B)> g)
{
  // Feed the output of f into g.
  return [=](A a) -> C { return g(f(a)); };
}

int main()
{
    std::function<int(std::string)> to_num = [](std::string s) { return std::atoi(s.c_str()); };
    std::function<bool(int)> is_even = [](int i) { return i%2==0; };
    auto is_str_even_num = compose(to_num, is_even);
    std::cout << std::boolalpha << is_str_even_num("1234"); // prints true
}
The compose function accepts two std::function arguments and returns another one. The types of these std::function arguments are important: f is a function from A->B, whereas g is a function from B->C. Therefore, it makes sense that compose can generate another function of type A->C. The output of f goes to the input of g, and the implementation of the lambda confirms that.

std::function is kind of verbose and not very idiomatic in C++ when you want to pass functions around, so I'll try to use C++11 lambdas initially. I want to stay away from generic lambdas because argument and return types are kinda important. With generic lambdas, it's impossible to find the argument and return types without knowing an actual argument or its type. Note that in the compose function we have access to the functions only, not to any arguments.

Let's rewrite compose to accept lambdas.
template <class F, class G>
auto compose(F&& f, G&& g)
{
  using ArgType    = detail::arg_type_t<F>;
  using ResultType = detail::result_type_t<G>;  
  return [f,g](ArgType a) -> ResultType { return g(f(a)); };
}
F and G are generic arguments, which we expect to be non-generic lambdas. We extract the argument type of F and result type of G and return a composition of two lambdas satisfying the type signature.

This implementation is not very idiomatic. Extracting the argument and return types of functions in this style is falling out of favor: std::function::argument_type and std::function::result_type have been deprecated in C++17. A more idiomatic way would have been to return a generic lambda without bothering about the argument type. C++ clearly wants to favor duck-typing at compile-time. Until we have concepts in the language, of course.

I'll skip the implementation of the detail namespace. It's in the same vein as this stackoverflow answer.
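For readers who want something concrete, here is a minimal sketch of what such a detail namespace might look like. It is an assumption, not necessarily the code behind this post: it uses the common trick of pattern-matching the type of &F::operator(), the helper name call_traits is made up for illustration, and it only handles callables with a single, non-generic, one-argument call operator.

```cpp
#include <cstdlib>
#include <string>
#include <type_traits>

namespace detail {

// Hypothetical helper: primary template is left undefined on purpose.
template <class MemFn>
struct call_traits;

// Matches the const call operator of a non-mutable lambda.
template <class C, class R, class A>
struct call_traits<R (C::*)(A) const> {
  using arg_type = A;
  using result_type = R;
};

// Matches the non-const call operator of a mutable lambda or functor.
template <class C, class R, class A>
struct call_traits<R (C::*)(A)> {
  using arg_type = A;
  using result_type = R;
};

// decay_t strips references and cv-qualifiers, so F deduced as `Lambda&`
// from a forwarding reference still works.
template <class F>
using arg_type_t =
    typename call_traits<decltype(&std::decay_t<F>::operator())>::arg_type;

template <class F>
using result_type_t =
    typename call_traits<decltype(&std::decay_t<F>::operator())>::result_type;

} // namespace detail

// Compile-time check against a lambda like the one used in this post.
const auto to_num = [](std::string s) { return std::atoi(s.c_str()); };
static_assert(std::is_same<detail::arg_type_t<decltype(to_num)>, std::string>::value,
              "argument type is std::string");
static_assert(std::is_same<detail::result_type_t<decltype(to_num)>, int>::value,
              "result type is int");
```

Taking the address of operator() only works when there is exactly one non-template call operator, which is why this approach is limited to non-generic lambdas.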

Folding Functions

Folding functions is a generalization of function composition applied to fold expressions. First, we need to pick an operator to use fold expressions with. I like >> as it's quite intuitive. Here's the earlier function implemented as an overloaded operator.
template <class F, class G>
auto operator >>(F&& f, G&& g)
{
  using ArgType    = detail::arg_type_t<F>;
  using ResultType = detail::result_type_t<G>;  
  return [f,g](ArgType a) -> ResultType { return g(f(a)); };
}
We'll now write a new compose function that uses a fold expression over function types. Of course, it's going to use the overloaded >> operator.
template <class... Funcs>
auto compose(Funcs... funcs)
{
   return (... >> funcs);   
}
The earlier main program will work just fine with this new implementation of compose. It works with lambdas too.
auto to_num = [](std::string s) { return atoi(s.c_str()); };
auto is_even = [](int i) { return i%2==0; };
auto is_str_even_num = compose(to_num, is_even);
std::cout << std::boolalpha << is_str_even_num("1234") << "\n"; // prints true
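To see what the unary left fold does with more than two functions, here is a small self-contained sketch. For three functions f, g, h, the expression (... >> funcs) expands to (f >> g) >> h. One assumption to flag: to keep the sketch standalone, operator >> returns a generic lambda here instead of using the detail traits, and the inc and demo names are made up for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Stand-in for the overloaded operator: run f first, then g.
// (Assumption: a generic lambda replaces the detail:: traits.)
template <class F, class G>
auto operator>>(F&& f, G&& g)
{
  return [f, g](auto a) { return g(f(a)); };
}

template <class... Funcs>
auto compose(Funcs... funcs)
{
  return (... >> funcs);  // unary left fold over >>
}

// Hypothetical demo: parse a string, add one, test parity.
inline bool demo(const std::string& s)
{
  auto to_num  = [](std::string str) { return std::atoi(str.c_str()); };
  auto inc     = [](int i) { return i + 1; };
  auto is_even = [](int i) { return i % 2 == 0; };

  auto folded  = compose(to_num, inc, is_even);
  auto by_hand = (to_num >> inc) >> is_even;  // what the fold expands to

  assert(folded(s) == by_hand(s));
  return folded(s);
}
```

Calling demo("1233") parses 1233, increments it to 1234, and reports that the result is even.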
Interestingly, this compose function works fine with a single argument, as it simply returns the argument, as discussed in the previous post. It does not work with an empty parameter pack, however. What could we return when we get an empty parameter pack? In other words, what would be the identity for the function type? Well, it's just a function that returns its argument. Let's see it in action using a binary fold.
struct Identity
{
  // const-qualified so that it can be called through a by-copy lambda capture
  template <class T>
  T operator()(T&& t) const { return std::forward<T>(t); }
};

template <class... Funcs>
auto compose(Funcs... funcs)
{
   return (Identity() >> ... >> funcs);   
}
The only problem is that it does not compile. Nothing is wrong with binary folds; rather, the overloaded >> that relies on detail::arg_type_t and detail::result_type_t cannot digest Identity. Identity has a generic function call operator, so there's no way to get its argument and result types without knowing the type of the argument. The compose function does not have it.
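The failure can be made visible with a small detection trait. This is only a sketch under my own assumptions: the trait name has_single_call_operator is made up, and it relies on std::void_t from C++17. It asks whether &F::operator() is a valid expression, which is the case only when the call operator is a single non-template function.

```cpp
#include <type_traits>
#include <utility>

struct Identity
{
  template <class T>
  T operator()(T&& t) const { return std::forward<T>(t); }
};

// Hypothetical trait: true only if &F::operator() is well-formed, i.e. the
// call operator is neither a template nor overloaded.
template <class F, class = void>
struct has_single_call_operator : std::false_type {};

template <class F>
struct has_single_call_operator<F, std::void_t<decltype(&F::operator())>>
    : std::true_type {};

auto plain   = [](int i) { return i + 1; };  // non-generic lambda
auto generic = [](auto x) { return x; };     // generic lambda

static_assert(has_single_call_operator<decltype(plain)>::value,
              "a non-generic lambda can be introspected");
static_assert(!has_single_call_operator<decltype(generic)>::value,
              "a generic lambda cannot");
static_assert(!has_single_call_operator<Identity>::value,
              "neither can Identity");
```

Since the argument and result types cannot be named for Identity, any operator >> that needs them up front has nothing to work with.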

We're therefore forced to use a generic implementation of operator >>.
template <class F, class G>
auto operator >>(F&& f, G&& g)
{
  return [f,g](auto a) { return g(f(a)); };
}
With this final variation, functions can be folded over in a binary fold expression.
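Putting the pieces together, a self-contained sketch of the final variation could look like this. Two things here are my assumptions rather than part of the text above: Identity's call operator is const-qualified, because the lambda returned by operator >> holds its captures as const members, and the is_str_even helper is made up for the demonstration.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <utility>

struct Identity
{
  // const so it can be invoked from inside the composing lambda.
  template <class T>
  T operator()(T&& t) const { return std::forward<T>(t); }
};

// Generic composition: no argument or result types are needed up front.
template <class F, class G>
auto operator>>(F&& f, G&& g)
{
  return [f, g](auto a) { return g(f(a)); };
}

template <class... Funcs>
auto compose(Funcs... funcs)
{
  // Binary left fold: an empty pack yields Identity() itself.
  return (Identity() >> ... >> funcs);
}

// Hypothetical helper: composes two lambdas and applies them to s.
inline bool is_str_even(const std::string& s)
{
  auto to_num  = [](std::string str) { return std::atoi(str.c_str()); };
  auto is_even = [](int i) { return i % 2 == 0; };
  return compose(to_num, is_even)(s);
}
```

With this version compose() with no arguments returns Identity, so compose()(42) gives back 42, while is_str_even("1234") runs the string through Identity, to_num, and is_even in that order.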

I will conclude this blog post with a bit of monoid theory.

You might ask whether function composition forms another monoid. As it turns out, it does, and it makes sense intuitively. The composition of two functions gives rise to another function. Composition is also associative: it does not matter whether we call compose(f, compose(g,h)) or compose(compose(f,g),h). The end result is the same. Squint a little and you will realize that they are just right and left folds, respectively. Finally, there's also an identity function, which when combined with any other function makes no observable difference. Therefore, we can say that functions form a monoid under composition. (Strictly, as the comments below point out, this holds for functions whose domain and codomain are the same type, so that any two of them can be composed.)
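The monoid laws can be spot-checked in code. The sketch below is an illustration under stated assumptions: it restricts itself to functions from int to int so that any two of them compose, the names IntFn, then, identity, and laws_hold_at are made up, and since functions cannot be compared for equality directly, it only compares results at a given input.

```cpp
#include <cassert>
#include <functional>

using IntFn = std::function<int(int)>;

// The monoid's binary operation: run f first, then g.
inline IntFn then(IntFn f, IntFn g)
{
  return [f, g](int x) { return g(f(x)); };
}

// The monoid's identity element.
inline IntFn identity()
{
  return [](int x) { return x; };
}

// Checks associativity and both identity laws at a single input x.
inline bool laws_hold_at(int x)
{
  IntFn f = [](int i) { return i + 1; };
  IntFn g = [](int i) { return i * 2; };
  IntFn h = [](int i) { return i - 3; };

  bool assoc    = then(then(f, g), h)(x) == then(f, then(g, h))(x);
  bool left_id  = then(identity(), f)(x) == f(x);
  bool right_id = then(f, identity())(x) == f(x);
  return assoc && left_id && right_id;
}
```

A check at a few inputs is evidence rather than proof, but it shows the shape of the three laws: both groupings agree, and composing with identity on either side changes nothing.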

Next time we'll look at even more interesting functions: those that return values wrapped in a generic "container".

Comments

Sandeep. said…
Nice post.

I think the last statement about function composition being a monoid is a bit loaded though. It is a monoid only if the functions have the same domain and range.

> As std::function is kind of verbose and not very idiomatic in C++ when you want to pass functions around

As someone from C++ background, this is very true. Since functions are not first class citizens, a concept/pattern involving functions is hard to highlight in C++.
Anonymous said…
"I'll skip the implementation of the detail namespace. It's in the same vein as this stackoverflow answer. "

There is no link to the relevant stackoverflow post. Could you update this?
Sumant said…
@Sandeep Thanks. The statement appears "loaded" only if someone is unfamiliar with the concept of a monoid. It's not true that functions form a monoid only when they have the same domain and range. The example function to_num has a string domain and an int range. is_even has an int domain and a bool range. So they are clearly not the same. If you are referring to the fact that the range of the first function needs to line up with the domain of the next, then you are talking about the composition operation. The composition operation is itself illegal when that condition is not satisfied. Then the whole point is moot. Assuming the function composition requirement is satisfied, what else can we say about function composition? Well, such functions form a monoid.

@Anonymous Here's the link: http://stackoverflow.com/questions/2562320/specializing-a-template-on-a-lambda-in-c0x/2565037#2565037
Anonymous said…
I believe an attribution to this article would be in order https://ngathanasiou.wordpress.com/2016/11/23/compose-and-curry-as-folds/ . Composition as folding with the function call operator is not that common and what's presented here is pretty much described in the linked article.
Anonymous said…
If you look at the definition of monoid, you'll realize that it is defined as a set with a binary operation satisfying certain conditions. That is, for every a,b in the set, a `op` b (and also b `op` a) is defined. If your chosen operation (composition in this case) doesn't satisfy this, you cannot say you have a monoid.

If it isn't clear enough: even if is_even composes with to_num, you still need to_num to compose with is_even, as an absolute requirement.
Sumant said…
@Anonymous I don't think that's the case. I think you are referring to commutative monoids, where a b = b a. A general monoid does not have that restriction. Please see http://mathworld.wolfram.com/CommutativeMonoid.html
Anonymous said…
No. I'm referring exactly to the concept of monoid. I didn't say that a b should equal b a, I said that both a b and b a should be defined. In the case of functions with function composition, if they don't have the same domain and codomain, you can easily find the case where a b is defined but b a is not.

Moreover, if you're into functional programming, I might better explain this using the (equivalent) category theoretic definition of monoid: a monoid is defined as a category with only one object, in which the elements of the monoid in the classical definition correspond to the morphisms over the object and the operation corresponds to the composition of morphisms (note that in this case, the domain and codomain of all morphisms is trivially the same).

You can see https://en.wikipedia.org/wiki/Monoid#Definition for the classic definition,
and https://en.wikipedia.org/wiki/Monoid#Relation_to_category_theory for the category theoretic definition.

By the way, I have quite liked this series of posts, I'd love to see more people building functional programming abstractions in C++.
Sumant said…
Ok. I understand your comment much better now. I was wondering for a while why you said "defined" as opposed to equal or something like that in the original comment. But I assumed a bit about what you might have meant.

Yeah, it turns out composition of functions as a monoid is fairly restrictive, as the domain and codomain of the functions must be the same type. Even single-argument-single-return-value functions are quite restrictive in imperative general-purpose languages. Currying is not practical in most such languages.

So is the following correct from a category theoretic interpretation?
1. The "only object" in the monoid of function composition is the "type" of the functions, which has same domain and codomain types.
2. Each function in the family would be a morphism. Again all of them have the same type and their domain and codomain are the same.
3. Composition of any two yields another function, which is also a morphism.

Does this make sense?

Thanks much for providing feedback and helping me arrive at the right understanding.
Anonymous said…
Yes, that is a right category theoretic interpretation. I don't know if it is standard to interpret the object in the category as the "type" of the functions though (even if it is the domain and codomain of the morphisms), as in other kind of monoids the object could be basically anything as long as the morphisms are the objects of the monoid and composition is defined the same way as the operation. I'll explain a bit better:

Consider natural numbers (with zero) with addition. As long as you define the morphisms over the object to be the natural numbers, with zero being the identity, and the composition of morphisms is defined as addition, the object could be anything. So, in some examples, the object might mean nothing, only providing the structure to the monoid.

But yes, in this case, at least to me, that sounds like a quite good interpretation of the category theoretic definition. In fact, it makes it easy to understand the definitions of category functor (https://en.wikipedia.org/wiki/Functor) and functional programming functor.

You're welcome :)
