Callbacks and Lambdas in CMake
Callbacks are functions that are passed to another function as input. This differs from a typical nested function call in that the “value” of a callback can change at runtime. Put another way, nested function calls can be thought of as hard-coded callbacks. Lambdas are functions that are written on-the-fly and that use the runtime state of the program in their definitions. Lambdas are almost always used as callbacks, hence their relevance to the present discussion. Supporting callbacks and lambdas greatly facilitates generic programming. CMake does not natively support callbacks, but it does natively support lambdas (users can declare functions and macros almost anywhere); however, without callback support these lambdas must be run in place and cannot be passed to functions, which largely defeats the purpose of a lambda. This chapter focuses on techniques for implementing callbacks in CMake.
Prerequisites
Before we describe some of the existing solutions, it is worth explicitly noting some points about the CMake language so that we understand the constraints these patterns must work within.
CMake Function Names Can NOT be Variables
As the subsection title says, CMake does not allow function names to be variables. Specifically, the following is NOT valid CMake:
function(my_fxn1)
endfunction()
function(my_fxn2)
endfunction()
set(fxn_name my_fxn1)
if(some_condition)
    set(fxn_name my_fxn2)
endif()
${fxn_name}()
Here the intent is that we default to calling my_fxn1, but if a certain condition is met, some_condition, we want to call my_fxn2. Everything is valid CMake until the last line; CMake does NOT allow us to get the function’s name from a variable. CMake also does not allow any part of the function’s name to come from a variable, i.e., this is NOT valid either:
function(is_valid_cxx_code code)
    # Check if ${code} is valid C++
endfunction()
function(is_valid_python_code code)
    # Check if ${code} is valid Python
endfunction()
set(lang cxx)
is_valid_${lang}_code("x = [y in range(3)]")
If by some chance either of these restrictions is removed in a later release of CMake, this chapter becomes nothing more than a historical curiosity.
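For readers on newer releases, it is worth noting that CMake 3.18 added the cmake_language command. Its CALL form takes the command name as an ordinary argument, so the name may come from a variable, even though the direct ${fxn_name}() syntax is still rejected. The following is a minimal sketch, assuming CMake 3.18 or newer; the message calls in the function bodies are added purely for illustration, and some_condition is simply left undefined.

```cmake
cmake_minimum_required(VERSION 3.18)

function(my_fxn1)
    message("called my_fxn1")
endfunction()

function(my_fxn2)
    message("called my_fxn2")
endfunction()

set(fxn_name my_fxn1)
if(some_condition)
    set(fxn_name my_fxn2)
endif()

# The command name is a regular argument here, so variable expansion is allowed.
cmake_language(CALL ${fxn_name})
```

The same command also has an EVAL CODE form, which evaluates a string of CMake code without a temporary file. The patterns in the rest of this chapter remain relevant for projects that must support CMake versions older than 3.18.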
CMake’s include Command
While there is a tendency to group all calls to CMake’s native include command at the top of a file, the reality is these calls can be used almost anywhere. Of particular relevance, include can be used inside a function like:
function(my_fxn)
    include(CMakeDependentOption)
    cmake_dependent_option(...)
endfunction()
my_fxn()
This code defines a function that includes the CMakeDependentOption module (the module facilitates declaring options that depend on other options) and then calls the cmake_dependent_option command introduced by that module. The snippet then calls the function we just defined.
The argument to include can be a variable. For example:
set(file_to_include a/path/to/a/CMake/module)
include(${file_to_include})
is perfectly acceptable CMake code.
CMake’s Function Command
Along the same lines as the include subsection above, it is worth noting that when defining a function the name can be a variable. For example:
set(fxn_name "my_fxn")
function(${fxn_name})
    ...
endfunction()
allows us to declare a function with a variable name. In fact, the entire signature can be a variable:
set(fxn_sig my_fxn arg1 arg2)
function(${fxn_sig})
    ...
endfunction()
declares a function my_fxn which takes two positional arguments named arg1 and arg2.
That said, CMake will not allow the body of the function command to come from a variable. In other words, the following is NOT valid CMake:
set(fxn_contents "message(hello world)")
function(my_fxn)
    ${fxn_contents}
endfunction()
Dispatch
The dispatch pattern can be used for callbacks when the list of possible callbacks is explicitly known and can be enumerated while writing the file. Say we have three possible functions we may want to call: fxn1, fxn2, and fxn3. The dispatch pattern looks like:
function(call_a_fxn fxn_name)
    if("${fxn_name}" STREQUAL "fxn1")
        fxn1()
    elseif("${fxn_name}" STREQUAL "fxn2")
        fxn2()
    elseif("${fxn_name}" STREQUAL "fxn3")
        fxn3()
    else()
        message(FATAL_ERROR "Function ${fxn_name} not found.")
    endif()
endfunction()
# Call fxn2 for example
call_a_fxn(fxn2)
The code should be self-explanatory. Arguments to functions in the dispatch pattern are best treated as kwargs unless all functions have the same exact signature.
Admittedly this pattern does not actually implement callbacks; rather, it simulates them. By itself this pattern is only feasible when the number of “callbacks” is small and known when call_a_fxn is being written. It’s important to note that this solution introduces a layer of indirection, but does not require writing a file to disk (you will likely still have to read files, via include, for the various functions you dispatch among).
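One way to handle arguments in the dispatcher is to forward everything after the function name through ARGN. The following is a minimal sketch; the bodies and signatures of fxn1 and fxn2 are hypothetical and chosen only to show two different argument counts.

```cmake
# Hypothetical callbacks with different signatures.
function(fxn1 greeting)
    message("fxn1: ${greeting}")
endfunction()

function(fxn2 a b)
    message("fxn2: ${a} and ${b}")
endfunction()

function(call_a_fxn fxn_name)
    # ARGN holds every argument that follows fxn_name; forward it as-is.
    if("${fxn_name}" STREQUAL "fxn1")
        fxn1(${ARGN})
    elseif("${fxn_name}" STREQUAL "fxn2")
        fxn2(${ARGN})
    else()
        message(FATAL_ERROR "Function ${fxn_name} not found.")
    endif()
endfunction()

call_a_fxn(fxn1 hello)
call_a_fxn(fxn2 foo bar)
```

Be aware that expanding ${ARGN} unquoted flattens any list-valued arguments and drops empty ones, which is one reason keyword-style arguments tend to be more robust when the signatures differ.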
Initializer Function Pattern
I first saw this pattern, under this name, in the CMake++ library. The essence of this pattern is that we generate an implementation file on-the-fly, and then include that file to run its contents. In CMake++ the pattern works by assuming we have the code we want to call in a string. CMake++ then defines a function eval like:
function(eval contents)
    set(temp_file path/where/temp/file/should/go)
    file(
        WRITE ${temp_file}
        "function(eval contents)
            file(WRITE ${temp_file} \"\${contents}\")
            include(${temp_file})
        endfunction()"
    )
    include(${temp_file})
    eval("${contents}")
endfunction()
eval can then be used like:
eval("message(\"hello world\")")
Note
This code is actually quite dense; this note provides a breakdown of how it works.
1. User calls eval with the contents they want to run.
   - Call to eval happens in scope A
   - Inside eval is scope A::B
2. A::eval writes a second version of eval to a temporary file.
3. A::eval includes the temporary file
   - Including file runs it, defining a new version of eval in scope A::B
4. A::eval runs A::B::eval
   - Scope inside A::B::eval is A::B::C
5. A::B::eval writes the contents to a temporary file
   - Use the same temporary file because we don’t need the original anymore
6. A::B::eval includes the new temporary file
   - The user’s contents run in scope A::B::C
   - Means set(... PARENT_SCOPE) only returns to scope A::B
You may be curious what happens if you do not nest the file writes in eval. If you do not nest the file writes then only the first call will work. For example, without nesting the file writes:
eval("message(\"hello world\")")
eval("message(\"42\")")
will print "hello world" twice. This is because A::B::eval’s definition actually replaces A::eval.
This pattern is expensive in terms of computational resources as it involves two file reads and two file writes. It also requires special attention if it is going to be used in parallel, as it is quite easy for multiple calls to overwrite each other’s temporary files. As written, the first call to eval cannot return variables (subsequent calls can) since the content is actually run two scopes down.
It is possible to simplify the above implementation:
function(eval contents)
    set(temp_file path/for/temporary/file)
    file(WRITE ${temp_file} "${contents}")
    include(${temp_file})
endfunction()
This implementation does the same thing as CMake++’s eval with fewer function calls and less I/O. This implementation also has the added benefit of only introducing one nested scope on all calls, thus we can return variables like:
eval("set(x \"hello world\" PARENT_SCOPE)")
message("x == ${x}") # Will print "x == hello world"
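Returning to the earlier concern about calls overwriting each other’s temporary files, one possible mitigation is to derive the file name from the contents. The following is only a sketch: the hashing scheme, the file name prefix, and the use of CMAKE_CURRENT_BINARY_DIR as the location are assumptions made for illustration, not something CMake++ does.

```cmake
function(eval contents)
    # Hypothetical naming scheme: hash the contents so that different
    # snippets are written to different files.
    string(MD5 contents_hash "${contents}")
    set(temp_file "${CMAKE_CURRENT_BINARY_DIR}/eval_${contents_hash}.cmake")
    file(WRITE "${temp_file}" "${contents}")
    include("${temp_file}")
endfunction()

eval("message(\"hello world\")")
eval("message(\"42\")")
```

Two calls with identical contents still target the same file, but they write identical text, so this scheme reduces rather than eliminates the need for care when running in parallel.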
A major disadvantage of the eval calls (both CMake++ and our optimized form) is that the code to run has to be provided as a string. As shown in the code examples, this means that special characters, like " and ;, will need to be escaped, which is error-prone and tedious.
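CMake’s bracket argument syntax (available since CMake 3.0) can reduce the escaping burden, because text between [[ and ]] is passed verbatim, with no escape sequences or variable references evaluated at the call site. The following is a sketch that reuses the simplified eval from above; the temporary file location and the variable name who are assumptions for illustration.

```cmake
function(eval contents)
    # Hypothetical location for the temporary file.
    set(temp_file "${CMAKE_CURRENT_BINARY_DIR}/eval_tmp.cmake")
    file(WRITE "${temp_file}" "${contents}")
    include("${temp_file}")
endfunction()

# No backslashes needed: quotes and ${...} inside [[ ]] are passed literally
# and are only evaluated when the temporary file is included.
eval([[
set(who "world")
message("hello ${who}")
]])
```

The code still has to be supplied as text, so this softens the disadvantage rather than removing it.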
Visitor Pattern
This pattern works by agreeing on the name of the callback and its signature. For example, say we are writing a function my_fxn which generates two variables and needs to compare them. Since it is reasonable that the user may want to use a custom comparison operation, we decide that our function should take a callback to do the comparison. By convention we agree that the callback is named compare_my_variables and has a signature like:
compare_my_variables(<result> <var1> <var2>)
we can then write our my_fxn function like:
function(my_fxn comparison_file)
    # Generate these variables somehow
    set(var1 foo)
    set(var2 bar)
    include(${comparison_file})
    compare_my_variables(result ${var1} ${var2})
endfunction()
and this is used like:
my_fxn(path/to/comparison_file)
If we think of the CMake module as an object and the function in the module as a method, this pattern looks a lot like the visitor pattern (one could argue it is actually duck typing, since the visitor pattern usually has a common base class, but oh well).
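For concreteness, the file passed to my_fxn might look like the following. This is a sketch under assumptions: the file name and the choice of a case-sensitive string comparison are illustrative, and the pattern itself only prescribes the function name and signature.

```cmake
# Contents of a hypothetical path/to/comparison_file.cmake
function(compare_my_variables result var1 var2)
    # Write the outcome into the variable whose name the caller supplied.
    if("${var1}" STREQUAL "${var2}")
        set(${result} TRUE PARENT_SCOPE)
    else()
        set(${result} FALSE PARENT_SCOPE)
    endif()
endfunction()
```

Because the callback uses PARENT_SCOPE, the variable result becomes available inside my_fxn after the call. Note that CMake functions are global, so including a different comparison file later replaces the earlier definition of compare_my_variables.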