Writing Your Own CMake Functions
The main reason CMakePP can exist is that CMake allows users to write their own functions. While the process of declaring a function is straightforward, there are some gotchas to keep in mind. This chapter focuses on avoiding those gotchas.
Function vs. Macro
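CMake provides two types of user-defined functions: “macros” and “functions”. The most important difference is that “functions” create a new variable scope, whereas “macros” do not (macros also treat their arguments as string replacements rather than as variables). The scoping difference is probably easiest to explain with an example: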
function(a_fxn)
    set(my_x "Hello World")
endfunction()

macro(a_macro)
    set(my_x "Hello World")
endmacro()

set(my_x "")
a_fxn()
message("my_x = '${my_x}'") # Will print "my_x = ''"
a_macro()
message("my_x = '${my_x}'") # Will print "my_x = 'Hello World'"
When designing functionality for CMakePP you should primarily use functions to avoid contaminating the caller’s namespace. That said, there are some circumstances where CMakePP developers will still want to use macros. Arguably, the most common circumstance is to break up long functions into smaller units. In this use case, the macros are intended for use solely within the longer function, so the lack of an additional scope does not affect the function’s caller: the macro’s manipulations stay contained within the function’s scope. Generally speaking, if you use a macro instead of a function, it is a good idea to explain why in the documentation.
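As a minimal sketch of the "macro as a private helper" use case, the snippet below splits one step of a function into a macro. The names `configure_greeting`, `_cg_pick_name`, and `cg_who` are hypothetical and chosen only for illustration; they are not part of CMakePP.

```cmake
# Hypothetical helper macro, meant to be called only from configure_greeting().
# Because it is a macro, it reads and writes variables in the scope of
# whichever function invokes it.
macro(_cg_pick_name)
    if("${cg_who}" STREQUAL "")
        set(cg_who "World")
    endif()
endmacro()

function(configure_greeting result who)
    set(cg_who "${who}")
    _cg_pick_name()  # Runs in this function's scope, so it can modify cg_who
    set("${result}" "Hello ${cg_who}" PARENT_SCOPE)
endfunction()

configure_greeting(greeting "")
message("greeting = '${greeting}'") # Will print "greeting = 'Hello World'"
message("cg_who = '${cg_who}'")     # Will print "cg_who = ''"
```

The macro modifies `cg_who`, but only inside the scope created by `configure_greeting`, so nothing leaks into the scope of the code that called the function.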
Returning Values
Unfortunately, CMake does not allow functions or macros to return values in the traditional sense; that is to say, the following is not valid CMake:
result = a_fxn(an_argument)
The only way to return values from a CMake function/macro is to set the value in the parent scope (for functions; for macros all values are already accessible from outside the macro). Historically this has been done by standardizing names. For example, many of the older FindXXX.cmake modules return their values by setting variables like XXX_INCLUDE_DIRS and the like. This is now considered bad practice and should be avoided. The modern way to return values is to ask the caller for an identifier and then use that identifier to return the result. For example:
function(say_hello identifier_to_use)
    set("${identifier_to_use}" "Hello World" PARENT_SCOPE)
endfunction()

say_hello(result)
message("Value returned : ${result}") # Will be "Hello World"
This pattern allows the caller to specify the variable to use and avoids the callee accidentally overwriting state in the caller’s scope (a somewhat common problem of older, poorly written FindXXX.cmake modules, which have a tendency to clobber variables like CMAKE_CXX_FLAGS).
For functions with multiple return points the above pattern becomes:
function(say_hello result who_to_say_hi_to)
    if("${who_to_say_hi_to}" STREQUAL "Bill")
        set("${result}" "I guess I will say hello to Bill" PARENT_SCOPE)
        return()
    endif()
    set("${result}" "Hello World" PARENT_SCOPE)
endfunction()
Note that you need to include the return() command, which returns control to the caller (otherwise execution would continue after the “if”-statement). These scenarios occur often enough that CMakePP includes the cpp_return function, which, if called like cpp_return(identifier), will set identifier in the parent scope to the value ${identifier} and return control to the caller.
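To illustrate the "set in the parent scope, then return" behavior just described, here is a hedged sketch of a stand-in helper. The name `my_return` and its implementation are assumptions made for this example; this is not CMakePP's actual implementation of cpp_return. It relies on the fact that a macro is expanded in place, so a return() inside it returns from the function that invoked the macro.

```cmake
# Hypothetical stand-in for the behavior described above (NOT cpp_return itself).
# Copies the current value of the named variable into the parent scope of the
# calling function, then returns from that function.
macro(my_return _mr_identifier)
    set("${_mr_identifier}" "${${_mr_identifier}}" PARENT_SCOPE)
    return()
endmacro()

function(say_hello _sh_result who_to_say_hi_to)
    if("${who_to_say_hi_to}" STREQUAL "Bill")
        # Set the value locally; my_return forwards it to the caller.
        set("${_sh_result}" "I guess I will say hello to Bill")
        my_return("${_sh_result}")
    endif()
    set("${_sh_result}" "Hello World")
    my_return("${_sh_result}")
endfunction()

say_hello(greeting "Bill")
message("${greeting}") # Will print "I guess I will say hello to Bill"
```

With a helper of this kind, each return point only has to set the value locally and make one call, rather than repeating PARENT_SCOPE and return() everywhere.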
Namespace Collisions
CMake does not have the equivalent of C/C++’s one-definition rule; for example, the following is perfectly valid CMake:
function(add_library)
    message("Look I overrode a native CMake command!!!")
endfunction()

add_library() # Calls our add_library
add_library(my_lib STATIC a_source_file.cpp) # Still our add_library call
Note in particular that the last line will run just fine because CMake allows you to pass in more arguments than just the positional ones (the extra arguments get passed as ${ARGN}). It is thus essential that we avoid such collisions, not just with CMake’s native commands, but also with commands from third party modules. The latter in particular is a tall order. The CMakePP solution is to use a per-module prefix. For example, in the CMakeTest repository we use the prefix ct_. While not foolproof, when combined with type checking, this strategy should avoid many collisions, and raise errors for most of the remaining such collisions.
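A minimal sketch of the prefixing convention follows. The module name and the `my_mod_` prefix are hypothetical, chosen only to show the idea; the function is not part of CMakePP or CMakeTest.

```cmake
# Hypothetical module "my_mod": every command it defines starts with "my_mod_",
# so it cannot shadow CMake's native add_library.
function(my_mod_add_library name)
    # Arguments beyond the positional ones arrive in ARGN as a list.
    message("my_mod_add_library: name='${name}' extras='${ARGN}'")
endfunction()

my_mod_add_library(my_lib STATIC a_source_file.cpp)
# Will print "my_mod_add_library: name='my_lib' extras='STATIC;a_source_file.cpp'"
```

The native add_library remains untouched, and a collision can now only occur with another module that happens to pick the same prefix.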
Variable Collisions
Even though functions create a new scope, you have to be careful when your functions take identifiers. Consider this somewhat contrived example:
# Sets the identifier to 3
function(return_3 result)
    set(${result} 3 PARENT_SCOPE)
endfunction()

# Sets the identifier to 32
function(return_32 input)
    set(result 2)
    return_3(${input})
    set(${input} "${${input}}${result}" PARENT_SCOPE)
endfunction()

return_32(var)
message("Contents: ${var}") # prints "32"
return_32(result)
message("Contents: ${result}") # prints "33"
We define two functions. The first one takes an identifier and sets its value to 3. The second one’s intent is to set an identifier to the value 32 in a somewhat convoluted manner by:
1. setting a temporary variable result to 2,
2. having return_3 set the provided identifier to 3, and
3. finally concatenating the identifier’s value with the temporary’s value.
As long as the caller of return_32 doesn’t pass in an identifier named result, return_32 works correctly. If the caller does decide to call their identifier result, then the call to return_3 in return_32 overwrites the temporary of the same name so that the result of return_32 is actually 33.
While this example may seem contrived, one of the dangers of having to pass an identifier into a function in order to get a return value is that if that identifier clashes with any temporaries, they get overwritten. Particularly for common, simple variable names (like result), the caller picking an identifier identical to your temporary happens more often than you might think (especially when the same programmer wrote all the functions on the stack). To mitigate this, it’s common to prefix function variable names with _ characters, the idea being that users writing CMake are unlikely to declare variable names that start with such characters. The problem is that in a framework like CMakePP we end up nesting so many developer functions that it’s actually quite likely for the underscore-prefixed names to clash too. Our solution is to mangle the name of the function into the variable as well. While not foolproof, it does help.
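As a sketch of how mangling can look in practice, the earlier example is rewritten below with each function’s variables carrying an abbreviation of the function name. The exact scheme (`_r3_`, `_r32_`) is an assumption made for illustration and is not necessarily the convention CMakePP uses.

```cmake
# Sets the identifier to 3
function(return_3 _r3_result)
    set("${_r3_result}" 3 PARENT_SCOPE)
endfunction()

# Sets the identifier to 32
function(return_32 _r32_input)
    set(_r32_temp 2) # Mangled temporary; no longer named "result"
    return_3("${_r32_input}")
    set("${_r32_input}" "${${_r32_input}}${_r32_temp}" PARENT_SCOPE)
endfunction()

return_32(result)
message("Contents: ${result}") # Will print "Contents: 32"
```

Passing an identifier named result now works as intended, because no variable inside either function shares that name. A caller would have to pass `_r32_temp` itself to trigger the original problem.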