Separate Compilation of C++ Templates

So you’ve mastered the art of compartmentalizing your C/C++ code into well-organized compilation units, you’ve kept your headers and implementations cleanly separated and modularized, and your build system is sleek and happy. One day, you find a need to make heavy use of C++ templates, and suddenly things are not as clean anymore. It wasn’t long into my foray into template programming before I found myself asking: Why can’t I put my method bodies in a cpp file?

By their nature, C++ templates are designed to work best when every client sees every bit of the template, including procedure bodies, with the idea being that the compiler decides which instantiations need to happen. This is how the STL templates are used. But this approach can stress the compiler and slow down build times in certain situations, so it is sometimes beneficial to organize template declarations and definitions in a way that allows separate compilation.
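As a small illustration of the default model described above, here is a minimal sketch of a header-style template whose body is visible to every client. The names `twice`, `use_with_int`, and `use_with_string` are made up for this example and do not come from TSL.

```cpp
#include <cassert>
#include <string>

// Header-style template: the full body is visible to every client,
// as it would be if this lived in a shared .hpp file.
template <typename T>
T twice(const T& value) {
    return value + value;
}

// Each use with a new type makes the compiler instantiate the body
// implicitly, in whichever translation unit the use appears.
int use_with_int() { return twice(21); }  // instantiates twice<int>

std::string use_with_string() {
    return twice(std::string("ab"));      // instantiates twice<std::string>
}

// Usage check for the two implicit instantiations above.
void demo_implicit() {
    assert(use_with_int() == 42);
    assert(use_with_string() == "abab");
}
```

With a tiny body like this the cost is negligible; the trouble starts when the bodies are large and many compilation units see them.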

The situation in which this arises at GrammaTech is in our TSL system, for analysis of machine code. For each architecture (e.g., IA32, PPC) we generate a templated class containing methods that describe that architecture’s semantics. These templates are instantiated with analysis specifications (e.g., use-def analysis, affine-relations analysis) to yield implementations of use-def analysis for IA32, use-def analysis for PPC, affine-relations analysis for IA32, etc. As you can imagine, the templated classes are very big, and contain methods that are very big. Furthermore, analyses need to be able to specialize some of the semantics class methods via template specialization. The nature of these templates led to two big problems:

(1) When the large class method definitions were inlined in a header file that everyone sees, compile times were very slow (we managed now and then to trigger an internal compiler out-of-resources error).

(2) With template specialization being a poorly understood art form, we often made the mistake of not putting specialization declarations in the right place, a problem exacerbated by the fact that compilers do not complain when you get this wrong, and that different compilers behave differently when it happens, sometimes in bizarre and unexpected ways.

To streamline this code, we came up with a template (if you will) for how to organize such templates into a collection of header files. This is illustrated in Figure 1, with arrows indicating #include edges. Note that we have a rule that forbids #include’ing of cpp files: if something needs to be included by another file, even if it looks like a cpp file, give it an hpp extension. This rule also means that only cpp files are compilation units.

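Figure 1. Header-file layout, files (A) through (F), with arrows indicating #include edges (figure not reproduced here).

Notes from the figure:

(B): An inlined func1 cannot call any specialized methods!

(F): May need to include (B), (E), and (C) in that order: (B) for the Foo definition, (E) for the specialization, (C) for code that may use the specialized function.

Example in DASH/Cash: T is a trait class, Foo has typedef Bar BarT, but BarT has specializations that func1 calls!

Alternative naming: (A)/(D) as _fwd.hpp, (B)/(E) as _decl.hpp.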

This layout is intended for when method bodies (C) are big (that is, we want to minimize the number of compilation units that see them) and when there are relatively few instantiations (F). With this layout, the big method bodies in (C) are only compiled once per instantiation (F). This layout would not work very well for STL-style templates, for example, where the number of instantiations tends to be arbitrarily large, as it would require the creation of an instantiator.cpp file for each instantiation (e.g., one for set<int>, one for set<unsigned int>, one for set<foo>, … for however many ways a given project uses set).
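To make the layout more concrete, here is a minimal single-file sketch in which comments mark where each piece would live. The figure itself is not reproduced here, so the roles given to (A), (B), (C), and (F) are inferred from the surrounding text, and the file names, the method bodies, and the `UseDef` stand-in type are assumptions made for illustration only.

```cpp
#include <cassert>

// (A) foo_fwd.hpp (assumed name): forward declaration only.
template <typename T> class Foo;

// (B) foo_decl.hpp (assumed name): class definition with method
// declarations but none of the big bodies. Clients include this.
template <typename T>
class Foo {
public:
    int func1(int x) const;
    int func2(int x) const;
};

// (C) foo_impl.hpp (assumed name): the large method bodies.
// Only the instantiator (F) includes this file.
template <typename T>
int Foo<T>::func1(int x) const { return x + 1; }

template <typename T>
int Foo<T>::func2(int x) const { return 2 * func1(x); }

// Hypothetical stand-in for an analysis specification.
struct UseDef {};

// (F) instantiator.cpp: includes (B) and (C), then asks the compiler
// to generate every member of Foo<UseDef> in this one compilation unit.
template class Foo<UseDef>;

// A client .cpp would include only (B) and rely on the linker to find
// the code generated by (F).
int client_call() {
    Foo<UseDef> foo;
    return foo.func2(3);
}

// Usage check: func2(3) = 2 * func1(3) = 2 * 4.
void demo_layout() {
    assert(client_call() == 8);
}
```

In a real build the pieces would be separate files, so a client that never includes (C) compiles quickly and only the instantiator pays for the big bodies.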

Note that one could have multiple copies of (C) and (F) if necessary. For example, in GrammaTech’s usage, Foo might be IA32Semantics, which contains many large method bodies, thus the compilation of each instantiator (F) is still quite slow. We could, instead, split (C) into multiple files (e.g., one per method), and correspondingly split (F) into multiple files as well. Splitting (F) might become a little onerous – as the template methods must be instantiated individually – but it is a tradeoff one could make if necessary to improve compilation times.
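Splitting (F) by method relies on explicitly instantiating individual member functions rather than the whole class. The following is a rough sketch of what that might look like; the file names in the comments, the method bodies, and the `UseDef` type are hypothetical, and everything is shown in one file only so that it compiles on its own.

```cpp
#include <cassert>

struct UseDef {};  // hypothetical analysis specification

// (B): class definition, declarations only.
template <typename T>
struct Foo {
    int func1(int x) const;
    int func2(int x) const;
};

// (C), first piece, e.g. foo_func1_impl.hpp (assumed name).
template <typename T>
int Foo<T>::func1(int x) const { return x & 0xFF; }

// (C), second piece, e.g. foo_func2_impl.hpp (assumed name).
template <typename T>
int Foo<T>::func2(int x) const { return x * 2; }

// (F), first piece, e.g. instantiate_func1.cpp: would include (B) and
// only the func1 piece of (C), then instantiate just that one method.
template int Foo<UseDef>::func1(int) const;

// (F), second piece, e.g. instantiate_func2.cpp: same idea for func2.
template int Foo<UseDef>::func2(int) const;

// Usage check for both individually instantiated methods.
void demo_split() {
    assert(Foo<UseDef>().func1(0x1234) == 0x34);
    assert(Foo<UseDef>().func2(21) == 42);
}
```

The onerous part is visible even here: every method needs its own instantiation line with the full signature, and that line must be kept in sync with the declaration in (B).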

This organization includes support for method specialization (E). Use of this piece comes with some caveats. For example, if the body of func2 in (C) calls func1, it may be necessary that (F) includes (E) before including (C) – i.e., when compiling func2, the specialization of func1 in (E) must have been declared. Note, then, that this means the inline methods in (B) may not call any specialized methods – and the template designer ((A), (B), (C)) does not know beforehand which methods might be specialized. To address this, one could refine the picture by splitting (B) into two files; or for simplicity one could simply forbid inline methods.
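The ordering caveat can be sketched as follows, again in a single file with comments standing in for the separate headers. The `UseDef` type, the method bodies, and the choice to put the specialization's definition in the instantiator are assumptions for the sake of the example, not a description of how TSL does it.

```cpp
#include <cassert>

struct UseDef {};  // hypothetical analysis specification

// (B): class definition, declarations only (no inline methods).
template <typename T>
struct Foo {
    int func1(int x) const;
    int func2(int x) const;
};

// (E): declares that func1 is specialized for UseDef. This must be seen
// before any code that could trigger instantiation of Foo<UseDef>::func1.
template <>
int Foo<UseDef>::func1(int x) const;

// (C): generic bodies. func2 calls func1, so for T = UseDef the call
// must resolve to the specialization declared in (E).
template <typename T>
int Foo<T>::func1(int x) const { return x + 1; }

template <typename T>
int Foo<T>::func2(int x) const { return 2 * func1(x); }

// (F): definition of the specialization (placement assumed), followed
// by the explicit instantiations. Members that were explicitly
// specialized are not regenerated from the generic body.
template <>
int Foo<UseDef>::func1(int x) const { return x + 100; }

template struct Foo<UseDef>;
template struct Foo<int>;

// Usage check: UseDef goes through the specialized func1, int does not.
void demo_specialization() {
    assert(Foo<UseDef>().func2(3) == 206);  // 2 * (3 + 100)
    assert(Foo<int>().func2(3) == 8);       // 2 * (3 + 1)
}
```

If (E) were moved below the point where Foo<UseDef>::func2 gets instantiated, the program would be ill-formed, and a compiler is not required to diagnose it, which matches the silent misbehavior described earlier.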

Unfortunately, proper use of this “template” cannot be easily enforced or validated with a tool. It relies on discipline, good understanding, and infallibility, attributes which even the best of programmers must humbly admit cannot be guaranteed. Nonetheless, I hope that others find this template useful. Meanwhile, I await the arrival of stable implementations of C++11’s extern templates, though I don’t think they obviate the need for this organization.
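For readers who have not met extern templates, here is a minimal sketch of the C++11 syntax. The class and method are placeholders, and the two halves would normally sit in different compilation units; they share one file here only so the example is self-contained.

```cpp
#include <cassert>

template <typename T>
struct Foo {
    int func1(int x) const;
};

template <typename T>
int Foo<T>::func1(int x) const { return x + 1; }

// Client side: an explicit instantiation declaration. It tells the
// compiler not to instantiate the non-inline members of Foo<int> here,
// because some other compilation unit will provide them.
extern template struct Foo<int>;

int client_call() { return Foo<int>().func1(41); }

// Provider side (exactly one compilation unit in a real build):
// the explicit instantiation definition that generates the code.
template struct Foo<int>;

// Usage check.
void demo_extern_template() {
    assert(client_call() == 42);
}
```

The method bodies still have to be visible wherever the explicit instantiation definition appears, so some way of deciding which files see the bodies is still needed.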
