How to Shrink C++ Template Bloat: Practical Strategies for Leaner Code
This article explains why C++ template instantiation can cause code bloat and provides concrete, actionable strategies—such as extracting non‑template parts to base classes, using helper abstractions, avoiding unnecessary template parameters, and measuring binary size—to significantly reduce compiled binary size while keeping code maintainable.
Preface
Background: C++ templates are a powerful tool for writing generic, reusable code.
Problem: Template instantiation often leads to code bloat because the compiler generates separate code for each template instance. Modern toolchains can deduplicate identical instantiations across translation units at link time, so extern templates and file-level separation of template and non-template code no longer give much benefit. What remains is to reduce the size of each instantiation.
The article provides concrete, practical optimization strategies for reducing the size of C++ template code in real‑world scenarios.
Strategy Overview
The main techniques include:
Extracting common parts from template functions.
Extracting common parts from template classes into a non‑template base class.
Using templates only when they truly add value.
Small tricks such as preferring composition over inheritance, avoiding large objects inside template functions, and measuring code bloat with tooling.
1. Extract Common Parts from Template Functions
If a part of a template function does not depend on the template parameters, move it to a regular (non‑template) function so that the code is generated only once.
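As a minimal, self-contained illustration of this idea, separate from the service example used in the rest of this section, the sketch below hoists a bounds check and its error-message formatting out of a function template. The names `checkIndex` and `checkedAt` are hypothetical and exist only for this sketch.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Non-template helper: the check and the string formatting do not depend
// on the element type, so this code is generated once.
// (In a real project it would typically live in a .cpp file.)
void checkIndex(std::size_t index, std::size_t size, const char* what) {
    if (index >= size) {
        throw std::out_of_range(std::string(what) + ": index " +
                                std::to_string(index) + " >= size " +
                                std::to_string(size));
    }
}

// Thin template wrapper: only the element access depends on T.
template <typename T>
const T& checkedAt(const std::vector<T>& items, std::size_t index) {
    checkIndex(index, items.size(), "checkedAt");  // type-independent part
    return items[index];                           // type-dependent part
}
```

Every instantiation of `checkedAt` now contains little more than a call and an element access, while the comparatively heavy string handling lives in one place.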
// Headers used by the examples in this section
#include <memory>
#include <mutex>
#include <string>
#include <typeinfo>
#include <unordered_map>

// All Service implementations must provide this interface
class BaseService {
public:
    virtual ~BaseService() = default;
    virtual void onServiceInit() = 0;
    // ... other members ...
    std::string contextName{};
};

// Central manager for all Service singletons
class ServiceCenter {
public:
    explicit ServiceCenter(const std::string& name) : _contextName(name) {}
    template<typename T>
    std::shared_ptr<T> getService(); // implementation shown later
private:
    std::unordered_map<std::string, std::shared_ptr<BaseService>> _serviceMap{};
    std::string _contextName;
    std::recursive_mutex _mutex;
};
1.1 Simple case – most logic is independent of the template argument
In the simple version of getService(), the majority of the code does not use T. We can extract the locking, map lookup, and error handling into a non‑template helper:
class ServiceCenter {
public:
    // Thin template wrapper: only the key and the final cast depend on T
    template<typename T>
    std::shared_ptr<T> getService() {
        auto service = getService(typeid(T).name());
        return std::dynamic_pointer_cast<T>(service);
    }
    // Non-template version used by the template wrapper: locking and lookup are generated once
    std::shared_ptr<BaseService> getService(const std::string& key) {
        std::lock_guard<std::recursive_mutex> lock(_mutex);
        auto it = _serviceMap.find(key);
        if (it == _serviceMap.end()) {
            return nullptr;
        }
        return it->second;
    }
    void setService(const std::string& key, const std::shared_ptr<BaseService>& service) {
        // store the service
        std::lock_guard<std::recursive_mutex> lock(_mutex);
        _serviceMap[key] = service;
    }
private:
    std::unordered_map<std::string, std::shared_ptr<BaseService>> _serviceMap{};
    std::string _contextName;
    std::recursive_mutex _mutex;
};
1.2 More complex case – logic depends on the template argument
When the function also creates a new instance, calls initialization, or performs error handling, we can abstract the type‑dependent parts behind a small helper interface:
class ServiceTypeHelperBase {
public:
    virtual ~ServiceTypeHelperBase() = default;
    virtual const char* getTypeName() const = 0;
    virtual std::shared_ptr<BaseService> newInstance() const = 0;
    virtual std::shared_ptr<void> castToOriginType(std::shared_ptr<BaseService> service) const = 0;
};

template<typename T>
class ServiceTypeHelper : public ServiceTypeHelperBase {
public:
    const char* getTypeName() const override { return typeid(T).name(); }
    std::shared_ptr<BaseService> newInstance() const override { return std::make_shared<T>(); }
    std::shared_ptr<void> castToOriginType(std::shared_ptr<BaseService> service) const override {
        return std::dynamic_pointer_cast<T>(service);
    }
};
Using the helper we can keep the core logic in a non‑template function:
class ServiceCenter {
public:
    std::shared_ptr<void> getService(const ServiceTypeHelperBase* helper) {
        std::lock_guard<std::recursive_mutex> lock(_mutex);
        auto key = helper->getTypeName();
        auto service = getService(key);
        if (!service) {
            auto tService = helper->newInstance();
            tService->contextName = _contextName;
            setService(key, tService);
            tService->onServiceInit();
            return helper->castToOriginType(tService);
        } else {
            auto tService = helper->castToOriginType(service);
            if (!tService) {
                aerror("ServiceCenter", "tService is null");
                return nullptr;
            }
            return tService;
        }
    }
    template<typename T>
    std::shared_ptr<T> getService() {
        ServiceTypeHelper<T> helper;
        auto service = getService(&helper);
        return std::static_pointer_cast<T>(service);
    }
    // ... non‑template getService(key) / setService(key, service) from earlier ...
};
2. Extract Common Parts from Template Classes to a Base Class
Note: The base class must be a non‑template class or a template class with fewer parameters than the derived class; otherwise you only move the code around without reducing instantiation size.
When a class template is instantiated with different arguments, the compiler generates the member functions it uses again for each instantiation, even when a function never touches the template parameters. Moving those parameter-independent members and functions into a base class means their code is generated only once.
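A minimal, self-contained sketch of this pattern follows. It is not the service code from this article: `RegistryBase` and `Registry` are hypothetical names, and values are stored as `std::shared_ptr<void>` purely to keep the base class free of template parameters.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Non-template base: all map bookkeeping is compiled once.
class RegistryBase {
public:
    std::size_t size() const { return _entries.size(); }
    bool contains(const std::string& key) const { return _entries.count(key) != 0; }

protected:
    void put(const std::string& key, std::shared_ptr<void> value) {
        _entries[key] = std::move(value);
    }
    std::shared_ptr<void> find(const std::string& key) const {
        auto it = _entries.find(key);
        if (it == _entries.end()) {
            return nullptr;
        }
        return it->second;
    }

private:
    std::unordered_map<std::string, std::shared_ptr<void>> _entries;
};

// Thin typed layer: each instantiation only adds two small forwarding functions.
template <typename T>
class Registry : public RegistryBase {
public:
    void add(const std::string& key, std::shared_ptr<T> value) {
        put(key, std::move(value));
    }
    std::shared_ptr<T> get(const std::string& key) const {
        // The stored pointer was created from a shared_ptr<T>, so casting back is safe.
        return std::static_pointer_cast<T>(find(key));
    }
};
```

With this split, `Registry<int>` and `Registry<std::string>` share one copy of the hash-map code instead of each instantiating its own.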
2.1 Move non‑template members and functions to a base class
class BaseServiceCenter {
public:
    void setService(const std::string& key, const std::shared_ptr<BaseService>& service);
    std::shared_ptr<BaseService> getService(const std::string& key);
protected:
    std::unordered_map<std::string, std::shared_ptr<BaseService>> _serviceMap{};
    std::string _contextName;
    std::recursive_mutex _mutex;
};

template<typename BaseService_t>
class ServiceCenter : public BaseServiceCenter {
public:
    template<typename T>
    std::shared_ptr<T> getService() {
        // same logic as before, now reusing the base members
        // (call the base overloads as BaseServiceCenter::getService(...),
        // because this member template hides them)
    }
private:
    std::string _businessName;
};
2.2 Move template‑function common parts to the base class
class BaseServiceCenter {
public:
    // Assumes ServiceTypeHelperBase::newInstance is extended here to take the business name
    std::shared_ptr<void> getService(const ServiceTypeHelperBase* helper, const std::string& businessName) {
        std::lock_guard<std::recursive_mutex> lock(_mutex);
        auto key = helper->getTypeName();
        auto service = getService(key);
        if (!service) {
            auto tService = helper->newInstance(businessName);
            tService->contextName = _contextName;
            setService(key, tService);
            tService->onServiceInit();
            return helper->castToOriginType(tService);
        } else {
            auto tService = helper->castToOriginType(service);
            if (!tService) {
                aerror("ServiceCenter", "tService is null");
                return nullptr;
            }
            return tService;
        }
    }
    // non‑template getService(key) / setService(key, service) and data members remain unchanged
};

template<typename BaseService_t>
class ServiceCenter : public BaseServiceCenter {
public:
    template<typename T>
    std::shared_ptr<T> getService() {
        static_assert(std::is_base_of<BaseService_t, T>::value, "Wrong Service Type");
        // BusinessServiceTypeHelper<T>: per-type helper whose definition is not shown here
        BusinessServiceTypeHelper<T> helper;
        // Qualify the call: this class's getService<T>() hides the base-class overloads
        auto service = BaseServiceCenter::getService(&helper, _businessName);
        return std::static_pointer_cast<T>(service);
    }
private:
    std::string _businessName;
};
2.3 Pull shared logic of multi‑parameter subclasses into a base with fewer parameters
When a class template has two parameters T and U, and the program uses n distinct types for T and m for U, up to n × m instantiations are generated, each with its own copy of the code. Moving the U-independent part into a base that depends only on T reduces the copies of that part to n.
#include <cxxabi.h>   // abi::__cxa_demangle (GCC/Clang)
#include <cstdlib>    // std::free
#include <string>
#include <typeinfo>

template<typename T>
class PairLeft {
public:
    std::string getTypeName() const {
        const char* typeName = typeid(T).name();
        int status = 0;
        char* demangled = abi::__cxa_demangle(typeName, nullptr, nullptr, &status);
        if (!demangled) return typeName;
        std::string result = demangled;
        std::free(demangled);  // the demangled buffer is malloc-allocated
        return result;
    }
};

template<typename T, typename U>
class Pair : private PairLeft<T> {
public:
    std::string getFirstTypeName() const { return PairLeft<T>::getTypeName(); }
    T first;
    U second;
};
3. Use Templates Wisely
Do not use templates just for the sake of using them. The following example shows a library that provides an IoC container with many template parameters, leading to massive code bloat.
3.1 Redundant template parameters
The base class RegistrationDescriptorBase<TDescriptor, TDescriptorInfo> never uses TDescriptor or TDescriptorInfo, yet each combination creates a separate type.
3.2 Bloated template combinations
Each chained call (e.g., as<...>().singleInstance().onActivated(...)) records its information in a new template instantiation, causing the binary size to grow linearly with the number of calls.
Instead of encoding all configuration in types, a plain POD struct can hold the data and be applied in a single step, while compile‑time checks (e.g., static_assert) can still enforce correctness.
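The following is a minimal sketch of that alternative, not the API of the library discussed above. `RegistrationOptions`, `Container`, `registerType`, and `resolve` are all hypothetical names chosen for illustration. Configuration lives in a plain struct, a `static_assert` keeps the compile-time check, and the only per-type code is a small factory lambda.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <type_traits>
#include <typeinfo>
#include <unordered_map>
#include <utility>

// Plain runtime data instead of one template type per chained call.
struct RegistrationOptions {
    bool singleInstance = false;
    std::function<void()> onActivated;  // optional callback, may be empty
};

class Container {
    using Factory = std::function<std::shared_ptr<void>()>;

public:
    template <typename TInterface, typename TImpl>
    void registerType(RegistrationOptions options = {}) {
        static_assert(std::is_base_of<TInterface, TImpl>::value,
                      "TImpl must derive from TInterface");
        // Only this small factory is instantiated per (TInterface, TImpl) pair.
        Factory factory = []() -> std::shared_ptr<void> {
            std::shared_ptr<TInterface> instance = std::make_shared<TImpl>();
            return instance;
        };
        registerImpl(typeid(TInterface).name(), std::move(factory), std::move(options));
    }

    template <typename TInterface>
    std::shared_ptr<TInterface> resolve() {
        return std::static_pointer_cast<TInterface>(resolveImpl(typeid(TInterface).name()));
    }

private:
    struct Entry {
        Factory factory;
        RegistrationOptions options;
        std::shared_ptr<void> cached;
    };

    // Non-template core: all option handling is compiled once.
    void registerImpl(const std::string& key, Factory factory, RegistrationOptions options) {
        _entries[key] = Entry{std::move(factory), std::move(options), nullptr};
    }

    std::shared_ptr<void> resolveImpl(const std::string& key) {
        auto it = _entries.find(key);
        if (it == _entries.end()) {
            return nullptr;
        }
        Entry& entry = it->second;
        if (entry.options.singleInstance && entry.cached) {
            return entry.cached;
        }
        std::shared_ptr<void> instance = entry.factory();
        if (entry.options.onActivated) {
            entry.options.onActivated();
        }
        if (entry.options.singleInstance) {
            entry.cached = instance;
        }
        return instance;
    }

    std::unordered_map<std::string, Entry> _entries;
};

// Hypothetical service types used only to exercise the sketch.
struct IGreeter {
    virtual ~IGreeter() = default;
    virtual std::string greet() const = 0;
};
struct EnglishGreeter : IGreeter {
    std::string greet() const override { return "hello"; }
};
```

A caller fills in a `RegistrationOptions` value and passes it to `registerType<IGreeter, EnglishGreeter>(options)`. Adding another option then means adding a struct field rather than another layer of template types.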
4. Small Tricks
4.1 Prefer composition over inheritance
Composition reduces the number of template instantiations because each concrete type is stored as a runtime object rather than a compile‑time parameter.
// Before: one instantiation per (Shape, Color) pair
template<typename Shape, typename Color>
class GraphicObject {
    // heavy template instantiation per (Shape, Color) pair
};

// After: a single non-template class that holds its parts as runtime objects
class ShapeBase { /* ... */ };
class ColorBase { /* ... */ };
class GraphicObject {
public:
    GraphicObject(std::shared_ptr<ShapeBase> shape, std::shared_ptr<ColorBase> color)
        : shape_(shape), color_(color) {}
private:
    std::shared_ptr<ShapeBase> shape_;
    std::shared_ptr<ColorBase> color_;
};
4.2 Avoid large objects inside template functions
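A large object defined inside a template function is repeated in every instantiation, together with the code that constructs and destroys it, so each instantiation costs extra memory. Pass such objects by pointer or reference, or move them into code outside the template.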
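One way to apply this, sketched under the assumption that the large object is a scratch buffer used for formatting: the buffer and the code that works on it move into a non-template function, and the template passes in only what depends on T. `formatRecord` and `describe` are hypothetical names for this sketch.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>

// Non-template helper: the 4 KiB scratch buffer and the formatting code
// exist once, no matter how many types instantiate describe<T>().
std::string formatRecord(const char* typeName, std::size_t byteSize) {
    char buffer[4096];
    int written = std::snprintf(buffer, sizeof(buffer), "type=%s size=%zu",
                                typeName, byteSize);
    if (written < 0) {
        return std::string();
    }
    std::size_t length = static_cast<std::size_t>(written);
    if (length >= sizeof(buffer)) {
        length = sizeof(buffer) - 1;  // output was truncated by snprintf
    }
    return std::string(buffer, length);
}

// Thin template: only sizeof(T) depends on the template argument.
template <typename T>
std::string describe(const char* typeName) {
    return formatRecord(typeName, sizeof(T));
}
```

Each instantiation of `describe` reduces to a single call with a constant argument.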
4.3 Measure code bloat with tooling
On Linux/Android, list symbols sorted by size:
nm --print-size --size-sort xxx_binary
On macOS/iOS, generate a LinkMap file and inspect symbol sizes.
Remember to measure the final linked binary, not individual object files, because identical template functions from different translation units are deduplicated at link time.
Optimization Results
Applying the strategies to the WeChat client reduced a 14 MiB service library (24 services) to 11 MiB, a 22 % size reduction.
Conclusion
Reducing C++ template code size is essential for faster compilation and smaller binaries. The strategies described—extracting non‑template parts, limiting template parameters, preferring composition, and measuring bloat—help achieve leaner, more maintainable code without sacrificing functionality.
WeChat Client Technology Team
Official account of the WeChat mobile client development team, sharing development experience, cutting‑edge tech, and little‑known stories across Android, iOS, macOS, Windows Phone, and Windows.
