ClassRegistry
So let’s get to the questions first:
Is this a "correct" method of implementing a singleton in C++?
Yes, the singleton itself is fine, but the operations on that singleton are dangerous due to the lack of synchronization.
Is a singleton a sensible approach for storing the class constructors and methods?
It’s not a bad idea, but I don’t really see the need.
When a singleton is necessary, it’s usually because you need some proper setup and/or cleanup for a global operation or set of operations. But here, you don’t really. Once the maps are constructed, you’re ready to go. And there is no cleanup necessary outside of the map’s own destruction.
Put another way, if you throw out the singleton, the only difference between what you’d get and what you have now is you would no longer need to call GetInstance()
. You would just use MethodMap
and ConstructorMap
directly:
// Here we are just using a class as a "closed namespace". // // To convert from your code to this, I literally just went through it // deleting all the `GetInstance().`, then made the two data members // static and inline. class ClassRegistry { public: static void RegisterClass(const std::string& className, ConstructorFunc constructor, std::unordered_map<std::string, MethodFunc> classMethods) { ConstructorMap[className] = constructor; MethodMap[className] = classMethods; } static std::unique_ptr<Base> Create(const std::string& className) { if (ConstructorMap.contains(className)) { return ConstructorMap[className](); } throw std::runtime_error("Class not registered"); } static std::unordered_map<std::string, MethodFunc>& GetMethods(const std::string& className) { if (MethodMap.contains(className)) { return MethodMap[className]; } throw std::runtime_error("Class not registered"); } private: static inline std::unordered_map<std::string, ConstructorFunc> ConstructorMap; static inline std::unordered_map<std::string, std::unordered_map<std::string, MethodFunc>> MethodMap; };
Code review
I’m not sure that having two separate maps makes a lot of sense here. That means a lot of duplicated strings throughout memory and double the amount of map nodes and supporting infrastructure, like hash buckets.
You probably need a proper class info type
This seems like something that should really have a dedicated type. It just makes more sense to me that instead of having to call ClassRegistry::RegisterClass(class_name, class_constructor, class_methods)
, you should just be able to call ClassRegistry::RegisterClass(class_info)
, because those three things—the name, the constructor, and the methods—are not independent bits of data. They are all parts of one “thing”. So there should only be one “thing”: a single type to describe a class.
Once you do have that “class info” type, then having only a single map becomes the natural solution.
You are breaking encapsulation
There is another thing that bugs me about RegisterClass()
, and GetMethods()
as well, and that is that it exposes the fact that you are using a std::unordered_map
for the methods list. That should be an internal detail. In fact, because GetMethods()
returns a reference, you are pretty much requiring the internal structure of your registry. But there are a lot of reasons why you might not want to use a std::unordered_map
. If, after profiling, you notice that the number of methods is generally very small, you may choose instead to use a std::flat_map
, which can be many, many times faster for small data sets. Or maybe something else. The whole point of encapsulation is to give you the freedom to change the internal details. Exposing them defeats the purpose.
Again, this could be solved by a bespoke “class info” type that only offers a map-like interface for the methods. Inside the class, it could be holding a std::unordered_map
, or a std::flat_map
, or anything.
Maps are not the answer
On the topic of maps, I don’t really think maps are what you want to be using in any case: not for the class constructors, and not for the methods, either.
You see, maps are for mapping otherwise unrelated data together. But the method name and the method itself are most certainly not unrelated bits of data. Neither is the class name and the constructor: the constructor is literally the name. The class name will never map to any other class constructor, and a method name will never not map to that method. These bits of data are not separate, and should not be independent.
Mapping string names to things makes sense when the names and the things are independent. For example, if you are describing a film and mapping a character name to an actor, that makes sense as a map. The key and the value are two completely different things… that just happen to be mapped together in this context. You could swap out the actor, or change their role; the two things are independent of each other. That is what mapped data looks like.
Instead of maps connecting names to things, what you should probably have are just “things” with the names included as part of the thing, and they should be in sets. (Or, frankly, just sorted vectors… which is basically what flat sets are.)
Let me illustrate what I mean in code, using methods. Right now you describe a class’s methods like this:
std::unordered_map<std::string, MethodFunc> methods; // Finding a method's signature (this is wrong, but we'll get to that shortly): if (methods.contains("name")) return methods["name"];
The method name and the method call signature are separate bits of data.
Except… they’re not. A method is a single thing that has both a name and a call signature:
struct Method { std::string name; MethodFunc signature; };
And if you want to collect a bunch of them, you could put them in a set (or hashed set or flat set or… whatevs; I’m going to use a hashed set to illustrate). This makes sense: a class HAS-A name and a class HAS-A set of methods (technically a “multi-set” if you allow for overloads, but let’s keep it simple… in any case, a class does not have a “map of names to call signatures”).
// If no overload support, need only hash and compare by the name. struct MethodHash { using is_transparent = void; // to support transparent hashing constexpr auto operator()(Method const& m) const noexcept { return std::hash<std::string_view>{}(m.name); } // Transparently hash method names (strings) as methods. constexpr auto operator()(std::string_view name) const noexcept { return std::hash<std::string_view>{}(name); } }; struct MethodEqual { using is_transparent = void; // to support transparent equality comparisons constexpr auto operator()(Method const& a, Method const& b) const noexcept { return a.name == b.name; } // Transparently compare method names (strings) with methods. constexpr auto operator()(Method const& a, std::string_view b) const noexcept { return a.name == b; } constexpr auto operator()(std::string_view a, Method const& b) const noexcept { return a == b.name; } }; std::unordered_set<Method, MethodHash, MethodEqual> methods; // Finding a method's signature: if (auto const m = methods.find("name"); m != methods.end()) return m->signature;
Consider synchronization
Right now the registry is completely unsynchronized. It would be very dangerous to use in a multi-threaded application. And this is 2024. “Multi-threaded application” is the only kind worth considering these days.
Synchronization is trivial to add, and it needn’t affect performance all that much. Registration is a complete non-issue, since classes are only registered once for the whole program; pretty safe to assume classes are not being registered in a hot loop, and even if it might ever be an issue, you could always pre-register instead of doing the registration at a critical point.
There might be a concern about construction, since every time you call ClassRegistry::Create()
, it will have to be synchronized. But that seems moot, since every time you call ClassRegistry::Create()
, you’re doing a map lookup and dynamic allocation anyway. Clearly this is not a high-performance function.
But once you have the class or method data, so long as the address is stable and the data itself does not change (and it shouldn’t, logically), you can keep a pointer to it and do method lookups without synchronization.
Double lookups
This is an issue throughout your code. Whenever you look something up, you do something like this:
if (map.contains(thing)) return map[thing]; throw std::runtime_error{...};
The problem with that is that you are doing the lookup twice. First, with .contains()
, you do the lookup, traverse the whole map, find the thing you want, and then just… throw all that away and return true
. Then with operator[]
, you again do the lookup, traverse the whole map, find the thing you want, and then return it.
I think what you want is something more like this:
if (auto const p = map.find(thing); p != map.end()) return p->second; throw std::runtime_error{...};
That’s one dance through the map. If you find the thing, you hold onto it (via an iterator), and then return it.
Rough example
So I threw out a lot of suggests in very short order. Let me put them all together, albeit very roughly, to show how they work:
// Probably want to make MethodInfo and ClassInfo movable, but not copyable. class MethodInfo { std::string _name; MethodFunc _signature; public: auto name() const noexcept -> std::string const& { return _name; } auto invoke(MethodArgs const& args) const { return _signature(args); } }; // MethodHash and MethodEqual defined basically as above class ClassInfo { std::string _name; ConstructorFunc _constructor; std::unordered_set<MethodInfo, MethodHash, MethodEqual> _methods; public: auto name() const noexcept -> std::string const& { return _name; } auto construct() const { return _constructor(); } auto invoke(std::string_view name, MethodArgs const& args) const { if (auto const p = _methods.find(name); p != _methods.end()) p->invoke(args); throw std::runtime_error{"method not found"}; } }; // ClassHash and ClassEqual defined similarly to MethodHash and MethodEqual class ClassRegistry { static inline std::mutex _mutex; static inline std::unordered_set<ClassInfo, ClassHash, ClassEqual> _classes; public: static auto register(ClassInfo ci) -> ClassInfo const& { auto lock = std::scoped_lock{_mutex}; if (auto const [i, inserted] = _classes.emplace(std::move(ci)); inserted) return *i throw std::runtime_error{"already registered"}; } static auto get(std::string_view name) -> ClassInfo const& { auto lock = std::scoped_lock{_mutex}; if (auto const p = _classes.find(name); p != _classes.end()) return *p; throw std::runtime_error{"not found"}; } // Better to split up the get and create functions, so the get can // release the mutex before doing the create. // // You should *never* hold a mutex while calling into unknown code. // And the constructor function is unknown code. It may try to // register another class, and if it does, and you're still holding // the mutex... deadlock. static auto Create(std::string_view name) { return get(name).construct(); } };
With usage basically exactly the same as your current usage:
auto myClass = ClassRegistry::Create("MyClass"); myClass->Invoke("Hello", {}); myClass->Invoke("Add", { 5 });
Base
Once again, we’ll start with the questions first.
Is there a better method of retrieving the correct methods map other than passing the class name string up the constructor hierarchy?
Honestly, using the string is kind of a brittle system. And it may not be necessary.
As it is, you are kinda relying on classes to remember to not only pass their correct name to the base class (which could be a code maintenance issue if the class name changes), but also to correctly pass through any derived class names. There’s just so many ways this could break.
Let’s try rethinking this. What if, instead of using the class name as a lookup key into the registry, we used RTTI and std::type_info
or std::type_index
. (Or we could eat our cake and have it too, by making lookups with std::type_info
or std::type_index
transparent, as I showed how to do above for strings when looking up class info structs.)
Then suppose we don’t try to initialize the class info or the methods map in the constructor… which is a dodgy prospect in any case, because the constructor of the base class would be called before we know what the derived class is. Instead, we initialize it on the first call to Invoke()
. The extra cost is a single pointer check on each call to Invoke()
… which is moot because we’re about to do a lookup… and an extra registry lock and lookup on the first Invoke()
call.
Like so:
auto Base::Invoke(std::string_view method, const MethodArgs& args = {}) -> VariantValue { if (_methods == nullptr) { // Lookup by type info, not an arbitrary string name: _methods = &ClassRegistry::GetMethods(typeid(*this)); } // Continue as before... if (auto const p = _methods->find(method); p != _methods.end()) return (*_methods)[method](this, args); throw std::runtime_error("Method not found"); }
Is there a way to ensure all derivatives of Base
implement a GetMethods()
static function?
Yes… sort of.
The short answer is concepts. (Or static asserts with type traits, which is basically the same idea.)
The longer answer is that, no, you can’t ensure that all types derived from Base
have a static GetMethods()
function. But I would suggest that isn’t an issue, because you are trying to solve the wrong problem. You don’t really care whether every type derived from Base
has a GetMethods()
function. What you care about is whether every type that gets registered in the class registry has a GetMethods()
function.
That is an easy problem to solve. Simply do a concept check in your registration function.
Or, looking at from another perspective, what you care about is that any type derived from Base
for which anyone calls Invoke()
has a GetMethods()
function. Assuming you do a concept check when registering, you know every registered class satisfies this… so the problem is already solved. When ClassRegistry::GetMethods()
is called, it will throw an error if the class isn’t registered… so if it succeeds, that means the class is registered, and that means it must have a GetMethods()
static function. QED.
Code review
Polymorphic copying
Base
is intended to be a polymorphic base, and you need to be very careful about making polymorphic classes copyable.
You may want a virtual clone()
function, or with C++23’s explicit object arguments, it won’t need to be virtual, and won’t need to be implemented in every class.
There is another potential option in the future, called std::polymorphic
. But that seems still a ways off.
This class should be abstract
Base
is an abstract concept, not a concrete type. It should not be possible to create Base
objects.
So, for starters, the constructor—should it even need to exist—should be protected
.
There are no abstract functions so nothing to make pure virtual, but you could make the destructor pure virtual:
class Base { public: virtual ~Base() = 0; // ... [snip] ... }; inline Base::~Base() = default; // need this
You could also make it protected
if you wanted to.
While on the topic, you absolutely do not want Invoke()
to be overriden, so you should probably mark it final
.
CreateMethod
et al
Again, questions first:
Are there any ways to reduce code duplication here? There are four specialisations of CreateMethod
for dealing with void return types and const functions.
Absolutely.
Let’s tackle the easy one first: handling void
returns. Alls you needs is an if constexpr
with a invoke_result
check:
template<typename T, typename... Params> constexpr auto CreateMethodHelper(auto fn) { return [fn](Base* obj, MethodArgs const& args) -> VariantValue { if (args.size() != sizeof...(Params)) throw std::runtime_error{"Invalid number of arguments"}; if constexpr (std::is_void_v<std::invoke_result_t<decltype(CallMethodWithArgs<T, Params...>), Base*, decltype(fn), MethodArgs const&>>) { CallMethodWithArgs<T, Params...>(obj, fn, args); return {}; } else { return CallMethodWithArgs<T, Params...>(obj, fn, args); } }; }
That’s it!
You can probably also remove the duplicate for const
- and non-const
-qualified member functions. But there’s actually a bigger problem there.
So, after removing the void
overloads, we now have two overloads of CreateMethod()
. One takes const
member functions, the other takes non-const
member functions. But then both of them go on to call the same CreateMethodHelper()
function, which returns a lambda that works only with a non-const
Base*
. So, in fact, your const
overload doesn’t even work.
Let’s use those two overloads to specify the const
-ness of the derived type pointer, and use that turn const
on and off everywhere we need it.
While we’re at it, let’s get another easy win, and get rid of one of the helper functions. Also while we’re at it, let’s make the create functions consteval
and noexcept
, because they can only be called at compile time, and cannot fail. And let’s add some constraints:
template <std::derived_from<Base> T, typename... Args, std::size_t... Is> constexpr auto CallMethodWithArgsHelper(std::conditional_t<std::is_const_v<T>, Base const*, Base*> obj, auto f, MethodArgs const& args, std::index_sequence<Is...>) requires std::invocable<decltype(f), T*, Args...> and (sizeof...(Args) == sizeof...(Is)) { return std::invoke(f, *static_cast<T*>(obj), std::get<Args>(args[Is])...); } template<std::derived_from<Base> T, typename... Args> consteval auto CreateMethodHelper(auto f) noexcept requires std::invocable<decltype(f), T*, Args...> { return [f](std::conditional_t<std::is_const_v<T>, Base const*, Base*> obj, MethodArgs const& args) -> VariantValue { if (args.size() != sizeof...(Args)) throw std::runtime_error{"Invalid number of arguments"}; if constexpr (std::is_void_v<std::invoke_result_t<decltype(f), T*, Args...>>) { CallMethodWithArgsHelper<T, Args...>(obj, f, args, std::index_sequence_for<Args...>{}); return {}; } else { return CallMethodWithArgsHelper<T, Args...>(obj, f, args, std::index_sequence_for<Args...>{}); } }; } template<std::derived_from<Base> T, typename... Args> consteval auto CreateMethod(auto(T::*f)(Args...)) noexcept { return CreateMethodHelper<T, Args...>(f); } template<std::derived_from<Base> T, typename... Args> consteval auto CreateMethod(auto(T::*f)(Args...) const) noexcept { return CreateMethodHelper<T const, Args...>(f); }
Can that be simplified further? Possibly. But I’d be satisfied with that.
The macro
Is my use of a macro here warranted? I'm usually very against using them, but in this case I couldn't see an easier method of registering multiple classes.
I don’t think so.
First, let’s expand the macro, using “$” for the macro parameter, and “%” for the stringified macro parameter:
static bool registered_$ = []() { ClassRegistry::RegisterClass( %, []() -> std::unique_ptr<Base> { return std::make_unique<$>(%); }, $::GetMethods() ); return true; }();
Now I’m not sure why this is assigned to a boolean. I don’t see any evidence of it being used in the example code. I’m going to assume it’s not necessary.
So what we need to generate for a type is:
ClassRegistry::RegisterClass( %, []() -> std::unique_ptr<Base> { return std::make_unique<$>(%); }, $::GetMethods() );
Let’s put that in a template function with the type to be registered as a template parameter, and replace the placeholders:
template<typename T> auto register_class() { ClassRegistry::RegisterClass( %, []() -> std::unique_ptr<Base> { return std::make_unique<T>(%); }, T::GetMethods() ); }
So the only thing we still need is a way to stringify the type name. Luckily, that is a solved problem.
So we have:
template<typename T> auto register_class() { ClassRegistry::RegisterClass( type_name<T>, []() -> std::unique_ptr<Base> { return std::make_unique<T>(type_name<T>); }, T::GetMethods() ); }
We should add some constraints. T
must be derived from Base
, and must have a static GetMethods()
function:
template<std::derived_from<Base> T> requires requires { { T::GetMethods() } -> std::same_as<std::unordered_map<std::string, MethodFunc>>; } auto register_class() { ClassRegistry::RegisterClass( type_name<T>, []() -> std::unique_ptr<Base> { return std::make_unique<T>(type_name<T>); }, T::GetMethods() ); }
You could tweak the ad-hoc constraint, or, better yet, define a proper concept:
template<typename T> concept scriptable = std::derived_from<T, Base> and requires { { T::GetMethods() } -> std::same_as<std::unordered_map<std::string, MethodFunc>>; } // any other requirements ; template<scriptable T> auto register_class() { ClassRegistry::RegisterClass( type_name<T>, []() -> std::unique_ptr<Base> { return std::make_unique<T>(type_name<T>); }, T::GetMethods() ); }
The syntax to use it slightly different:
//REGISTER_CLASS(MyClass) register_class<MyClass>();
But if you must keep the current syntax, a macro that’s just #define REGISTER_CLASS(x) register_class<x>();
is tolerable.
Final question
Also, one final question I'd like to ask is whether or not there is a much easier method to do this sort of thing? (Or one that has already been implemented elsewhere so we aren't re-inventing the wheel).
I can’t think of any substantially easier methods as of C++20. But things may change radically in just three or four months. It is very likely reflection will be more-or-less standardized by that point… and reflection would make everything you’re doing absurdly easy. Like, comically easy.
Sutter just recently re-imagined his metaclass idea in terms of P2996 reflection. In plain C++-ese, that would mean your example class could look like this:
class(scriptable) MyClass { public: virtual void Hello() const { std::cout << "Hello from MyClass!\n"; } void Add(int a) { std::cout << "Add " << a << " to Value\n"; Value += a; } bool Compare(bool a, bool b) const { std::cout << "Compare " << a << " and " << b << ": " << (a && b) << '\n'; return a && b; } int GetValue() const { return Value; } private: int Value; };
That’s it. That’s everything. I left nothing out. With just that, that class would be registered, have all its public methods mapped, and you can immediately start using Create()
and Invoke()
right after the closing semicolon.
It would also probably be a progressive improvement. Meaning, you won’t need to throw out anything you’ve already written, but new stuff would be easier. So you don’t need to wait for the good stuff. You can make something workable today, and know that in a year or two everything will get better (assuming you want to drop support for pre-C++26 platforms).
I have only three things left to suggest:
Right now, all your registration has to happen at compile time… yet you use run-time structures like std::string
and std::unordered_map
.
It may be possible to do all the registration and generate all the maps at compile time, and store them in the data segment as constant data. You may even be able to use that data across compile- and run-time.
If you can manage that, you will probably see unbelievable performance gains. Probably worth experimenting with.
Your code currently basically ignores value category, and copies everything without even checking if moving is possible.
Just consider myClass->Invoke("Add", { 5 });
When doing this one simple statement:
- a
std::string
has to be constructed (for the "Add"
string), even though all that is needed is to do a string comparison with the constant character data in static memory - a
std::vector
has to be constructed (to hold the arguments) - the argument has to be copied into the function (here it is just an
int
, but it could be a std::string
, and it is a temporary, so it could be moved rather than copied)
That’s a hell of a lot of constructing and copying to just add five to zero.
Supporting moving—and by extension, forwarding references—probably won’t be trivial, but it will probably be worth it.
The variables are all passed in vectors. Is that really necessary.
That is going to be a massive cost, because vectors have no “small-string optimization”. Every time you want to pass even just a single int
argument… you need an allocation.
The first thing I would ask is if it is really necessary to support an infinitely variable number of arguments. Is there ever really going to be a need support more than, say, eight arguments? Because if you can cap the size at eight, you can use a fixed-size vector type, like std::inplace_vector
(that’s C++26, but there are tons of similar types, going back at least to C++11), and get, again, massive performance gains.
That’s it! Happy coding!