\$\begingroup\$

UPDATE: Second revision on separate post. Runtime function overloading / dynamic dispatch for Python (2nd revision)


When I first started using Python I had a rough time dealing with some of its dynamically typed nature. In particular, I was used to leveraging the type system of other languages to define polymorphic (or "overloaded") functions and methods. At first I implemented this pattern in Python using stubs and functions that do runtime type checks to switch between behaviours, but that involved a lot of boilerplate, made the resulting functions and methods brittle, and had many downsides when working with member functions/methods. A couple of years ago I wrote this library*[1], which allows for a succinct way of defining polymorphic functions using a decorator. I've used it in most of my Python projects ever since, but no one outside my team has ever seen or reviewed it. So here it is.
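For context, the hand-written runtime type-check pattern mentioned above tends to look roughly like the sketch below. The function name `describe` and its branches are hypothetical and only illustrate the boilerplate; they are not taken from the library.

```python
def describe(value):
    """One public name that branches on the runtime type of its argument."""
    # bool must be tested before int because bool is a subclass of int.
    if isinstance(value, bool):
        return f"flag {value}"
    if isinstance(value, int):
        return f"integer {value}"
    if isinstance(value, str):
        return f"text {value!r}"
    # Every new supported type needs another branch here.
    raise TypeError(f"unsupported type: {type(value).__name__}")


print(describe(3))
print(describe("abc"))
```

Each additional type or argument combination adds another branch, and the order of the checks starts to matter, which is where the brittleness comes from.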

I'd like to ask:

  1. Does the interface seem ergonomic?;
  2. is the code remotely readable/understandable?
  3. am I missing something, i.e., are there any glaring issues with the code?
  4. could the candidate selection strategy be improved?
  5. although performance is not a main issue (it is a library for an interpreted language after all), are there any obvious areas where runtime overhead could be reduced? and
  6. how would one approach a rewrite to __call__ as to allow for the overloads to return proper types and not Any, to leverage type checking tools like MyPy.
Of course, I'd also love to read any thorough review or critique in terms of runtime overhead, general style (is the code "pythonic"?), or any other comment/note about the library.

"""
===============
sobrecargar.py
===============
Method and function overloading for Python 3.

* Project Repository: https://github.com/Hernanatn/sobrecargar.py
* Documentation: https://github.com/Hernanatn/sobrecargar.py/blob/master/README.MD

Hernan ATN | [email protected]
"""

__author__ = "Hernan ATN"
__license__ = "MIT"
__version__ = "1.0"
__email__ = "[email protected]"

__all__ = ['overload']

from inspect import signature, Signature, Parameter, ismethod
from types import MappingProxyType
from typing import Callable, TypeVar, Iterator, ItemsView, OrderedDict, Self, Any, List, Tuple, Iterable, Generic
from collections.abc import Sequence, Mapping
from collections import namedtuple
from functools import partial
from sys import modules, version_info
from itertools import zip_longest

import __main__

if version_info < (3, 9):
    raise ImportError("Module 'sobrecargar' requires Python 3.9 or higher.")

# Public Interface
class overload():
    """
    Class that acts as a type-function decorator, allowing the definition of multiple
    versions of a function or method with different sets of parameters and types.
    This enables function overloading similar to that found in statically typed
    programming languages like C++.

    Class Attributes:
        _overloaded (dict): A dictionary that maintains a record of 'overload' instances
        created for each decorated function or method. Keys are function or method names,
        and values are 'overload' instances.

    Instance Attributes:
        overloads (dict): A dictionary storing the defined overloads for the decorated
        function or method. Keys are Signature objects representing overload signatures,
        and values are corresponding functions or methods.
    """
    _overloaded : dict[str, 'overload'] = {}

    def __new__(cls, function : Callable)-> 'overload':
        """
        Constructor. Creates a single instance per function name.

        Args:
            function (Callable): The function or method to be decorated.

        Returns:
            overload: The 'overload' class instance associated with the provided function name.
        """
        full_name : str = cls.__full_name(function)
        if full_name not in cls._overloaded.keys():
            cls._overloaded[full_name] = super().__new__(overload)
        return cls._overloaded[full_name]

    def __init__(self, function : Callable) -> None:
        """
        Initializer. Responsible for initializing the overloads dictionary (if not already
        present) and registering the current version of the decorated function or method.

        Args:
            function (Callable): The decorated function or method.
        """
        if not hasattr(self, 'overloads'):
            self.overloads : dict[Signature, Callable] = {}

        signature : Signature
        underlying_function : Callable
        signature, underlying_function = overload.__unwrap(function)

        if type(self).__is_method(function):
            cls : type = type(self).__get_class(function)
            for ancestor in cls.__mro__:
                for base in ancestor.__bases__:
                    if base is object : break
                    full_method_name : str = f"{base.__module__}.{base.__name__}.{function.__name__}"
                    if full_method_name in type(self)._overloaded.keys():
                        base_overload : 'overload' = type(self)._overloaded[full_method_name]
                        self.overloads.update(base_overload.overloads)

        self.overloads[signature] = underlying_function
        if not self.__doc__: self.__doc__ = ""
        self.__doc__ += f"\n{function.__doc__ or ''}"

    def __call__(self, *args, **kwargs) -> Any:
        """
        Method that allows the decorator instance to be called as a function.
        The module's core engine. Validates the provided parameters and builds a tuple
        of 'candidates' from functions that match the provided parameters. Prioritizes
        the overload that best fits the types and number of arguments. If multiple
        candidates match, propagates the result of the most specific one.

        Args:
            *args: Positional arguments passed to the function or method.
            **kwargs: Nominal arguments passed to the function or method.

        Returns:
            Any: The result of the selected version of the decorated function or method.

        Raises:
            TypeError: If no compatible overload exists for the provided parameters.
        """
        _C = TypeVar("_C", bound=Sequence)
        _T = TypeVar("_T", bound=Any)

        Candidate : namedtuple = namedtuple('Candidate', ['score', 'function_object', "function_signature"])
        candidates : List[Candidate] = []

        def validate_container(value : _C, container_parameter : Parameter) -> int | bool:
            type_score : int = 0
            container_annotation = container_parameter.annotation
            if not hasattr(container_annotation, "__origin__") or not hasattr(container_annotation, "__args__"):
                type_score += 1
                return type_score
            if not issubclass(type(value), container_annotation.__origin__):
                return False
            container_arguments : Tuple[type[_C]] = container_annotation.__args__
            has_ellipsis : bool = Ellipsis in container_arguments
            has_single_type : bool = len(container_arguments) == 1 or has_ellipsis

            if has_ellipsis:
                aux_container_list : list = list(container_arguments)
                aux_container_list[1] = aux_container_list[0]
                container_arguments = tuple(aux_container_list)

            type_iterator : Iterator
            if has_single_type:
                type_iterator = zip_longest((type(t) for t in value), container_arguments, fillvalue=container_arguments[0])
            else:
                type_iterator = zip_longest((type(t) for t in value), container_arguments)

            if not issubclass(type(value[0]), container_arguments[0]):
                return False
            for received_type, expected_type in type_iterator:
                if expected_type == None : return False
                if received_type == expected_type:
                    type_score += 2
                elif issubclass(received_type, expected_type):
                    type_score += 1
                else:
                    return False
            return type_score

        def validate_parameter_type(value : _T, function_parameter : Parameter) -> int | bool:
            type_score : int = 0

            expected_type = function_parameter.annotation
            received_type : type[_T] = type(value)

            is_untyped : bool = (expected_type == Any)
            default_value : _T = function_parameter.default
            is_null : bool = value is None and default_value is None
            is_default : bool = value is None and default_value is not function_parameter.empty
            param_is_self : bool = function_parameter.name=='self' or function_parameter.name=='cls'
            param_is_variable : bool = function_parameter.kind == function_parameter.VAR_POSITIONAL or function_parameter.kind == function_parameter.VAR_KEYWORD
            param_is_container : bool = hasattr(expected_type, "__origin__") or (issubclass(expected_type, Sequence) and not issubclass(expected_type, str)) or issubclass(expected_type, Mapping)

            is_different_type : bool
            if param_is_variable and param_is_container:
                is_different_type = not issubclass(received_type, expected_type.__args__[0])
            elif param_is_container:
                is_different_type = not validate_container(value, function_parameter)
            else:
                is_different_type = not issubclass(received_type, expected_type)

            if not is_untyped and not is_null and not param_is_self and not is_default and is_different_type:
                return False
            elif param_is_variable and not param_is_container:
                type_score += 1
            else:
                if param_is_variable and param_is_container:
                    if received_type == expected_type.__args__[0]:
                        type_score +=2
                    elif issubclass(received_type, expected_type.__args__[0]):
                        type_score +=1
                elif param_is_container:
                    type_score += validate_container(value, function_parameter)
                elif received_type == expected_type:
                    type_score += 4
                elif issubclass(received_type, expected_type):
                    type_score += 3
                elif is_default:
                    type_score += 2
                elif is_null or param_is_self or is_untyped:
                    type_score += 1

            return type_score

        def validate_signature(function_parameters : MappingProxyType[str,Parameter], positional_count : int, positional_iterator : Iterator[tuple], nominal_view : ItemsView) -> int |bool:
            signature_score : int = 0

            this_score : int | bool
            for positional_value, positional_name in positional_iterator:
                this_score = validate_parameter_type(positional_value, function_parameters[positional_name])
                if this_score:
                    signature_score += this_score
                else:
                    return False

            for nominal_name, nominal_value in nominal_view:
                if nominal_name not in function_parameters: return False
                this_score = validate_parameter_type(nominal_value, function_parameters[nominal_name])
                if this_score:
                    signature_score += this_score
                else:
                    return False

            return signature_score

        for signature, function in self.overloads.items():
            length_score : int = 0

            function_parameters : MappingProxyType[str,Parameter] = signature.parameters

            positional_count : int = len(function_parameters) if type(self).__has_var_args(function_parameters) else len(args)
            nominal_count : int = len({nom : kwargs[nom] for nom in function_parameters if nom in kwargs}) if (type(self).__has_var_kwargs(function_parameters) or type(self).__has_only_nom(function_parameters)) else len(kwargs)
            default_count : int = type(self).__has_default(function_parameters) if type(self).__has_default(function_parameters) else 0
            positional_iterator : Iterator[tuple[Any,str]] = zip(args, list(function_parameters)[:positional_count])
            nominal_view : ItemsView[str,Any] = kwargs.items()

            if (len(function_parameters) == 0 or not (type(self).__has_variables(function_parameters) or type(self).__has_default(function_parameters))) and len(function_parameters) != (len(args) + len(kwargs)):
                continue
            if len(function_parameters) - (positional_count + nominal_count) == 0 and not(type(self).__has_variables(function_parameters) or type(self).__has_default(function_parameters)):
                length_score += 3
            elif len(function_parameters) - (positional_count + nominal_count) == 0:
                length_score += 2
            elif (0 <= len(function_parameters) - (positional_count + nominal_count) <= default_count) or (type(self).__has_variables(function_parameters)):
                length_score += 1
            else:
                continue

            signature_validation_score : int | bool = validate_signature(function_parameters, positional_count, positional_iterator, nominal_view)
            if signature_validation_score:
                this_candidate : Candidate = Candidate(score=(length_score+2*signature_validation_score), function_object=function, function_signature=signature)
                candidates.append(this_candidate)
            else:
                continue

        if candidates:
            if len(candidates)>1:
                candidates.sort(key= lambda c: c.score, reverse=True)
            best_function = candidates[0].function_object
            return best_function(*args, **kwargs)
        else:
            raise TypeError(f"[ERROR] No overloads of {function.__name__} exist for the provided parameters:\n {[type(pos) for pos in args]} {[(k,type(nom)) for k,nom in kwargs.items()]}\n Supported overloads: {[dict(sig.parameters) for sig in self.overloads.keys()]}")

    def __get__(self, obj, obj_type):
        #
        class OverloadedMethod:
            __doc__ = self.__doc__
            __call__ = partial(self.__call__, obj) if obj is not None else partial(self.__call__, obj_type)

        return OverloadedMethod()

    # Private Interface
    @staticmethod
    def __unwrap(function : Callable) -> tuple[Signature, Callable]:
        while hasattr(function, '__func__'):
            function = function.__func__
        while hasattr(function, '__wrapped__'):
            function = function.__wrapped__

        signature : Signature = signature(function)
        return (signature, function)

    @staticmethod
    def __full_name(function : Callable) -> str :
        return f"{function.__module__}.{function.__qualname__}"

    @staticmethod
    def __is_method(function : Callable) -> bool :
        return function.__name__ != function.__qualname__ and "<locals>" not in function.__qualname__.split(".")

    @staticmethod
    def __is_nested(function : Callable) -> bool:
        return function.__name__ != function.__qualname__ and "<locals>" in function.__qualname__.split(".")

    @staticmethod
    def __get_class(method : Callable) -> type:
        return getattr(modules[method.__module__], method.__qualname__.split(".")[0])

    @staticmethod
    def __has_variables(function_parameters : MappingProxyType[str,Parameter]) -> bool:
        for parameter in function_parameters.values():
            if overload.__has_var_kwargs(function_parameters) or overload.__has_var_args(function_parameters):
                return True
        return False

    @staticmethod
    def __has_var_args(function_parameters : MappingProxyType[str,Parameter]) -> bool:
        for parameter in function_parameters.values():
            if parameter.kind == Parameter.VAR_POSITIONAL:
                return True
        return False

    @staticmethod
    def __has_var_kwargs(function_parameters : MappingProxyType[str,Parameter]) -> bool:
        for parameter in function_parameters.values():
            if parameter.kind == Parameter.VAR_KEYWORD:
                return True
        return False

    @staticmethod
    def __has_default(function_parameters : MappingProxyType[str,Parameter]) -> int | bool:
        default_count : int = 0
        for parameter in function_parameters.values():
            if parameter.default != parameter.empty:
                default_count+=1
        return default_count if default_count else False

    @staticmethod
    def __has_only_nom(function_parameters : MappingProxyType[str,Parameter]) -> bool:
        for parameter in function_parameters.values():
            if parameter.kind == Parameter.KEYWORD_ONLY:
                return True
        return False

English documentation

Description

sobrecargar is a Python module that includes a single homonymous class, which provides the implementation of a universal @decorator that allows defining multiple versions of a function or method with different sets of parameters and types. This enables function overloading similar to that found in other programming languages like C++.

Basic Usage

Decorating a Function:

You can use @overload[2] as the decorator for functions or methods.

from sobrecargar import overload

@overload
def my_function(parameter1: int, parameter2: str):
    # Code for the first version of the function
    ...

@overload
def my_function(parameter1: float):
    # Code for the second version of the function
    ...

Decorating a method / member function:

Since sobrecargar interferes with the normal compilation flow of function code, and methods (member functions) are typically defined when defining the class, decorating methods requires special syntax. Attempting to use overload like this:

from sobrecargar import overload

class MyClass:
    @overload
    def my_method(self, parameter1: int, parameter2: str):
        # Code for the first version of the method
        ...

    @overload
    def my_method(self, parameter1: float):
        # Code for the second version of the method
        ...

Will produce an error like:

[ERROR] AttributeError: module __main__ does not have a 'MyClass' attribute. 

This happens because when overload tries to create the dispatch dictionary for the different overloads of my_method, the class named MyClass has not yet finished being defined, and therefore the compiler doesn't know it exists.

The solution is to provide a signature for the class before attempting to overload any of its methods. The signature only requires the class name and inheritance scheme.

from sobrecargar import overload

class MyClass: pass  # By providing a signature for the class, you ensure that `sobrecargar` can reference it at compile time

class MyClass:
    @overload
    def my_method(self, parameter1: int, parameter2: str):
        # Code for the first version of the method
        ...

    @overload
    def my_method(self, parameter1: float):
        # Code for the second version of the method
        ...
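The underlying Python behaviour can be observed without the library. The sketch below uses a hypothetical `probe` decorator (not part of sobrecargar) that only records whether the enclosing class name is already bound in the module's globals at decoration time. It assumes the code runs at module level.

```python
def probe(func):
    """Hypothetical decorator: note whether the enclosing class name is bound yet."""
    class_name = func.__qualname__.split(".")[0]
    # Decorators run while the class body executes, before the class object
    # is created and assigned to its name in the module namespace.
    func.class_visible = class_name in func.__globals__
    return func


class Early:
    @probe
    def method(self): ...


class Late: pass  # forward declaration, as in the workaround above


class Late:
    @probe
    def method(self): ...


print(Early.method.class_visible)  # False: 'Early' was not bound yet
print(Late.method.class_visible)   # True: the placeholder 'Late' already exists
```

Note that in the second case the name resolves to the placeholder class, not to the class that is still being defined.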

Edit: added an example

A more complete example that shows a plausible use case, as requested in this comment

By far the most frequent use case I personally have for function overloading is class constructor overloading. E.g., consider some rudimentary database record model.

Given a table Products:

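id  | SKU        | Title                     | Artist (FK) | Description | Format | Price | ...
1   | A-123-C-77 | Jazz in Ba                | 899         | ...         | CD     | 5.99  | ...
2   | A-705-V-5  | We'll be together at last | 7566        | ...         | Vynil  | 8.99  | ...
3   | B-905-C-5  | Ad Cordis                 | 123         | ...         | CD     | 3.99  | ...
4   | B-101-C-77 | Brain Damage              | 1222        | ...         | CD     | 3.99  | ...
... | ...        | ...                       | ...         | ...         | ...    | ...   | ...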

One could define a class Product that represents a record of that table:

class SomeDbAbstraction:
    ...
    def run_query(query : str, ...) -> dict[str,Any]: ...
    def get_insert_id() -> int: ...
    ...

class Format(Enum):
    _invalid = 0
    CD = 1
    Vynil = 2

class Artist:
    ...

class Product:
    __slots__(
        "__id",
        "sku",
        "title",
        "artist",
        "description",
        "format",
        "price",
        ...
    )

    def __init__(
        self,
        id : int,
        sku : str,
        title : str,
        artist : Artist,
        description : str,
        format : Format,
        price : float,
        ...
    ) -> None:
        self.__id = id
        self.sku = title
        self.title = title
        self.artist = artist
        self.description = description
        self.format = format
        self.price = price
        ...

Let's say that the values for each column can come from varied sources, e.g., a read from the database, a JSON API endpoint, an HTTP form, some other Python code, &c. Then we would need to define utility functions / classmethods that correctly handle each case. One possible implementation would be:

    @classmethod
    def fromId(cls, db : SomeDbAbstraction, id : int) -> 'Product':
        data : dict[str,Any] = db.run_query(f"SELECT * FROM Product WHERE id = {id};")
        return cls(
            data.get("id"),
            data.get("sku"),
            data.get("title"),
            data.get("artist"),
            data.get("description"),
            data.get("format"),
            data.get("price"),
            ...
        )

    @classmethod
    def fromSku(cls, db : SomeDbAbstraction, sku : str) -> 'Product':
        data : dict[str,Any] = db.run_query(f"SELECT * FROM Product WHERE sku = {sku};")
        return cls(
            data.get("id"),
            data.get("sku"),
            data.get("title"),
            data.get("artist"),
            data.get("description"),
            data.get("format"),
            data.get("price"),
            ...
        )

    @classmethod
    def newProduct(
        cls,
        db : SomeDbAbstraction,
        sku : str,
        title : str,
        artist : Artist,
        description : str,
        format : Format,
        price : float,
    ) -> 'Product':
        new_id = db.run_query(f"""
            INSERT INTO Product
            SET
                sku = {sku},
                title = {title},
                artist = {artist.id},
                description = {description},
                format = {format.name},
                price = {price}
            ;
        """).get_insert_id()
        return cls(
            new_id,
            sku,
            title,
            Artist.fromId(db, artistId),
            description,
            format,
            price
        )

    @classmethod
    def newProduct_w_artistId(
        cls,
        db : SomeDbAbstraction,
        sku : str,
        title : str,
        artistId : int,
        description : str,
        format : Format,
        price : float,
    ) -> 'Product':
        new_id = db.run_query(f"""
            INSERT INTO Product
            SET
                sku = {sku},
                title = {title},
                artist = {artistId},
                description = {description},
                format = {format.name},
                price = {price}
            ;
        """).get_insert_id()
        return cls(
            new_id,
            sku,
            title,
            Artist.fromId(db, artistId),
            description,
            format,
            price
        )

    @classmethod
    def newProduct_from_dict(
        cls,
        db : SomeDbAbstraction,
        data : dict
    ) -> 'Product':
        new_id = db.run_query(f"""
            INSERT INTO Product
            SET
                sku = {data.get("sku")},
                title = {data.get("title")},
                artist = {data.get("artistId")},
                description = {data.get("description")},
                format = {data.get("format.name")},
                price = {data.get("price")},
            ;
        """).get_insert_id()
        return cls(
            new_id,
            data.get("sku"),
            data.get("title"),
            Artist.fromId(db, data.get("artistId")),
            data.get("description"),
            data.get("format"),
            data.get("price")
        )

In each of those cases the user of the model needs to explicitly choose the function for that case. With sobrecargar one can, instead, provide overloaded methods:

class Product:
    __slots__(
        "__id",
        "sku",
        "title",
        "artist",
        "description",
        "format",
        "price",
        ...
    )

    @overload
    def __init__(
        self,
        db : SomeDbAbstraction
        sku : str,
        title : str,
        artist : Artist,
        description : str,
        format : Format,
        price : float,
        ...
    ) -> None:
        new_id = db.run_query(f"""
            INSERT INTO Product
            SET
                sku = {sku},
                title = {title},
                artist = {artist.id},
                description = {description},
                format = {format.name},
                price = {price}
            ;
        """).get_insert_id()
        self.__id = new_id
        self.sku = title
        self.title = title
        self.artist = artist
        self.description = description
        self.format = format
        self.price = price
        ...

    @overload
    def __init__(self, db : SomeDbAbstraction, id : int):
        data : dict[str,Any] = db.run_query(f"SELECT * FROM Product WHERE id = {id};")
        self.__id = data.get("id")
        self.sku = data.get("title")
        self.title = data.get("title")
        self.artist = Artist.fromId(db, data.get("artist"))
        self.description = data.get("description")
        self.format = Format(data.get("format"))
        self.price = data.get("price")

    @overload
    def __init__(self, db : SomeDbAbstraction, sku : str):
        data : dict[str,Any] = db.run_query(f"SELECT * FROM Product WHERE sku = {sku};")
        self.__id = data.get("id")
        self.sku = data.get("title")
        self.title = data.get("title")
        self.artist = Artist.fromId(db, data.get("artist"))
        self.description = data.get("description")
        self.format = Format(data.get("format"))
        self.price = data.get("price")

    @overload
    def __init__(
        self,
        db : SomeDbAbstraction
        sku : str,
        title : str,
        artistId : int,
        description : str,
        format : Format,
        price : float,
        ...
    ) -> None:
        new_id = db.run_query(f"""
            INSERT INTO Product
            SET
                sku = {sku},
                title = {title},
                artist = {artistId},
                description = {description},
                format = {format.name},
                price = {price}
            ;
        """).get_insert_id()
        self.__id = new_id
        self.sku = title
        self.title = title
        self.artist = Artist.fromId(db, artistId)
        self.description = description
        self.format = format
        self.price = price
        ...

Then, when using the model, one simply calls Product() with the relevant parameters, without having to explicitly choose the appropriate overload each time.

Note that the two implementations are not strictly equivalent, as the overloaded one disallows instantiation of a Product with both an id and record data. That difference is intended to highlight a feature of function overloading: it allows for the implicit (or emergent, if you'd like) definition of constraints. In this case the constraint serves as a guarantee that every Product object instantiated by id is an up-to-date representation of the record in the database. The first implementation allows construction of a Product object with, say, id 5 that refers to the record with id=5 in the db but carries arbitrary data.

Please keep in mind that this is an overly simplified approximation of a real use case. It's littered with issues and bad practices (for instance, there are no checks or sanitization, template strings are used for raw queries, &c.).

Candidate selection strategy

Overloaded function signatures are evaluated and scored based on the match between provided arguments and expected parameters.

The process iterates over all registered overloads in self.overloads, where each overload is represented by a signature and its corresponding function.

1. Length Score

  • Evaluate argument length match between function signature and provided arguments.
    • If the signature has no parameters, or has neither variable parameters nor default arguments, and the number of signature parameters doesn't match the sum of positional and nominal arguments, the signature is ignored.
    • If the number of positional and nominal arguments exactly matches the signature parameters, and the signature has no variable parameters or default arguments, assign a high score of 3. This indicates a perfect length match.
    • If the argument count exactly matches signature parameters, but the signature has default arguments or variable parameters, assign a moderate score of 2.
    • If provided arguments are equal to or less than signature parameters, and the signature has default arguments or variable parameters, assign a score of 1. This indicates a partial length match.
    • In any other case, ignore the signature.

2. Signature Score

  • Evaluate type match based on function signature and argument types.
    • Use the validate_signature function to determine if argument types match expected signature types.
    • Assign a score based on type matching. If signature validation succeeds, obtain a positive score based on type compatibility.
    • If signature validation fails (returns False), ignore the overload.

3. Candidate List Construction

  • For each valid overload, create a Candidate object storing the overloaded function, corresponding signature, and calculated score. Type scoring takes precedence over length scoring.
  • Add candidates to the candidates list.

4. Best Candidate Selection

  • Check if candidates exist. If no valid candidates, raise a TypeError.
  • If multiple candidates exist, sort by scores, prioritizing highest scores. Select the candidate with the highest score as the preferred overloaded function.

5. Result

  • Call the preferred candidate with provided arguments and return its result.
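The steps above can be condensed into a much-simplified sketch. This is not the library's implementation: it assumes positional arguments only, every parameter annotated with a plain class, and no defaults, `*args`, or `**kwargs`. The names `score_overload` and `select` are hypothetical; only the weights (4 for an exact type, 3 for a subclass, 3 for an exact length match, type score counted double) mirror the posted code.

```python
from inspect import signature


def score_overload(func, args):
    """Score how well positional `args` fit `func`; None means incompatible."""
    params = list(signature(func).parameters.values())
    if len(params) != len(args):
        return None  # this sketch ignores defaults, *args and **kwargs
    length_score = 3  # exact arity match
    type_score = 0
    for value, param in zip(args, params):
        if type(value) is param.annotation:
            type_score += 4  # exact type match
        elif isinstance(value, param.annotation):
            type_score += 3  # subclass match
        else:
            return None
    # The type score is weighted double, so it dominates the length score.
    return length_score + 2 * type_score


def select(overloads, *args):
    """Return the best-scoring overload, or raise TypeError if none fits."""
    scored = [(score_overload(f, args), f) for f in overloads]
    scored = [(s, f) for s, f in scored if s is not None]
    if not scored:
        raise TypeError("no compatible overload")
    return max(scored, key=lambda pair: pair[0])[1]


def takes_int(x: int): return "int"
def takes_bool(x: bool): return "bool"


print(select([takes_int, takes_bool], True)(True))  # bool
print(select([takes_int, takes_bool], 7)(7))        # int
```

With `True`, both overloads are compatible (bool is a subclass of int), but the exact match scores 3 + 2*4 = 11 against 3 + 2*3 = 9, so the more specific overload wins.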

GitHub repo: https://github.com/Hernanatn/sobrecargar.py
Available on PyPI: pip install sobrecargar


[1] Note: both the library and its documentation are written in Spanish; this post presents a translation.
[2] Note: overload is an alias for sobrecargar baked into the library.

\$\endgroup\$
20
  • 1
    \$\begingroup\$Hi @Reinderien I really don't get your question... abc is a utility for defining abstract classes, that is, it enables defining virtual base classes that need further subtyping and concrete "children" to be used, allowing polymorphism for types... It's a module that improves OOP inheritance patterns in Python. It's a different problem domain altogether. abc does not provide dynamic dispatch for functions (nor methods), i.e., same-named funcs that take diverse parameters and produce diverse outputs - as it's not intended to, please see PEP 3119.\$\endgroup\$
    – HernanATN
    Commented Jan 26 at 19:08
  • \$\begingroup\$Ok @Reinderien, I'll add another, more complete example, but just for clarification: do you get that dynamic function dispatch and subtyping polymorphism are two very distinct, mostly unrelated topics? Why were you asking about abc? P.S. It's ok if you don't find a use case for function overloading. I do, that's why I made it\$\endgroup\$
    – HernanATN
    Commented Jan 26 at 19:45
  • 3
    \$\begingroup\$@HernanATN, I am trying to keep an open mind, but frankly I'm not yet seeing why a caller would use an overload. Would you please add some calling code (e.g. a unit test) from a motivating Use Case? Even a simple OO design such as several Vehicle types or Animal types would be useful. A common motivation for a pair of variant methods in a language like java is defaulting a parameter, while in python that problem simply never arises, we might use a signature of def display_parameters(verbose: bool = False): to default it.\$\endgroup\$
    – J_H
    Commented Jan 26 at 19:53
  • 4
    \$\begingroup\$Just my opinion: I have programmed in both C++ and Java, both strongly-typed languages, as well as Python for quite a few years. You are trying to make Python into something it was never meant to be. The strong typing of C++ and its approach to the various types of polymorphism have certain plusses to recommend it as does the duck typing of Python. You need to decide which language is the best (or possibly only) solution for a particular problem and use that instead of trying to create something that adds additional overhead and a programming style that is foreign to Python.\$\endgroup\$
    – Booboo
    Commented Jan 26 at 19:56
  • \$\begingroup\$@Reinderien It seems the OP is trying to implement function overloading whereby multiple functions can have the same name but different signatures. In C++, the compiler is able to statically analyze the source to see what arguments a function is being called with and then compile the call to the appropriate implementation. The OP's code is doing all of this analysis at runtime.\$\endgroup\$
    – Booboo
    Commented Jan 26 at 20:01

2 Answers

7
\$\begingroup\$
  1. Does the interface seem ergonomic?;

    Yes, except for having to define MyClass twice for methods. You may be able to solve that by making the algorithm lazy, though that is also far more complicated.

  2. is the code remotely readable/understandable?

    For typing metaprogramming yes. But every time I've done typing metaprogramming I'm left with horrible code and the question "why do I do this to myself?"

  3. am I missing something, i.e., are there any glaring issues with the code?

    Your typing introspection seems somewhat basic, and it doesn't seem to take into account annoying things like the ways typed Python has diverged from Python.

    I wrote a typing introspection library for Python 3.5.0+ a while ago. Not something I'd do again. I'd just use an existing typing introspection library which solves all the annoying parts for you. (Not mine, mine is dead.)

  4. how would one approach a rewrite to __call__ as to allow for the overloads to return proper types and not Any, to leverage type checking tools like MyPy.

    Welcome to typed Python, good luck. You probably won't have fun.

    My biggest problem with typed Python is you just can't do some things untyped Python can.

    However, when I typically write code I'll write a couple hundred lines then run the code. I'll tend to make a handful of mistakes. Type hints basically solve most of my issues; think wrong argument name or other typos. Now I'm left with annoying ones like off by one.

    As such if you can't fix the typing here I wouldn't find your library useful. I have to use an annoying subset of Python with no typing benefits.

  5. although performance is not a main issue (it is a library for an interpreted language after all), are there any obvious areas where runtime overhead could be reduced? and

    Let's say you have:

    floats: Iterable[float]
    for f in floats:
        my_function(f)

    Then you'll be calculating the function to use in my_function multiple times. We could pass the expected type to get the underlying function once.

    floats: Iterable[float]
    fn = my_function.get(float)
    for f in floats:
        fn(f)

    The simple question I'm now met with is why would I want my_function.get(float) over just skipping the library and giving each function a different name?

    As such the solution here is either: accept the code will always be slow, or you need to write a bespoke Python compiler like -O/-OO or numba.jit. I don't think Python has an easy way to do the latter, especially with the typed syntax tree you'd need to interact with.

  6. Of course, I'd also love to read any thorough review or critique in terms of ... any other comment/note about the library.

    If you solve the MyClass and typing issues mentioned then your interface will probably be a bit degraded from the nice interface native languages provide.

    I really like metaprogramming, so I have written a couple of dynamic dispatch implementations over the years. I wouldn't again because I couldn't find a nice interface, I hated maintaining the code (typing introspection) and disliked the code being slow. Each time I was trying to solve the problems with my implementation I got fed up and just gave the functions different names. I've been defeated.
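The "resolve once, call many times" idea from point 5 can be illustrated with the standard library's `functools.singledispatch`, which is a different mechanism from the reviewed library (it dispatches on the first argument's type only). Its `dispatch()` method returns the implementation registered for a given class. The function name `label` below is hypothetical.

```python
from functools import singledispatch


@singledispatch
def label(value):
    """Fallback used when no more specific implementation is registered."""
    return "something else"


@label.register
def _(value: float):
    return "a float"


# Resolve the implementation once, outside the loop...
fn = label.dispatch(float)

# ...then call it directly, skipping the per-call type lookup.
print([fn(x) for x in (1.0, 2.5)])
```

Whether this is worth the loss of the single call site is exactly the trade-off questioned above.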

\$\endgroup\$
    12
    \$\begingroup\$

    I will offer some subjective opinions and objective observations.

    Benefits

    The main benefit of the overload pattern as I see it (including, but not exclusively, as implemented in your code) is that it can logically group a set of closely-related functions together. In a very large API this approach can reduce the number of function names that a programmer needs to remember. Finally, in a context where the function is supposed to logically act in a similar manner despite different-typed inputs, Python's duck-typing philosophy is that the function should attempt to plow ahead regardless of what it's been given, and if that's only possible by splitting the function out into different implementations based on the input types, then overloading can attempt to present a unified interface.

    Drawbacks

    Subjectively: if I were to see this code at work, I would recoil in horror. I would fear for what it implies in terms of added complexity and difficulty of debugging, both of this meta-library and the modules that use it.

    This code does not come without cost. It has to be tested and maintained, and anyone coming from a pure-Python background will need to learn how it works, so there's training cost. There is runtime performance cost. There is debugging cost: if something goes wrong in one of the overloads... which overload failed, and why was that one selected?

    To the logical-group aspect: there are better ways to logically group methods. Classes can do this when the functions are best-represented as methods; if they aren't, then the closest thing that Python has to a namespace is a module. I would be happy to see a small, purpose-written module with different but related functions - each having a different name and signature - that act as convenience wrappers for a core logic function.

    To the function memorisation aspect: all modern IDEs have some form of auto-complete, and so long as you choose your function names well, having different function names with similar prefixes will still allow for reasonable symbol traversal. I will offer that only having one function name acting as an interface to multiple frontend implementations hampers ergonomics rather than helping: if you are reading code and you see a function call out-of-context, which overload will it call? Are you sure?

    To the duck-typing aspect: the more you vary a single function's types, the more complexity balloons and static analysis and testability are hindered. If a function is only supposed to accept an int, testing and documenting it is easier than if it might accept an int or float or string or a JSON dictionary or a NumPy array or a pandas DataFrame or a pandas Series (this seems like an exaggeration but there are functions in the wild written to behave this way).

    In a world where API documentation matters, it's easier to write and read reference documentation for which each (dedicated) function signature has only one way that it can be used. Overloaded functions require that you sift through sub-sections of the function documentation to see which one applies.

    The capacity of dynamically-typed languages to clearly and structurally represent overloads is handicapped when compared to that of statically-typed languages (@Booboo correctly alludes to this in the comments). In statically-typed languages, the compiler can make a guarantee that signature inference is being performed in a predictable, well-defined, efficient manner, and is supported by a well-defined and well-documented system of warnings and errors in case of ambiguity or resolution failure. For these reasons, using overloads in e.g. C# is much safer than it is in Python, and I take less issue with using it there.

    In dynamically-typed languages, this inference is much more difficult and burdensome to perform. Python was not built with this in mind, and it's fairly square-peg-round-hole. PEP 443, which describes functools.singledispatch, does exist and could potentially be used in place of your library under certain circumstances; but (a) it's really just a convenience around in-body type-checking; and (b) the new match syntax makes this type checking more convenient - so again, if the overload pattern is to be used (which I generally advise against), there's simple syntactic sugar that obviates a meta-library.

    As an excellent example of how an API can be muddied with overloading, the library I always love to pick on is matplotlib, for which signatures are so hopelessly overloaded that it's often impossible to determine what they do and don't support without guessing.

    In what you've intended as a counterexample, you say

    Then we would need to define utility functions / classmethods that correctly [handle] each case [...] In each of those cases the user of the model needs to explicitly choose the function for that case.

    Yes. Correct. That's a good thing, not a bad thing. You're already quite familiar with Python, so you've probably already read PEP 20, the Zen of Python, which states as one of Python's guiding principles

    Explicit is better than implicit.

    Alternatives

    Objectively: This is a lot of code that doesn't need to exist, and everything that this library offers can be reduced in stages:

    1. Without using this library, plenty of existing code still uses signature overloading via *args, **kwargs. This can be annotated with typing.overload. However, this still lends itself to more complexity than is necessary.
    2. Just... write different functions with different names. No run-time type checking, no metaclasses, no decorators, no overloads. Add type hints for each signature, and if appropriate make those thin wrapper functions around a core logic function.
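The two alternatives can be sketched side by side. Everything here is an assumption made for illustration: Product, _CATALOG, _find, product, product_from_id and product_from_sku are hypothetical names, and the first variant is a simplified single-parameter form rather than a full *args, **kwargs signature.

```python
import typing
from dataclasses import dataclass
from typing import Callable, Union


@dataclass(frozen=True)
class Product:
    id: int
    sku: str


# Hypothetical in-memory data standing in for a real data source.
_CATALOG = (Product(1, "A-001"), Product(2, "B-002"))


def _find(predicate: Callable[[Product], bool]) -> Product:
    # Core logic shared by every public entry point below.
    for item in _CATALOG:
        if predicate(item):
            return item
    raise KeyError("no matching product")


# Alternative 1: one name, typing.overload stubs, run-time type branches.
@typing.overload
def product(key: int) -> Product: ...
@typing.overload
def product(key: str) -> Product: ...
def product(key: Union[int, str]) -> Product:
    if isinstance(key, int):
        return _find(lambda p: p.id == key)
    if isinstance(key, str):
        return _find(lambda p: p.sku == key)
    raise TypeError(f"unsupported key type: {type(key).__name__}")


# Alternative 2: distinct names, one signature each, no run-time checks.
def product_from_id(product_id: int) -> Product:
    return _find(lambda p: p.id == product_id)


def product_from_sku(sku: str) -> Product:
    return _find(lambda p: p.sku == sku)
```

In the second variant each call site states which lookup it performs, and a static checker only has to verify one signature per function.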

    In short, this is a whole lot of work for what I consider to be anywhere from a non-feature to an anti-feature. It's cool that you got it to work, and it's good to see that you do care about static analysis, but truly the best path to reliable, easy static analysis is to simply separate your signatures.

    \$\endgroup\$
    5
    • \$\begingroup\$@Reinderien I don't expect to convince you that, at least in some cases, the benefits outweigh the drawbacks for this approach. Regardless, thank you for taking the time to give an honest opinion on the pattern, you've pointed out some interesting things to consider. I do notice that your answer refers only to the pattern in general and is not a review of the specific implementation, so some of the points made simply do not apply to this case (particularly, the bits about debugging seem to not take into account the measures the library takes to that effect, etc.) so I won't address those. +\$\endgroup\$
      – HernanATN
      Commented Jan 26 at 22:36
    • \$\begingroup\$@Reinderien You praise the benefits of duck typing and in particular the philosophy that "the function should attempt to plow ahead regardless (...)" but then you note that "if a function is only supposed to accept an int, testing and documenting it is easier than if it might accept an int or float or string or (...)", and then suggest "Just... write different functions with different names.". Besides being contradictory, it kneecaps the primary feature of dynamic typing: the caller shouldn't be required to always know what types he's passing (but should be warned if nonsense) +\$\endgroup\$
      – HernanATN
      Commented Jan 26 at 22:43
    • \$\begingroup\$@Reinderien typing.overload is just a hint for LSPs and external linters / static checkers. It: 1. doesn't make any guarantees about what the actual implementation of the function accepts or returns, 2. it requires extra boilerplate, 3. as it's only a hint, when the exact same set of instructions can't handle the diverse inputs, for the function to "plow ahead regardless" branches (read: the dreaded runtime overhead) need to be introduced. The only difference being that those branches are inlined, instead of being checked at the moment the function is called.\$\endgroup\$
      – HernanATN
      Commented Jan 26 at 22:52
    • \$\begingroup\$@Reinderien I understand and respect your opinion regarding function overloading in Python, but Alternative 1 just seems like a worse version of this pattern (that suffers from all the drawbacks you've mentioned, some of which the library submitted for review tries to eliminate/mitigate) and Alternative 2 is "just don't do it", which I feel exists somewhere outside the implementation of said pattern.\$\endgroup\$
      – HernanATN
      Commented Jan 26 at 22:55
    • 5
      \$\begingroup\$@HernanATN, thank you very much for adding that Products example. It makes a good argument. But, alas, you can't win them all, I am unconvinced. I would rather use standard tooling with def product_from_id and def product_from_sku, it just seems like the natural approach to the business problem. Reinderien did an admirable job of keeping an open mind, and I agree with all of his analysis.\$\endgroup\$
      – J_H
      Commented Jan 27 at 4:32
