1
\$\begingroup\$

I need a code review for my implementation of my multidict in Python.

I'm not sure that my method below is implemented the best way:

def __delitem__(self, key): self._count -= len(self[key]) _ = super().pop(key) 

Link to GitHub

class MultiDict(dict): def __init__(self, *args, **kwargs): super().__init__(*args, **kwargs) self._count = 0 def __setitem__(self, key, value): try: _ = super().__getitem__(key) except KeyError: super().__setitem__(key, [value]) self._count += 1 else: super().__getitem__(key).append(value) self._count += 1 def __delitem__(self, key): self._count -= len(self[key]) _ = super().pop(key) def __len__(self): return self._count def __repr__(self): return f"{self.__class__.__name__}({super().__repr__()})" 
\$\endgroup\$
1

2 Answers 2

2
\$\begingroup\$

docstring

This class really needs a docstring.

Mitigating that a little, I truly appreciate the automated test suite. It has great educational value to someone considering making calls into this class.

There is a middle ground, of sorts, offered by python -m doctest *.py. It's no substitute for a nitty gritty test like test_repr(), which should stay where it is. But it lets you pick out the most common things a new user would be likely to do and to have questions about, and educate the user in a concise way. It's docs we can believe, even after edits to the library code, since the machine verifies their accuracy on every test run.

motivating example

The supplied test suite, while instructive, alas is very generic.

It would be helpful to show how, IDK, a Flask app, or some other Use Case, can solve a business problem using this library more simply than it could with a competing technique. (defaultdict(list) springs to mind.)

DRY

 super().__setitem__(key, [value]) self._count += 1 else: super().__getitem__(key).append(value) self._count += 1 

nit: There's an opportunity to follow this code with just a single unconditional increment.

temp var

These are just odd:

 _ = super().pop(key) ... _ = super().__getitem__(key) ... _ = super().pop(key) 

I imagine you found it convenient for print(_) debugging during early development? On the plus side, you're following convention, so linters won't flag it. Though we're more likely to see it in a tuple unpack context:
a, _, _, d = some_four_tuple

And kudos on insisting that, "_count is private! Go use len()!"

IP

The repo's README mentions "free to use", yet confusingly we do not see a LICENSE file. The repo currently is covered by "all rights reserved" copyright, since the documents were born under U.S. copyright law. Recommend you choose "MIT license" each time you create a new GitHub repository.

The code you chose to post on Stack Exchange, of course, is in quite a different category. You chose to publish it under CC BY-SA 4.0 terms.

documented semantics

constructor

I found this behavior confusing / unintuitive.

>>> d = MultiDict({'a': 1}) >>> d['a'] = 2 Traceback (most recent call last): File "<python-input-6>", line 1, in <module> d['a'] = 2 ~^^^^^ File "/tmp/multidict.py", line 13, in __setitem__ super().__getitem__(key).append(value) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ AttributeError: 'int' object has no attribute 'append' 

It wasn't obvious what a valid ctor argument would look like. Upon being presented with an invalid argument, it would be much much better to raise immediately than to let some delayed invocation blow up. In this example it's easily diagnosed because both call sites are adjacent, but in production code we often have many intervening lines of code, and often will have calls at different depths in the call stack, leading to a more challenging debug session for some hapless maintenance engineer.

deletion

I similarly found this confusing / unintuitive. Apparently insert & delete are not symmetric with each another. You had a different concept of what a MultiDict looks like than I did, partly due to a comparison I drew with a MultiSet. Writing some English sentences of documentation would help to align an app engineer's thinking with your own.

>>> d = MultiDict() >>> d['a'] = 1 >>> d['a'] = 2 >>> del d['a'] >>> del d['a'] Traceback (most recent call last): File "<python-input-23>", line 1, in <module> del d['a'] ~^^^^^ File "/tmp/multidict.py", line 17, in __delitem__ self._count -= len(self[key]) ~~~~^^^^^ KeyError: 'a' 

I thought the first del would delete the \$2\$ element and decrement count from \$2 \rightarrow 1\$, second del sends count from \$1 \rightarrow 0\$, and only an attempted third del would be invalid app-level code. But now I see you had a different concept in mind, OK.

Actually, having read a del test, now it makes perfect sense to me. Partly this is due to the documentation's focus on manipulating protocol headers, so del headers["Received"] leads to a different intuition.

To "shorten" an entry we would need to go outside what the OP code offers convenient access to. Though such mutation leaves _count in the wrong state.

>>> d MultiDict({'a': [1, 2]}) >>> vals = d['a'] >>> del vals[-1] >>> d['a'] [1] 
\$\endgroup\$
    0
    \$\begingroup\$

    Documentation

    The PEP 8 style guide recommends adding docstrings for classes. In the docstring, include:

    • Summary of what the class does
    • Motivation for re-inventing this wheel
    • Link to the well-known multidict documentation
    • Known deviations from multidict

    It would be handy to show examples of common usage, like in your test file in GitHub, plus how you would directly pass arguments to MultiDict (not shown in the test).

    __str__

    It would be nice to implement the __str__ function with some basic information about the data structure, like the length, etc.

    \$\endgroup\$

      Start asking to get answers

      Find the answer to your question by asking.

      Ask question

      Explore related questions

      See similar questions with these tags.