1
\$\begingroup\$

I tried to write a function that will hide a sensitive data from logs. But I am thinking about the optimizations. For example about reduction a nested loops. There are two options how a logs can pass in the function. First is as string, the second is as object that have a __dict__ where "args" located. For exaple in a log

 logger.debug( "%s Body: '%s'", self.id, self.body, ) 

args will contain a tuple of args: ("some id", "some body")

 def _hide_sensitive_data(record): to_hide = ["access_token", "password"] items = record.__dict__.get("args") record_list = list(items) try: for i, item in enumerate(record_list): for field in to_hide: if field in str(item): try: record_json = json.loads(item) record_json[field] = "*****" record_list[i] = record_json except json.JSONDecodeError: pattern = re.compile(r'(password=|access_token=)([^& ]+)') record_list[i] = re.sub(pattern, r'\2*****', item) record.__dict__["args"] = tuple(record_list) except TypeError: pass 

Can you give an advice how to optimize this code?

\$\endgroup\$
1
  • 2
    \$\begingroup\$We don't have much context here, but I would suggest that you are asking the wrong question. Instead of focusing, for example, on how to hide a password before it is emitted to a log, you should focus on redesigning the larger application to get entirely out of the business of shuttling passwords around from one function to another. Once you start down that fraught road, there are no good solutions.\$\endgroup\$
    – FMc
    CommentedMay 8, 2023 at 17:52

1 Answer 1

2
\$\begingroup\$

(Best advice how not to leak a secret is not to know it.)

re.sub(pattern, r'\2*****', item) does not replace group 2 by five stars:
it replaces the whole match with the top sensitive information in group 2, followed by five stars.
This suggests tests are in dire need.

  • With an "internal use" function, documentation of the interface does not look as mandatory as with a public one.
    Supply a documentation string just the same, for the maintenance programmer as well as for keeping a beneficial habit.
    (Style Guide for Python Code:

    Docstrings are not necessary for non-public methods, but you should have a comment that describes what the method does. This comment should appear after the def line.)

  • The introduction mentions more than one option to get information into a logging function. _hide_sensitive_data(record) handles one not looking basic -
    ponder one common handling.

  • pattern looks compiled from to_hide: make that explicit.
    This pattern needs to be compiled just when to_hide changes, and can be used with pattern.subn(), the second result being the number of substitutions.

I have an inkling that if more than one key to hide is matched in a single item, only the last value is substituted "via JSON": Test!

\$\endgroup\$
1
  • 1
    \$\begingroup\$(@FMc: The introductory hyperlink is to your own above comment … get entirely out of the business of shuttling passwords around … - that's somewhere?!)\$\endgroup\$
    – greybeard
    CommentedMay 10, 2023 at 4:49

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.