Where can I find RegEx that can pattern match common secret strings?

I have a product that scans repos and commits to catch a developer trying to commit a secret (e.g. passwords, keys). It scans for roughly 30 patterns by default, which seems insufficient given thousands of repos in over seventy languages. I can expand that scanning with RegEx. However, I don't know every common secret there is.

Is there a framework, list, or tool that can provide RegEx or patterns for likely secrets?

Where can I get comprehensive lists of secret types?

Or am I doomed to writing a metric ton of RegEx and then being held responsible when something is missed?

  • "Find all the secrets!!" is not a specification. You are tying the impact of missing a secret to the specification of the secrets you need to find. Until you can define what you are looking for, you are looking for everything. So, this looks like an XY Problem. What's the problem that you are trying to solve?
    – schroeder
    Commented Aug 21, 2020 at 13:10
  • Agreed. I was just hoping that there was an existing corpus of well-researched secret text patterns. Or maybe a framework has already defined what I ought to be looking for as a best practice?
    Commented Aug 21, 2020 at 13:13
  • 2
    How does one create a regex for what should be random strings?
    – schroeder
    Commented Aug 21, 2020 at 13:42
  • 1
    Right, but they have defined what they are looking for. You are, by your statement, looking for "every common secret there is". If you know your infrastructure and product, then you should be able to refine your search to the secrets that are likely to be in code/configs your devs might commit.
    – schroeder
    Commented Aug 21, 2020 at 14:03
  • 2
    @QuantenGhost (a) "Are any AppSec vendors discovering secret patterns "in the wild" and then publishing their research": if they are AppSec vendors, they probably won't publish the IP they discover, since doing so would make it harder to keep selling their expertise. (b) "then being held responsible for when something is missed?": you will never be able to reliably discover every instance of a secret. All you can aim to do is catch as many as possible.
    Commented Aug 21, 2020 at 15:09

1 Answer

There are some existing published regular expression patterns which try to detect some categories of secrets. These usually rely on the standardized structure of private keys and tokens provided by concrete implementations.

See, for example, gitleaks.

Some patterns which are already available:

  • For matching AWS secret keys: (?i)aws(.{0,20})?(?-i)['\"][0-9a-zA-Z\/+]{40}['\"]

  • For matching a few types of asymmetric private keys: -----BEGIN ((EC|PGP|DSA|RSA|OPENSSH) )?PRIVATE KEY( BLOCK)?-----
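
As a rough illustration of how the two patterns above might be wired into a scanner, here is a minimal Python sketch. The names `SECRET_PATTERNS` and `find_secrets` are hypothetical and are not part of gitleaks or any other tool. One assumption to flag: the AWS pattern above switches case-insensitivity off mid-pattern with a bare `(?-i)`, which Python's `re` module does not accept, so the sketch scopes it with `(?i:...)` instead and drops the escaping backslashes, which Python does not need here.

```python
import re

# The two patterns quoted above, adapted for Python's `re` module.
# Assumption: the bare (?-i) switch is replaced by a scoped (?i:...) group.
SECRET_PATTERNS = {
    "aws_secret_key": re.compile(
        r"""(?i:aws(.{0,20})?)['"][0-9a-zA-Z/+]{40}['"]"""
    ),
    "private_key": re.compile(
        r"-----BEGIN ((EC|PGP|DSA|RSA|OPENSSH) )?PRIVATE KEY( BLOCK)?-----"
    ),
}


def find_secrets(text):
    """Return a (line_number, pattern_name) pair for every line/pattern match."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, name))
    return findings
```

Calling `find_secrets` on the contents of a changed file, or on the added lines of a diff, yields the line numbers to report. Like any pattern-based check, it only catches secrets that follow a recognizable structure.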

Of course, there is no guarantee that all secrets will be found. It may be worth considering an alternative solution: education. By making developers more aware of the dangers of publishing secrets to repositories, they are less likely to commit such mistakes.
