CodeQL documentation

Deserialization of user-controlled data

ID: py/unsafe-deserialization
Kind: path-problem
Security severity: 9.8
Severity: error
Precision: high
Tags:
   - external/cwe/cwe-502
   - security
   - serialization
Query suites:
   - python-code-scanning.qls
   - python-security-extended.qls
   - python-security-and-quality.qls

Deserializing untrusted data with any framework that can construct arbitrary serializable objects is easily exploitable and, in many cases, lets an attacker execute arbitrary code. Even before a deserialized object is returned to the caller of a deserialization method, a substantial amount of code may already have run, including static initializers, constructors, and finalizers. Because fields are deserialized automatically, an attacker can craft a nested combination of objects whose initialization code has unforeseen effects, such as the execution of arbitrary code.
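As a rough illustration of code running during deserialization, the sketch below uses Python's pickle module. The Payload class is hypothetical and is not part of the query. Its __reduce__ method names a harmless built-in (int) where an attacker would typically name something like os.system.

```python
import pickle


class Payload:
    """Hypothetical attacker-controlled class, for illustration only."""

    def __reduce__(self):
        # pickle stores this (callable, args) pair and calls it on load.
        # A real exploit would name os.system or similar instead of int.
        return (int, ("42",))


data = pickle.dumps(Payload())

# Loading calls int("42"); no Payload instance is ever created.
result = pickle.loads(data)
print(type(result).__name__, result)
```

The receiving side does not need the Payload class at all: the byte string alone tells pickle which callable to invoke and with which arguments.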

There are many different serialization frameworks. This query currently supports Pickle, Marshal, and YAML.

Recommendation

Avoid deserializing untrusted data if at all possible. If the architecture permits, use a data-only format such as JSON instead of serialized objects.
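One way to follow this advice is to decode JSON into plain values and then build the application object explicitly. This is a minimal sketch, not a prescribed pattern: the Profile class, the load_profile function, and the field names are assumptions made up for the example.

```python
import json
from dataclasses import dataclass


@dataclass
class Profile:
    """Hypothetical application object, for illustration only."""
    name: str
    age: int


def load_profile(raw: str) -> Profile:
    # json.loads only yields dicts, lists, strings, numbers, booleans and None.
    # Malformed input raises json.JSONDecodeError, a subclass of ValueError.
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    name = data.get("name")
    age = data.get("age")
    # bool is a subclass of int, so it is rejected explicitly.
    if not isinstance(name, str) or isinstance(age, bool) or not isinstance(age, int):
        raise ValueError("unexpected field types")
    return Profile(name=name, age=age)


profile = load_profile('{"name": "Ada", "age": 36}')
print(profile)
```

Because the object is constructed by the application's own code, the input can only influence field values, not which code runs.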

If you need to use YAML, use the yaml.safe_load function.

Example

The following example calls pickle.loads directly on a value provided by an incoming HTTP request. Pickle then creates a new value from untrusted data, and is therefore inherently unsafe.

    from django.conf.urls import url
    import pickle

    def unsafe(pickled):
        return pickle.loads(pickled)

    urlpatterns = [
        url(r'^(?P<object>.*)$', unsafe)
    ]

Changing the code to use json.loads instead of pickle.loads removes the vulnerability.

    from django.conf.urls import url
    import json

    def safe(pickled):
        return json.loads(pickled)

    urlpatterns = [
        url(r'^(?P<object>.*)$', safe)
    ]
