[Java] Document fixes for deserialization vulnerabilities by framework #11700

JLLeitschuh · 2022-12-14T19:37:08Z

JLLeitschuh · 2022-12-14T19:41:23Z

java/ql/src/Security/CWE/CWE-502/UnsafeDeserialization.qhelp

+        <tr>
+            <td>Kryo</td>
+            <td>com.esotericsoftware:kryo and com.esotericsoftware:kryo5</td>
+            <td>com.esotericsoftware:kryo versions after 5.0 Yes; com.esotericsoftware:kryo5 Yes</td>
+            <td>Don't call <code>com.esotericsoftware.kryo(5).Kryo#setRegistrationRequired</code> with the argument <code>false</code>.</td>
+        </tr>


I did some research on this and Kryo is no longer vulnerable by default. It seems that sometime between the end of the 4.x release line and the latest release this was fixed to be secure by default.

Currently, the Kryo query still present in CodeQL checks for this incorrectly.

CodeQL currently looks for calls to setRegistrationRequired(true) and treats only those uses as safe. In current Kryo versions, though, the user has to call setRegistrationRequired(false) to become vulnerable. The logic in CodeQL needs to be inverted to be correct.

I'm trying to track down the exact version in which this was fixed. I've reached out to the maintainers here: EsotericSoftware/kryo#929

Confirmed with the maintainer that versions 5.0.0 and later are all secure-by-default.

JLLeitschuh · 2022-12-14T19:43:01Z

java/ql/src/Security/CWE/CWE-502/UnsafeDeserialization.qhelp

+        <tr>
+            <td>SnakeYAML</td>
+            <td>org.yaml:snakeyaml</td>
+            <td><a href="https://bitbucket.org/snakeyaml/snakeyaml/wiki/CVE%20&amp;%20NIST.md">No</a>. <a href="https://bitbucket.org/snakeyaml/snakeyaml/issues/561/cve-2022-1471-vulnerability-in">Maintainer response</a>.</td>
+            <td>Instantiate the <code>org.yaml.snakeyaml.Yaml</code> instance explicitly with an instance of <code>org.yaml.snakeyaml.constructor.SafeConstructor</code> as an argument.</td>
+        </tr>


This has been my pet project the past two weeks. I'm still working to convince the maintainer to make this secure by default. If this does end up happening, I'll update this documentation.

smowton · 2022-12-16T11:18:56Z

@JLLeitschuh we would rather not have specific research re: when and how particular libs are vulnerable in qhelp files -- it would be better to get documentation such as this published by a security research organisation (e.g. integrated into https://cheatsheetseries.owasp.org/cheatsheets/Deserialization_Cheat_Sheet.html ?), then cite that from our qhelp.

JLLeitschuh · 2022-12-16T15:11:32Z

This seems like a contradiction. All of the QHelp files have bad and good examples. How is this any different?

This knowledge is currently heavily scattered across the internet and very difficult to track down. Also, the current CodeQL deserialization QHelp is leading end users to incorrect conclusions. As an example, the Oracle team misunderstood the SnakeYaml vulnerability because of CodeQL's documentation.

https://www.websec.ca/publication/Blog/CVE-2022-21404-Another-story-of-developers-fixing-vulnerabilities-unknowingly-because-of-CodeQL

smowton · 2022-12-16T16:56:39Z

@JLLeitschuh I think the goal is that qhelp should contain a general example of the class of vulnerability and how it should be handled, but not a library-by-library how-to guide, since the latter will get out of date and is better hosted by an organisation that's particularly in that business, e.g. OWASP. I think you should get it hosted by a reputable authority and then link it from our qhelp using phrasing like "...for specific advice about how to secure a particular library, see..."

JLLeitschuh · 2022-12-16T17:17:04Z

> I think the goal is that qhelp should contain a general example of the class of vulnerability and how it should be handled, but not a library-by-library how-to guide, since the latter will get out of date and is better hosted by an organization that's particularly in that business, e.g. OWASP.

This is exactly what CodeQL already does. As QL query authors, you have to encode in the query what "safe" means, otherwise the query will issue false positives all over the place. The problem here is that the knowledge encoded in the query is not captured in the documentation associated with the query. This seems wrong.

As an end user, why am I getting this alert? If you have two uses of some chunk of code in your codebase, why is one getting flagged, and the other isn't? Most likely, this is because of intentional filtering encoded into the ql query. Even as someone who writes ql, it can be quite complicated to figure out why one chunk of code is/isn't being flagged.

> I think you should get it hosted by a reputable authority

GitHub has made itself this reputable authority by flagging code as vulnerable. It is the responsibility of the query authors to be able to explain, both in CodeQL and in the documentation for that query, why the vulnerability was flagged. CodeQL (either through the documentation or through the query results) should be able to communicate this per-library.

TL;DR: GitHub and CodeQL are already encoding this information in the query, even if it is a little out of date. If CodeQL has the responsibility of flagging a vulnerability, the documentation should be able to adequately describe why it is flagging it as vulnerable.

JLLeitschuh · 2022-12-16T17:28:58Z

> the documentation should be able to adequately describe why it is flagging it as vulnerable.

If you look at the queries, this is often because the user has/hasn't done something to make their use case vulnerable. That information about what they need to do/haven't done is essential to actually fixing the vulnerability. Again, CodeQL is capturing this in the query, but isn't articulating this to the end user.

To provide a concrete example:

codeql/java/ql/lib/semmle/code/java/security/UnsafeDeserializationQuery.qll

Lines 133 to 142 in e629568

    
    exists(Method m | m = ma.getMethod() |
      m instanceof ObjectInputStreamReadObjectMethod and
      sink = ma.getQualifier() and
      not exists(DataFlow::ExprNode node |
        node.getExpr() = sink and
        node.getTypeBound()
            .(RefType)
            .hasQualifiedName("org.apache.commons.io.serialization", "ValidatingObjectInputStream")
      )
      or

CodeQL has already encoded in the query that this is a safe way to do object deserialization, but the documentation doesn't reflect this.
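To make the idea concrete, here is a minimal sketch of an allowlisting stream built only on the JDK. It is an illustration of the general technique, not the commons-io `ValidatingObjectInputStream` that the query recognises, so CodeQL may still flag code that uses it. The class names `AllowlistObjectInputStream` and `AllowlistDemo` are hypothetical, and the sketch assumes Java 9 or later for `Set.of`.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InvalidClassException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Set;

/** Hypothetical stream that only resolves classes named on an allowlist. */
class AllowlistObjectInputStream extends ObjectInputStream {
    private final Set<String> allowed;

    AllowlistObjectInputStream(InputStream in, Set<String> allowed) throws IOException {
        super(in);
        this.allowed = allowed;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        // Reject the class descriptor before the class is loaded or instantiated.
        if (!allowed.contains(desc.getName())) {
            throw new InvalidClassException(desc.getName(), "class is not on the allowlist");
        }
        return super.resolveClass(desc);
    }
}

class AllowlistDemo {
    /** Serializes a value to bytes, standing in for data received from a peer. */
    static byte[] serialize(Object value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if the stream deserializes using only allowlisted classes. */
    static boolean isAccepted(byte[] data, Set<String> allowed) {
        try (ObjectInputStream in =
                 new AllowlistObjectInputStream(new ByteArrayInputStream(data), allowed)) {
            in.readObject();
            return true;
        } catch (InvalidClassException e) {
            return false; // a class outside the allowlist was encountered
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Integer's superclass Number is also serializable, so both names are needed.
        Set<String> allowed = Set.of("java.lang.Integer", "java.lang.Number");
        System.out.println(isAccepted(serialize(Integer.valueOf(42)), allowed));    // prints "true"
        System.out.println(isAccepted(serialize(new ArrayList<String>()), allowed)); // prints "false"
    }
}
```

The allowlist has to name every class descriptor in the stream, including serializable superclasses, which is why a per-library explanation in the documentation would help: the details that make a use "safe" are easy to get wrong.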

Another example:

codeql/java/ql/lib/semmle/code/java/security/UnsafeDeserializationQuery.qll

Lines 181 to 189 in e629568

    
    ma.getMethod() instanceof ObjectMapperReadMethod and
    sink = ma.getArgument(0) and
    (
      exists(UnsafeTypeConfig config | config.hasFlowToExpr(ma.getAnArgument()))
      or
      exists(EnableJacksonDefaultTypingConfig config | config.hasFlowToExpr(ma.getQualifier()))
      or
      hasArgumentWithUnsafeJacksonAnnotation(ma)
    ) and

CodeQL has included in the query a search for the code that makes Jackson vulnerable, but this also doesn't get captured anywhere in the documentation.

I get that it's difficult to keep documentation and code in sync. If that's truly the problem, then maybe the "why" should be more explicitly communicated in the query results. But it should be somewhere.

Without this "why", end users are far more likely to simply consider the result a false positive. They can't be expected to do extensive research on every vulnerability. As an example, I've spent the past week learning about deserialization vulnerabilities and the various gadget chains that exist.

If I were able to wave a magic wand, I would completely rewrite this query so that every framework is its own query result and each framework is documented independently. As an end user this would be the clearest, because the documentation would be more narrowly focused on my particular use case.

github-actions · 2022-12-19T09:10:12Z

QHelp previews:

java/ql/src/Security/CWE/CWE-502/UnsafeDeserialization.qhelp

Deserialization of user-controlled data

Deserializing untrusted data using any deserialization framework that allows the construction of arbitrary serializable objects is easily exploitable and in many cases allows an attacker to execute arbitrary code. Even before a deserialized object is returned to the caller of a deserialization method a lot of code may have been executed, including static initializers, constructors, and finalizers. Automatic deserialization of fields means that an attacker may craft a nested combination of objects on which the executed initialization code may have unforeseen effects, such as the execution of arbitrary code.

There are many different serialization frameworks. This query currently supports Kryo, XmlDecoder, XStream, SnakeYaml, JYaml, JsonIO, YAMLBeans, HessianBurlap, Castor, Burlap, Jackson, Jabsorb, Jodd JSON, Flexjson, Gson and Java IO serialization through ObjectInputStream/ObjectOutputStream.

Recommendation

Avoid deserialization of untrusted data if at all possible. If the architecture permits it then use other formats instead of serialized objects, for example JSON or XML. However, these formats should not be deserialized into complex objects because this provides further opportunities for attack. For example, XML-based deserialization attacks are possible through libraries such as XStream and XmlDecoder.

Alternatively, a tightly controlled whitelist can limit the vulnerability of code, but be aware of the existence of so-called Bypass Gadgets, which can circumvent such protection measures.

Recommendations specific to particular frameworks supported by this query:

FastJson - com.alibaba:fastjson

Secure by Default: Partially
Recommendation: Call com.alibaba.fastjson.parser.ParserConfig#setSafeMode with the argument true before deserializing untrusted data.

FasterXML - com.fasterxml.jackson.core:jackson-databind

Secure by Default: Yes
Recommendation: Don't call com.fasterxml.jackson.databind.ObjectMapper#enableDefaultTyping and don't annotate any object fields with com.fasterxml.jackson.annotation.JsonTypeInfo passing either the CLASS or MINIMAL_CLASS values to the annotation. Read this guide.

Kryo - com.esotericsoftware:kryo and com.esotericsoftware:kryo5

Secure by Default: Yes for com.esotericsoftware:kryo5 and for com.esotericsoftware:kryo >= v5.0.0
Recommendation: Don't call com.esotericsoftware.kryo(5).Kryo#setRegistrationRequired with the argument false on any Kryo instance that may deserialize untrusted data.

ObjectInputStream - Java Standard Library

Secure by Default: No
Recommendation: Use a validating input stream, such as org.apache.commons.io.serialization.ValidatingObjectInputStream.

SnakeYAML - org.yaml:snakeyaml

Secure by Default: No
Recommendation: Pass an instance of org.yaml.snakeyaml.constructor.SafeConstructor to org.yaml.snakeyaml.Yaml's constructor before using it to deserialize untrusted data.

XML Decoder - Standard Java Library

Secure by Default: No
Recommendation: Do not use with untrusted user input.

Example

The following example calls readObject directly on an ObjectInputStream that is constructed from untrusted data, and is therefore inherently unsafe.

public class MyObject {
  public int field;
  MyObject(int field) {
    this.field = field;
  }
}

public MyObject deserialize(Socket sock) throws IOException, ClassNotFoundException {
  try (ObjectInputStream in = new ObjectInputStream(sock.getInputStream())) {
    return (MyObject) in.readObject(); // unsafe
  }
}

Rewriting the communication protocol to only rely on reading primitive types from the input stream removes the vulnerability.

public MyObject deserialize(Socket sock) throws IOException {
  try (DataInputStream in = new DataInputStream(sock.getInputStream())) {
    return new MyObject(in.readInt());
  }
}

References

OWASP vulnerability description: Deserialization of untrusted data.
OWASP guidance on deserializing objects: Deserialization Cheat Sheet.
Talks by Chris Frohoff & Gabriel Lawrence: AppSecCali 2015: Marshalling Pickles - how deserializing objects will ruin your day, OWASP SD: Deserialize My Shorts: Or How I Learned to Start Worrying and Hate Java Object Deserialization.
Alvaro Muñoz & Christian Schneider, RSAConference 2016: Serial Killer: Silently Pwning Your Java Endpoints.
SnakeYaml documentation on deserialization: SnakeYaml deserialization.
Hessian deserialization and related gadget chains: Hessian deserialization.
Castor and Hessian java deserialization vulnerabilities: Castor and Hessian deserialization.
Remote code execution in JYaml library: JYaml deserialization.
JsonIO deserialization vulnerabilities: JsonIO deserialization.
Research by Moritz Bechler: Java Unmarshaller Security - Turning your data into code execution
Blog posts by the developer of Jackson libraries: On Jackson CVEs: Don’t Panic — Here is what you need to know Jackson 2.10: Safe Default Typing
Jabsorb documentation on deserialization: Jabsorb JSON Serializer.
Jodd JSON documentation on deserialization: JoddJson Parser.
RCE in Flexjson: Flexjson deserialization.
Android Intent deserialization vulnerabilities with GSON parser: Insecure use of JSON parsers.
Common Weakness Enumeration: CWE-502.

atorralba · 2022-12-19T09:12:51Z

I made some minor changes so that the query help file renders correctly, I hope you don't mind @JLLeitschuh. That should help with the review to determine whether to accept this.

coadaflorin · 2022-12-20T09:50:47Z

@JLLeitschuh thanks for gathering this data and creating this PR. The team and I discussed and we agree that this is helpful information for developers to have when trying to understand what the problem is and how to fix it. At the moment the way in which the table is formatted is creating horizontal scrolling and hiding some essential information from the user. Is it possible to try a format that requires less space horizontally?

We'll also plan some adjustments to avoid showing findings for libraries which are secure by default that developers will report as False Positives.

JLLeitschuh · 2022-12-24T01:06:03Z

Hi All,

I'm on vacation until the 28th, so I won't have a response/fix until then. Do you have any suggestions for alternative formatting? I'm open to anything. Unfortunately, AFAIK, there is still no way for external contributors to preview QHelp files in their rendered format. If this is now possible, please let me know and I'm happy to do some iterating to see if I can come up with something.

coadaflorin · 2023-01-03T10:52:28Z

There is a bot response that shows what the message would look like:
#11700 (comment)

Clicking the twisty should reveal the rendered response. I'll reach out to the internal documentation team and see if they have a suggestion for us.

smowton

Some style suggestions, independent of the question of table formatting

smowton · 2023-01-03T11:30:47Z

java/ql/src/Security/CWE/CWE-502/UnsafeDeserialization.qhelp

+            <td>Yes</td>
+            <td>
+                Don't call <code>com.fasterxml.jackson.databind.ObjectMapper#enableDefaultTyping</code> and don't annotate any object fields with <code>com.fasterxml.jackson.annotation.JsonTypeInfo</code> passing either the <code>CLASS</code> or <code>MINIMAL_CLASS</code> values to the annotation.
+                Read <a href="https://cowtowncoder.medium.com/jackson-2-10-safe-default-typing-2d018f0ce2ba">this guide</a>.


Can we find a more authoritative source than a Medium post?

Cowtowncoder is the primary maintainer of FasterXML's Jackson.

JLLeitschuh · 2023-01-05T19:14:10Z

Update: SnakeYaml is actually fixing the library so it's not insecure-by-default. 🎉

https://bitbucket.org/snakeyaml/snakeyaml/issues/561/cve-2022-1471-vulnerability-in

JLLeitschuh · 2023-01-05T19:17:05Z

@coadaflorin the links in the table don't seem to be showing up in the rendered comment. Is that an issue with the markdown generator?

Also, is that comment live updating as the PR gets updated, or does someone from GitHub need to have it get regenerated?

coadaflorin · 2023-01-06T14:07:39Z

I wrote a sample here with how we could turn that table into a list. I created a branch here. If you like this format, let's use it here.

As for the generated view, it's generated automatically by an action when a pull request touches a .qhelp file. I don't think the action is triggered again when we make changes. I generated the markdown for the file I wrote manually using the CLI. Instructions on how to do that are here.

Here's what the Recommendation would look like as a list versus a table:

As a list

Recommendation

Avoid deserialization of untrusted data if at all possible. If the architecture permits it then use other formats instead of serialized objects, for example JSON or XML. However, these formats should not be deserialized into complex objects because this provides further opportunities for attack. For example, XML-based deserialization attacks are possible through libraries such as XStream and XmlDecoder.

Alternatively, a tightly controlled whitelist can limit the vulnerability of code, but be aware of the existence of so-called Bypass Gadgets, which can circumvent such protection measures.

Recommendations specific to particular frameworks supported by this query:

FastJson - com.alibaba:fastjson

Secure by Default: Partially
Recommendation: Call com.alibaba.fastjson.parser.ParserConfig#setSafeMode with the argument true before deserializing untrusted data.

FasterXML - com.fasterxml.jackson.core:jackson-databind

Secure by Default: Yes
Recommendation: Don't call com.fasterxml.jackson.databind.ObjectMapper#enableDefaultTyping and don't annotate any object fields with com.fasterxml.jackson.annotation.JsonTypeInfo passing either the CLASS or MINIMAL_CLASS values to the annotation. Read this guide.

Kryo - com.esotericsoftware:kryo and com.esotericsoftware:kryo5

Secure by Default: Yes for com.esotericsoftware:kryo5 and for com.esotericsoftware:kryo >= v5.0.0
Recommendation: Don't call com.esotericsoftware.kryo(5).Kryo#setRegistrationRequired with the argument false on any Kryo instance that may deserialize untrusted data.

ObjectInputStream - Java Standard Library

Secure by Default: No
Recommendation: Use a validating input stream, such as org.apache.commons.io.serialization.ValidatingObjectInputStream.

SnakeYAML - org.yaml:snakeyaml

Secure by Default: No
Recommendation: Pass an instance of org.yaml.snakeyaml.constructor.SafeConstructor to org.yaml.snakeyaml.Yaml's constructor before using it to deserialize untrusted data.

XML Decoder - Standard Java Library

Secure by Default: No
Recommendation: Do not use with untrusted user input.

As a table

Recommendation

Avoid deserialization of untrusted data if at all possible. If the architecture permits it then use other formats instead of serialized objects, for example JSON or XML. However, these formats should not be deserialized into complex objects because this provides further opportunities for attack. For example, XML-based deserialization attacks are possible through libraries such as XStream and XmlDecoder.

Alternatively, a tightly controlled whitelist can limit the vulnerability of code, but be aware of the existence of so-called Bypass Gadgets, which can circumvent such protection measures.

Recommendations specific to particular frameworks supported by this query:

| Project | Maven Coordinates | Secure by Default | Recommendation |
| --- | --- | --- | --- |
| XMLDecoder | Java Standard Library | No | Do not use with untrusted user input. |
| ObjectInputStream | Java Standard Library | No | Use a validating input stream, such as `org.apache.commons.io.serialization.ValidatingObjectInputStream`. |
| FastJson | com.alibaba:fastjson | Partially | Call `com.alibaba.fastjson.parser.ParserConfig#setSafeMode` with the argument `true` before deserializing untrusted data. |
| SnakeYAML | org.yaml:snakeyaml | No (maintainer response) | Pass an instance of `org.yaml.snakeyaml.constructor.SafeConstructor` to `org.yaml.snakeyaml.Yaml`'s constructor before using it to deserialize untrusted data. |
| FasterXML jackson-databind | com.fasterxml.jackson.core:jackson-databind | Yes | Don't call `com.fasterxml.jackson.databind.ObjectMapper#enableDefaultTyping` and don't annotate any object fields with `@JsonTypeInfo(CLASS)` or `@JsonTypeInfo(MINIMAL_CLASS)` if untrusted data may be deserialized. Read this guide. |
| Kryo | com.esotericsoftware:kryo and com.esotericsoftware:kryo5 | Yes for com.esotericsoftware:kryo >= 5.0.0 and for com.esotericsoftware:kryo5 | Don't call `com.esotericsoftware.kryo(5).Kryo#setRegistrationRequired` with the argument `false` on any `Kryo` instance that may deserialize untrusted data. |

JLLeitschuh · 2023-01-06T16:04:26Z

I think I'm inclined to go with the list format you proposed. Do you want to create a PR against my branch and I'll merge it into this PR?

coadaflorin · 2023-01-09T12:59:41Z

I think I managed to create the correct PR here
Hope that helps :)

JLLeitschuh · 2023-01-09T16:01:14Z

It looks like your PR has conflicts as-is. Can you resolve those? After that, I'm happy to merge it

coadaflorin · 2023-01-10T10:07:41Z

Looks like the list is in, I'm all good on my side.

JLLeitschuh requested a review from a team as a code owner December 14, 2022 19:37

github-actions bot added documentation Java labels Dec 14, 2022

JLLeitschuh force-pushed the doc/JLL/improve-java-unsafe-deserialization-documentation branch from c74e10b to 2efb5e9 Compare December 14, 2022 19:37

JLLeitschuh commented Dec 14, 2022

JLLeitschuh force-pushed the doc/JLL/improve-java-unsafe-deserialization-documentation branch 2 times, most recently from e6a9c6b to 4e60f89 Compare December 14, 2022 20:01

smowton reviewed Jan 3, 2023

JLLeitschuh and others added 4 commits January 10, 2023 11:18

[Java] Document fixes for deserialization vulnerabilities by framework

3fa11c2

Related github#11603

Update UnsafeDeserialization.qhelp

b7364f5

Move the table under <recommendation>, minor fixes.

Apply suggestions from code review

1d7881e

Co-authored-by: Chris Smowton <smowton@github.com>

suggestions in list format

4c1c12d

JLLeitschuh force-pushed the doc/JLL/improve-java-unsafe-deserialization-documentation branch from ce816c2 to 4c1c12d Compare January 10, 2023 16:18

atorralba previously approved these changes Jan 12, 2023

atorralba added the ready-for-doc-review This PR requires and is ready for review from the GitHub docs team. label Jan 12, 2023

Spelling

09d8a50

smowton dismissed atorralba’s stale review via 09d8a50 January 12, 2023 17:46

smowton approved these changes Jan 12, 2023

smowton merged commit 8aa2c23 into github:main Jan 12, 2023

JLLeitschuh deleted the doc/JLL/improve-java-unsafe-deserialization-documentation branch January 17, 2023 15:58
