-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Java] Document fixes for deserialization vulnerabilities by framework #11700
[Java] Document fixes for deserialization vulnerabilities by framework #11700
Conversation
c74e10b
to
2efb5e9
Compare
<tr> | ||
<td>Kryo</td> | ||
<td>com.esotericsoftware:kryo and com.esotericsoftware:kryo5</td> | ||
<td>com.esotericsoftware:kryo versions after 5.0 Yes; com.esotericsoftware:kryo5 Yes</td> | ||
<td>Don't call <code>com.esotericsoftware.kryo(5).Kryo#setRegistrationRequired</code> with the argument <code>false</code>.</td> | ||
</tr> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some research on this and Kryo is no longer vulnerable by default. It seems that sometime between the end of the 4.x release line and the latest release this was fixed to be secure by default.
Currently, the Kryo query still present in CodeQL checks for this incorrectly.
CodeQL currently attempts to detect calls to setRegistrationRequired(true)
for the use case to be safe. But actually, the library currently requires the user to call setRegistrationRequired(false)
to be vulnerable. The logic in CodeQL needs to be inverted to be correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm trying to track down the exact version this was fixed. I've reached out to the maintainers here: EsotericSoftware/kryo#929
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Confirmed with the maintainer that versions 5.0.0 and later are all secure-by-default.
<tr> | ||
<td>SnakeYAML</td> | ||
<td>org.yaml:snakeyaml</td> | ||
<td><a href="https://bitbucket.org/snakeyaml/snakeyaml/wiki/CVE%20&%20NIST.md">No</a>. <a href="https://bitbucket.org/snakeyaml/snakeyaml/issues/561/cve-2022-1471-vulnerability-in">Maintainer response</a>.</td> | ||
<td>Instantiate the <code>org.yaml.snakeyaml.Yaml</code> instance explicitly with an instance of <code>org.yaml.snakeyaml.constructor.SafeConstructor</code> as an argument.</td> | ||
</tr> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This has been my pet project the past two weeks. I'm still working to convince the maintainer to make this secure by default. If this does end up happening, I'll update this documentation.
e6a9c6b
to
4e60f89
Compare
@JLLeitschuh we would rather not have specific research re: when and how particular libs are vulnerable in qhelp files -- it would be better to get documentation such as this published by a security research organisation (e.g. integrated into https://cheatsheetseries.owasp.org/cheatsheets/Deserialization_Cheat_Sheet.html ?), then cite that from our qhelp. |
This seems like a contradiction. All of the QHelp files have bad and good examples. How is this any different? This knowledge is currently heavily scattered across the internet and very difficult to track down. Also, the current CodeQL deserialization QHelp is resulting in incorrect understandings by end users. As an example, the Oracle team incorrectly understood the SnakeYaml vulnerability due to CodeQLs documentation. |
@JLLeitschuh I think the goal is that qhelp should contain a general example of the class of vulnerability and how it should be handled, but not a library-by-library how-to guide, since the latter will get out of date and is better hosted by an organisation that's particularly in that business, e.g. OWASP. I think you should get it hosted by a reputable authority and then link it from our qhelp using phrasing like "...for specific advice about how to secure a particular library, see..." |
This is exactly what CodeQL already does. As QL query authors, you have to encode in the query what "safe" means, otherwise the query will issue false positives all over the place. The problem here, is that that knowledge that is encoded in the query, is not captured in the documentation associated with the query. This seems wrong. As an end user, why am I getting this alert? If you have two uses of some chunk of code in your codebase, why is one getting flagged, and the other isn't? Most likely, this is because of intentional filtering encoded into the ql query. Even as someone who writes ql, it can be quite complicated to figure out why one chunk of code is/isn't being flagged.
GitHub has made itself this reputable authority by flagging code as vulnerable. It is the responsibility of the query authors to be able to explain, both in CodeQL, and in the documentation that query flags, why the vulnerability was flagged. CodeQL (either through the documentation or through the query results) should be able to communicate this per-library. TL;DR: GitHub and CodeQL is already encoding this information in the query, even if it is a little out of date. If CodeQL has the responsibility of flagging a vulnerability, the documentation should be able to adequately describe why it is flagging it as vulnerable. |
If you look at the queries, this is often because the user has/hasn't done something to make their use case vulnerable. That information about what they need to do/haven't done is essential to actually fixing the vulnerability. Again, CodeQL is capturing this in the query, but isn't articulating this to the end user. To provide a concrete example: codeql/java/ql/lib/semmle/code/java/security/UnsafeDeserializationQuery.qll Lines 133 to 142 in e629568
CodeQL has already encoded in the query that this is a safe way to do object deserialization, but the documentation doesn't reflect this. Another example: codeql/java/ql/lib/semmle/code/java/security/UnsafeDeserializationQuery.qll Lines 181 to 189 in e629568
CodeQl has included in the query a search for the code that makes Jackson vulnerable, but this also doesn't get captured anywhere in the documentation. I get that it's difficult to keep documentation and code in sync. If that's truly the problem then, maybe the why should be more explicitly communicated in the query results. But it should be somewhere. Without this why end users are far more likely to simply consider the result a false positive. They can't be expected to do extensive research on every vulnerability. As an example, I've spent the past week learning about deserialization vulnerabilities and the various gadget chains that exist. If I were able to wave a magic wand, I would completely re-write this query so that every framework is it's own query result and each framework is documented independently. As an end-user this would be the most clear because the documentation would be more narrowly focused on my particular use case. |
QHelp previews: java/ql/src/Security/CWE/CWE-502/UnsafeDeserialization.qhelpDeserialization of user-controlled dataDeserializing untrusted data using any deserialization framework that allows the construction of arbitrary serializable objects is easily exploitable and in many cases allows an attacker to execute arbitrary code. Even before a deserialized object is returned to the caller of a deserialization method a lot of code may have been executed, including static initializers, constructors, and finalizers. Automatic deserialization of fields means that an attacker may craft a nested combination of objects on which the executed initialization code may have unforeseen effects, such as the execution of arbitrary code. There are many different serialization frameworks. This query currently supports Kryo, XmlDecoder, XStream, SnakeYaml, JYaml, JsonIO, YAMLBeans, HessianBurlap, Castor, Burlap, Jackson, Jabsorb, Jodd JSON, Flexjson, Gson and Java IO serialization through RecommendationAvoid deserialization of untrusted data if at all possible. If the architecture permits it then use other formats instead of serialized objects, for example JSON or XML. However, these formats should not be deserialized into complex objects because this provides further opportunities for attack. For example, XML-based deserialization attacks are possible through libraries such as XStream and XmlDecoder. Alternatively, a tightly controlled whitelist can limit the vulnerability of code, but be aware of the existence of so-called Bypass Gadgets, which can circumvent such protection measures. Recommendations specific to particular frameworks supported by this query: FastJson -
FasterXML -
Kryo -
ObjectInputStream -
SnakeYAML -
XML Decoder -
ExampleThe following example calls public MyObject {
public int field;
MyObject(int field) {
this.field = field;
}
}
public MyObject deserialize(Socket sock) {
try(ObjectInputStream in = new ObjectInputStream(sock.getInputStream())) {
return (MyObject)in.readObject(); // unsafe
}
} Rewriting the communication protocol to only rely on reading primitive types from the input stream removes the vulnerability. public MyObject deserialize(Socket sock) {
try(DataInputStream in = new DataInputStream(sock.getInputStream())) {
return new MyObject(in.readInt());
}
} References
|
I made some minor changes so that the query help file renders correctly, I hope you don't mind @JLLeitschuh. That should help with the review to determine whether to accept this. |
@JLLeitschuh thanks for gathering this data and creating this PR. The team and I discussed and we agree that this is helpful information for developers to have when trying to understand what the problem is and how to fix it. At the moment the way in which the table is formatted is creating horizontal scrolling and hiding some essential information from the user. Is it possible to try a format that requires less space horizontally? We'll also plan some adjustments to avoid showing findings for libraries which are secure by default that developers will report as False Positives. |
Hi All, I'm on vaacation until the 28th so won't have a response/fix until then. Do you have any suggestions for alternative formatting? I'm open to anything. Unfortunately, AFAIK, there is still no way for external contributors to preview QHelp files in their rendered format. If this is now possible, please let me know and I'm happy to do some iterating to see if I can come up with something |
There is a bot response that shows how the message would look like: Clicking the twisty should reveal the rendered response. I'll reach out to the internal documentation team and see if they have a suggestion for us. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some style suggestions, independent of the question of table formatting
<td>Yes</td> | ||
<td> | ||
Don't call <code>com.fasterxml.jackson.databind.ObjectMapper#enableDefaultTyping</code> and don't annotate any object fields with <code>com.fasterxml.jackson.annotation.JsonTypeInfo</code> passing either the <code>CLASS</code> or <code>MINIMAL_CLASS</code> values to the annotation. | ||
Read <a href="https://cowtowncoder.medium.com/jackson-2-10-safe-default-typing-2d018f0ce2ba">this guide</a>. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we find a more authoritative source than a Medium post?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cowtowncoder is the primary maintainer of FasterJackson
Update: SnakeYaml is actually fixing the library so it's not insecure-by-default. 🎉 https://bitbucket.org/snakeyaml/snakeyaml/issues/561/cve-2022-1471-vulnerability-in |
@coadaflorin the links in the table don't seem to be showing up in the commented rendere. Is that an issue with the markdown generator? Also, is that comment live updating as the PR gets updated, or does someone from GitHub need to have it get regenerated? |
I wrote a sample here with how we could turn that table into a list. I created a branch here. If you like this format, let's use it here. As for the generated view, it's automatically by an action generated when a pull request is triggered against a Here's how the As a listRecommendationAvoid deserialization of untrusted data if at all possible. If the architecture permits it then use other formats instead of serialized objects, for example JSON or XML. However, these formats should not be deserialized into complex objects because this provides further opportunities for attack. For example, XML-based deserialization attacks are possible through libraries such as XStream and XmlDecoder. Alternatively, a tightly controlled whitelist can limit the vulnerability of code, but be aware of the existence of so-called Bypass Gadgets, which can circumvent such protection measures. Recommendations specific to particular frameworks supported by this query: FastJson -
FasterXML -
Kryo -
ObjectInputStream -
SnakeYAML -
XML Decoder -
As a tableRecommendationAvoid deserialization of untrusted data if at all possible. If the architecture permits it then use other formats instead of serialized objects, for example JSON or XML. However, these formats should not be deserialized into complex objects because this provides further opportunities for attack. For example, XML-based deserialization attacks are possible through libraries such as XStream and XmlDecoder. Alternatively, a tightly controlled whitelist can limit the vulnerability of code, but be aware of the existence of so-called Bypass Gadgets, which can circumvent such protection measures. Recommendations specific to particular frameworks supported by this query:
|
I think I'm inclined to go with the list format you proposed. Do you want to create a PR against my branch and I'll merge it into this PR? |
I think I managed to create the correct PR here |
It looks like your PR has conflicts as-is. Can you resolve those? After that, I'm happy to merge it |
Looks like the list is in, I'm all good on my side. |
Move the table under <recommendation>, minor fixes.
Co-authored-by: Chris Smowton <smowton@github.com>
ce816c2
to
4c1c12d
Compare
Related #11603