
Query fails with "DeserializationError: Object contains forbidden prototype property" when the response object has a "constructor" key #1408

Closed
diegomansua opened this issue Mar 4, 2021 · 19 comments · Fixed by #1414

Comments

@diegomansua

diegomansua commented Mar 4, 2021

🐛 Bug Report

Query fails with DeserializationError: Object contains forbidden prototype property.

This started happening when we upgraded from v7.3 to v7.11. This PR seems to be the cause: #1110

To Reproduce

Issue a query that results in constructor being an object key in the JSON response. In our case it was detected in a request to the Term Vectors API, but I guess it can happen in many other places, like in an aggregation named constructor, etc. Example from the Term Vectors API response that affected us:

{
  "constructor": {
    "doc_freq": 1,
    "ttf": 1,
    "term_freq": 1
  }
}

See #1110 (comment) for a full example.

Expected behavior

No errors thrown from a perfectly valid response.

Your Environment

  • node version: 14
  • @elastic/elasticsearch version: 7.11
  • os: Linux
@delvedor
Member

delvedor commented Mar 8, 2021

Hello! The JSON parser aggressively blocks every object that might expose you to a prototype pollution attack.
You can disable this behavior by overriding the default serializer:

const sjson = require('secure-json-parse')
const { Client, Serializer, errors } = require('@elastic/elasticsearch')

class InsecureSerializer extends Serializer {
  deserialize (json) {
    let object
    try {
      object = sjson.parse(json, { constructorAction: 'ignore' })
    } catch (err) {
      throw new errors.DeserializationError(err.message, json)
    }
    return object
  }
}

const client = new Client({
  node: 'http://localhost:9200',
  Serializer: InsecureSerializer
})

See secure-json-parse for all the available options.
If possible, I would recommend avoiding using any dangerous JavaScript keys, such as constructor, prototype, and __proto__.
Recommended read: Square Brackets are the Enemy.

@diegomansua
Author

Thanks for your reply @delvedor. We're aware of the workaround (providing a custom serializer, although we actually went back to using the standard JSON.parse that was in the previous version of the lib we were using, 7.3).

I've read the two links you've provided, but what I think is missing here (or rather, in the original PR) is an example of how this could be an attack vector in Elasticsearch, especially when this is such an aggressive change that breaks perfectly working/valid/innocuous responses from ES (and, even worse, can randomly start happening in production at any time) and has been made the default behaviour. This is why I personally consider it a bug, or at the very least a breaking change (that wasn't flagged as such).

The examples in those links seem to talk about specific actions (merge/clone) when dealing with arbitrary input from external sources, rather than simply parsing a response from what I imagine in most cases is a trusted internal ElasticSearch instance.

If possible, I would recommend avoiding using any dangerous JavaScript keys, such as constructor, prototype, and __proto__.

This is not an option in our case. The example my colleague @matAtWork provided was just a simple way to reproduce, but when dealing with the actual use case that broke for us (Term Vectors API) the only way to avoid it would be to prevent the word constructor from being in our corpus (i.e. removing it before storing the data in ES), which is just not an option.

@delvedor
Member

delvedor commented Mar 8, 2021

is an example of how this could be an attack vector in Elasticsearch

You are right, an example could help.
This attack could happen in numerous ways; it really depends on your application. It could be a logging use case, where a log entry containing the malicious object is sent to Elasticsearch, or a case where you store some data in Elasticsearch without running any validation upfront.
The main issue with prototype pollution attacks is that you will not notice that it's happening until it's too late.
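To make that risk concrete, here is a minimal, self-contained sketch (the naiveMerge helper is a deliberately vulnerable stand-in for the kind of merge utility these attacks target, not code from any particular library) showing how data that JSON.parse handled safely can still pollute Object.prototype one step downstream:

```javascript
// A deliberately naive recursive merge, typical of vulnerable utility code.
function naiveMerge (target, source) {
  for (const key in source) {
    const value = source[key]
    if (typeof value === 'object' && value !== null) {
      if (typeof target[key] !== 'object' || target[key] === null) {
        target[key] = {}
      }
      naiveMerge(target[key], value)
    } else {
      target[key] = value
    }
  }
  return target
}

const payload = JSON.parse('{"__proto__": {"polluted": true}}')

// JSON.parse itself is safe: the result's prototype is untouched.
console.log(Object.getPrototypeOf(payload) === Object.prototype) // true

// Merging the parsed object, however, resolves target["__proto__"] to
// Object.prototype and writes straight onto it:
naiveMerge({}, payload)
console.log({}.polluted) // true: every plain object now inherits "polluted"
```

Nothing here requires a hostile serializer; the damage happens entirely in application code that trusts the parsed object.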

Given the enormous risks that an attack like this one could expose you to, we have preferred to make this option opt-out and offer the highest security level to this library's users. I'll update the documentation to make this clearer and show how to opt out of this behavior if you can't possibly do otherwise.

although we actually went back to using the standard JSON.parse that was in the previous version of the lib we were using, 7.3).

I strongly recommend updating your code with the code I've posted above, as it will partly protect you against this attack while still allowing your use case.

@matAtWork

@delvedor - I think you have failed to understand the concept of "Separation of concerns".

JSON is used in ES as a transport format - any security and resilience issues (for example fields that are sensitive or contain passwords) are not a concern of the transport, and not addressed by your library.

Where are the "permitted" values documented? How does a developer know that certain data elements are "dangerous"? What about future-proofing - will more discarded values be added in the future, potentially breaking production systems in unknowable and unpredictable ways?

In order to avoid a theoretical issue, you have introduced a breaking change in production systems. The links you have posted refer to downstream code issues: it is not the deserialization of the object that is a risk, but what one does with those objects (e.g. merge). For example, the line:

 Object.getPrototypeOf(JSON.parse('{"__proto__": { "polluted":true }}'))

...works exactly as expected - the properties can be referenced exactly as defined:

> JSON.parse('{"__proto__": { "polluted":true }}').__proto__
{ polluted: true }

The "error" you describe is to use the result of the JSON.parse call inappropriately. This is an application layer concern, not the concern of the transport at all.

This is a bad fix for the issue of prototype pollution, and it breaks perfectly legitimate code; in fact, it precludes storing arbitrary data in ES. It is also impossible to write test cases for, as the forbidden names and values are buried in a 3rd-party library.

It should not be a default. Since ES allows developers to specify their own Serializer classes (and a JSON reviver would be a better way to implement this in any case), those projects where this may be a concern can opt into it if they wish.

Making correct ES operation dependent on the data contained is a very retrograde step.

@matAtWork

In fact, I note that in https://medium.com/node-modules/what-is-prototype-pollution-and-why-is-it-such-a-big-deal-2dd8d89a93c the fix is described as a change to the merge routine, the point at which the application tries to use the transported data incorrectly. Similarly, in "Square Brackets are the Enemy" all the risks are described as application failures to understand how JavaScript treats certain keys, not as a serialization issue with JSON, which, although superficially similar to JavaScript, assigns no special meaning to individual keys. This is why we stopped using eval(...) to parse JSON a decade ago and moved to JSON.parse(...).

However, JSON.parse does it correctly: in the example above JSON.parse('{"__proto__": { "polluted":true }}') produces an object with a "normal" JavaScript prototype and a property called __proto__ that is just a POJS object.
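That claim is easy to check directly; the following sketch (plain Node.js, no libraries) verifies that the parsed result keeps its normal prototype and carries __proto__ only as an ordinary own data property:

```javascript
const obj = JSON.parse('{"__proto__": {"polluted": true}}')

// The parsed object's real prototype is untouched:
console.log(Object.getPrototypeOf(obj) === Object.prototype) // true

// "__proto__" exists only as an ordinary own data property:
const desc = Object.getOwnPropertyDescriptor(obj, '__proto__')
console.log(desc.value) // { polluted: true }

// ...and nothing has leaked onto other objects:
console.log({}.polluted) // undefined
```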

@delvedor
Member

delvedor commented Mar 8, 2021

Hi @matAtWork, thank you for sharing your perspective.

You are right when you say that JSON.parse is safe; the problem is that it can produce unsafe objects.
I can't entirely agree with your assessment, because as I said before, catching prototype pollution/poisoning attacks is very hard and you can find yourself in bad situations.
The safest way to avoid these situations is to catch them as early as possible at their "point of entrance", which in this case is the JSON parsing this library is in charge of.

I agree that there could have been better communication around this change, and I apologize for it. The documentation will be updated to address this point, but the library default will not change, especially because there is an official way to override the default secure behavior.

@matAtWork

I can't entirely agree with your assessment, because as I said before, catching prototype pollution/poisoning attacks is very hard and you can find yourself in bad situations.

Even if I were to agree, the issue is in merge, etc., not in the JSON implementation of deserialization. Your "fix" is not a fix at all, as it is in the wrong place.

You have attempted an upstream fix. I would not, for example, suggest the way to "fix" ES security would be to junk http in favour of https, as it would break loads of installations. The correct solution to insecure connections would be to operate within a secure network/VPN or utilize https, not to unilaterally declare http insecure and simply remove it. I could come up with many other examples where attempts at upstream fixes, while convenient, create endless legacy and edge cases. To be frank, you have not "fixed" the issue.

I cannot understand why you could countenance a breaking change when many, many users will have handled the issue you believe you have "fixed" in the correct place.

This change should be reverted as a default - it is an on-going risk to production systems. However, in our case we use a working Serializer, and are prepared to take that route, even though it means having to re-apply it for subsequent releases.

@matAtWork

Perhaps a compromise would be to issue a warning of some description about a "Potential prototype pollution", rather than abort the operation.

For example, an SQL DB that threw an exception on select * from data group by keyword if keyword were one of a set of undisclosed values would be considered flawed.

@delvedor
Member

delvedor commented Mar 8, 2021

However, in our case we use a working Serializer, and are prepared to take that route, even though it means having to re-apply it for subsequent releases

I'm not sure what you mean by "re-apply for subsequent releases"; injecting a custom serializer class is part of the client's public API, and it's the recommended way to do such a thing.

Perhaps a compromise would be to issue a warning of some description about a "Potential prototype pollution", rather than abort the operation.

It could, but the performance cost would be too high, as you must analyze the string before parsing it.
If you would like to implement this yourself with a custom serializer, you can use the .scan API of secure-json-parse.
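As a rough illustration of that warn-instead-of-abort idea, here is a self-contained sketch (parseWithWarning and its warn callback are hypothetical names, and the hand-rolled traversal merely stands in for what secure-json-parse's .scan would do inside a real custom serializer):

```javascript
// Parse with plain JSON.parse, then walk the result looking for the
// dangerous keys, and warn instead of aborting the operation.
function parseWithWarning (json, warn = msg => console.warn(msg)) {
  const object = JSON.parse(json)
  const stack = [object]
  while (stack.length > 0) {
    const node = stack.pop()
    if (node === null || typeof node !== 'object') continue
    // Own properties only, so inherited keys never trigger false positives.
    for (const key of Object.getOwnPropertyNames(node)) {
      if (key === '__proto__' || key === 'constructor') {
        warn(`Potential prototype pollution: found "${key}" key in response`)
      }
      stack.push(node[key])
    }
  }
  return object
}

// The Term Vectors response shape from the original report parses fine,
// with a warning logged instead of a DeserializationError:
const body = parseWithWarning('{"constructor": {"doc_freq": 1, "ttf": 1, "term_freq": 1}}')
console.log(body.constructor.doc_freq) // 1
```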

As said, the client's default won't change, but we'll do a better job of documenting it and showing how to change it. Thank you for your feedback!

@diegomansua
Author

diegomansua commented Mar 8, 2021

I really think this is the wrong approach, and the potential vulnerability it tries to avoid is completely out of the scope of a client lib like this. I wonder if popular request libraries like node-fetch or axios would throw similar errors when calling an API with constructor as a key in the response (or maybe the browsers' standard fetch API?). Or a database client library failing to query a table with a column named constructor. I'd be puzzled, just like I was (am) when I came across this. Quite disappointing.

@matAtWork

@delvedor In all honesty, I think you've failed to objectively assess this and certainly to assess the potential impacts in the future. Do the other clients filter out these fields/data values? What about all the other clients described in https://www.elastic.co/guide/en/elasticsearch/client/index.html ?

Is the JS implementation really going to fail where PHP, Java, Ruby, Perl, .Net, etc all succeed?

I think, as a contributor to secure-json-parse you only see the upside in your applications, but this now makes the JS library an edge case compared to the other clients, and ES itself. If you feel so strongly, why does ES even index these values and not protect us unwitting devs from themselves?

@delvedor
Member

delvedor commented Mar 8, 2021

Do the other clients filter out these fields/data values? What about all the other clients described in https://www.elastic.co/guide/en/elasticsearch/client/index.html ?

This is a JavaScript-specific vulnerability, and there is no reason for other languages to act in this case.

If you feel so strongly, why does ES even index these values and not protect us unwitting devs from themselves?

This is a JavaScript-specific vulnerability, and there is no reason for Elasticsearch to act in this case.

Security is a concern that should be addressed at all levels. Once you are breached, there is no going back. Providing secure defaults is what helps users avoid security vulnerabilities and massive headaches afterward.
Not agreeing with a default is perfectly fine, and this library is designed to allow users to change the default behavior to what works best for them.

The only thing that could be done here is to offer an option to make it even easier to disable this security check, such as disablePrototypePoisoningProtection, with 'proto', 'constructor', or a boolean as possible values.

@matAtWork

You are quite right, this is a JavaScript vulnerability, and as such is the responsibility of the JavaScript developer to understand and mitigate. It is not the responsibility of the transport. Even if it were, a "fix" that breaks production systems is a very poor fix.

Additionally, if it is so vital to the JS community, why is it not in Kibana?
[screenshot: Kibana Dev Tools, 2021-03-09]
This complicates things even more as the key Elasticsearch tool for monitoring and managing clusters doesn't behave the same way as the default library.

Incidentally, the PR in #1412 does not correctly describe (or rectify) the issue:

In most cases, you don’t need to care about this, but if the object stored in Elasticsearch contains either the __proto__ or constructor keys, the deserialization phase will throw an exception.

It is not just if the object contains those values, but if the data does. In that case, mtermvectors can (and does) fail. I'm afraid this indicates you have not fully understood the ramifications of #1110 due to the way you are using ES in your applications.

I would be grateful if the wider community could comment.

@delvedor
Member

delvedor commented Mar 9, 2021

and as such is the responsibility of the JavaScript developer to understand and mitigate

Definitely! But this would also mean that you would see dozens of security breaches on the internet because security is hard and easy to get wrong.
If a library can help a user not fall into known security issues, I don't see why it shouldn't.

It is not the responsibility of the transport.

I've understood your point, and as I've said, this library made some decisions, if you don't agree with them, you can easily override them, as the library is designed to allow you to do so.

Additionally, if it is so vital to the JS community, why is it not in Kibana?

This fix is in Kibana; the example you have shown won't trigger the error, as it happens only if the constructor key is a top-level key in the object.


I agree that secure-json-parse is too aggressive at handling the constructor key; I'm working on a PR that will throw an exception only if the constructor has a child named prototype.

@matAtWork

This fix is in Kibana; the example you have shown won't trigger the error, as it happens only if the constructor key is a top-level key in the object.

Is this safe? For example:

const obj = JSON.parse(`{
   "nest": {
      "constructor": { ... }
   }
}`)

const polluted = merge({}, obj.nest)

...why does the constructor key have to be at the top level? The bad app developer in this case is merging from obj.nest, which illustrates why this upstream fix isn't in fact a fix.

@delvedor
Member

Hello! I didn’t express myself well, let me rectify my message:
The secure json parser will throw an exception for nested dangerous keys as well, which means it will also cover you for the example you have shown.

Thank you for your feedback, I’ll prepare a pull request that adds a disablePrototypePoisoningProtection option to opt out of this behavior or disable one of the two checks (__proto__ vs. constructor.prototype).
Furthermore, I’ve already sent a pull request to the JSON parser to be more precise and no longer throw whenever it finds the constructor key, but only when that constructor is an object that also contains a prototype key.

@matAtWork

I appreciate you looking at this complex security issue. I have yet to be convinced that this is the correct place to attempt to enforce security, and I appreciate the supported option to disable it, but I still don't see how it works in Kibana:

POST /.tasks/_search
{
  "size":0,
  "aggs": {
    "constructor": {
      "terms": {
        "field":"xxx"
      }
    },
    "__proto__": {
      "terms": {
        "field":"xxy"
      }
    }
  }
}

...produces:

{
  "took" : 55,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 10000,
      "relation" : "gte"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "__proto__" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ ]
    },
    "constructor" : {
      "doc_count_error_upper_bound" : 0,
      "sum_other_doc_count" : 0,
      "buckets" : [ ]
    }
  }
}

Is this response considered safe?

@matAtWork

Can I ask why this has been closed? The issue still exists, and you have not explained why the above response is acceptable, given that a code sequence such as const aggs = Object.assign({}, response.body.aggregations) can be considered "safe" even though the data is nested within the top-level object.

#1414 does not, alone, fix the above security risk or bring the JS library into line with all the other client libraries. Additionally, it brings a false sense of security, as it suggests the application developer no longer needs good security practices, by making some (top-level?) objects "safe" but not others.

Why was there no review of this PR by other team members?

@delvedor
Member

Hello! The dev tools UI in Kibana is not using the client, but a plain HTTP proxy.
The JSON parser does a full object traversal to detect dangerous keys, and it has been updated to be more strict and only throw an error if the constructor key is an object which contains a prototype key.
The option to disable the prototype poisoning protection more easily, introduced in #1414, will land in the next minor of the client, but you can already disable it by injecting a custom serializer as shown above.
