In last week's Jupyter Server meeting, it was brought to our attention that the jsonschema package can exhibit a memory leak when the validate method is called repeatedly to validate instances of a given schema. The leak isn't necessarily in the validation of the instance itself, but rather in the acquisition of the validator used to perform the validation. The validate method's documentation even states:
"If you know you have a valid schema already, especially if you intend to validate multiple instances with the same schema, you likely would prefer using the Validator.validate method directly on a specific validator (e.g. Draft7Validator.validate)."
Since Elyra validates all instances on retrieval (and save) and is intended to remain running for long periods of time, we should weigh how much this impacts Elyra against the cost of addressing the issue.
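To make the distinction in the quoted documentation concrete, here is a minimal sketch of the two call styles. The schema and instances are hypothetical and are not Elyra schemas, and `FormatChecker()` is used here as an assumption in place of whatever format checker Elyra actually passes.

```python
from jsonschema import Draft7Validator, FormatChecker, validate

# Hypothetical schema for illustration only; not an Elyra schema.
schema = {
    "type": "object",
    "properties": {"display_name": {"type": "string"}},
    "required": ["display_name"],
}
instances = [{"display_name": f"runtime-{i}"} for i in range(3)]

# Style 1: the module-level helper. Each call has to obtain a validator
# for the schema before it can validate the instance.
for instance in instances:
    validate(instance=instance, schema=schema)

# Style 2: check the schema once, build one validator, and reuse it.
Draft7Validator.check_schema(schema)
validator = Draft7Validator(schema, format_checker=FormatChecker())
for instance in instances:
    validator.validate(instance)  # raises ValidationError on failure
```

Both styles accept and reject the same instances; they differ only in how often the validator is acquired.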
After updating MetadataManager.validate() to use a validator (Draft7Validator(schema, format_checker=draft7_format_checker).validate(metadata_dict)), the results become:
indicating a 20-30X decrease in memory usage, and a 6X increase in performance.
I think we can improve things further by letting the SchemaManager create the validator instances for each schema and cache those instances, as this will avoid the creation of the validator instance on each validation request.
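One way the caching idea could look is sketched below. `ValidatorCache`, `get_validator`, and `invalidate` are hypothetical names for illustration and are not part of Elyra's SchemaManager. The sketch also assumes schemas can be keyed by a unique name and that `FormatChecker()` is an acceptable format checker.

```python
from jsonschema import Draft7Validator, FormatChecker


class ValidatorCache:
    """Hypothetical helper that builds one validator per schema and reuses it."""

    def __init__(self):
        self._validators = {}  # schema name -> validator instance

    def get_validator(self, schema_name, schema):
        validator = self._validators.get(schema_name)
        if validator is None:
            # Check the schema itself once, at construction time.
            Draft7Validator.check_schema(schema)
            validator = Draft7Validator(schema, format_checker=FormatChecker())
            self._validators[schema_name] = validator
        return validator

    def invalidate(self, schema_name):
        # Drop the cached validator, e.g. if the schema is replaced.
        self._validators.pop(schema_name, None)


# Usage with a hypothetical schema name and schema.
cache = ValidatorCache()
example_schema = {"type": "object", "required": ["display_name"]}
first = cache.get_validator("example", example_schema)
second = cache.get_validator("example", example_schema)
first.validate({"display_name": "demo"})
```

With this shape, each validation request only pays for a dictionary lookup plus the validation itself. The cache needs an invalidation path if schemas can change while the application is running.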
Because Elyra is a long-lived application, and given that these changes (so far) appear to be straightforward, I believe we should make them.