
Race condition using schema configuration when caching in multi-threaded scenarios #744

@costas80

Description

The SchemaValidatorsConfig class is currently used to define all configuration properties for validation before a JsonSchema instance is loaded. The configuration is set on the schema at two points:

  • When the schema is initialised.
  • If schema caching is enabled, once the schema has been retrieved from the cache.

In the second case, the configuration is set again on the cached schema in case a different configuration was used when the schema was first initialised. This works fine if, with caching enabled, we execute multiple validations sequentially. It is not correct in multi-threaded environments that run validations in parallel (for example a web application). The problem comes from setting the configuration on the cached schema itself, which means that one thread may change it while another thread is still executing a validation with it.

As an example of the problem, consider a web application using schema caching that lets users choose the language of their JSON validation report, either EN or FR. The following steps take place:

  1. Bob selects EN and triggers a validation. The schema is loaded for the first time with EN as its language and cached.
  2. Bob's validation begins and starts collecting EN validation messages.
  3. Before Bob's validation completes, Alice triggers a validation in FR. The same schema is loaded from the cache and FR is set as the language of choice.
  4. Bob's validation continues to produce messages, but given step 3, these are now in FR.
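
The four steps above can be replayed deterministically in a single thread, which makes the effect easy to see without relying on timing. This is only a minimal sketch: `CachedSchema`, `load` and the `language:key` message format are hypothetical stand-ins, not the library's classes. It assumes only what the description states, namely that the configuration is written onto the cached instance on every load.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SharedConfigRace {

    /** Hypothetical stand-in for a cached schema that holds mutable configuration. */
    static final class CachedSchema {
        private volatile String language;

        void setLanguage(String language) {
            this.language = language;
        }

        /** Produces a fake validation message tagged with the current language. */
        String message(String key) {
            return language + ":" + key;
        }
    }

    private static final Map<String, CachedSchema> CACHE = new ConcurrentHashMap<>();

    /** Returns the cached instance and (re)applies the caller's configuration to it. */
    static CachedSchema load(String uri, String language) {
        CachedSchema schema = CACHE.computeIfAbsent(uri, k -> new CachedSchema());
        schema.setLanguage(language); // mutates state shared by every caller
        return schema;
    }

    /** Replays steps 1 to 4 in order and returns the messages Bob collected. */
    static List<String> replayInterleaving() {
        List<String> bobReport = new ArrayList<>();
        CachedSchema bobSchema = load("schema.json", "EN"); // step 1: Bob loads with EN
        bobReport.add(bobSchema.message("required"));       // step 2: first EN message
        load("schema.json", "FR");                          // step 3: Alice loads with FR
        bobReport.add(bobSchema.message("type"));           // step 4: Bob continues
        return bobReport;
    }

    public static void main(String[] args) {
        System.out.println(replayInterleaving()); // prints [EN:required, FR:type]
    }
}
```

Bob's second message comes out in FR even though Bob never changed his own configuration, because both callers hold a reference to the same mutable instance.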

In brief, this is a shared concurrent cache holding mutable state, which leads to race conditions. The current workaround is either to create a new JsonSchemaFactory per validation, or to use a shared one with caching disabled.

To correct the problem and fully support caching we could follow one of two approaches:

  • Option A: When a JsonSchema instance is loaded from the cache, deep-clone it before returning it, and set the configuration on the clone.
  • Option B: Use a ThreadLocal to set and read the SchemaValidatorsConfig instance.
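
As a rough illustration of Option A, the sketch below copies the cached instance on every read and applies the configuration to the copy only. The `Schema` class, its `copy` method and `load` are hypothetical and deliberately tiny. A real JsonSchema presumably carries a tree of validators, so the copy would need to be deep, and that cost would be paid on every validation.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CloneOnReadSketch {

    /** Hypothetical schema: immutable identity plus mutable per-validation configuration. */
    static final class Schema {
        private final String uri;
        private String language;

        Schema(String uri) {
            this.uri = uri;
        }

        /** Returns an independent copy; a real schema would need a deep copy here. */
        Schema copy() {
            Schema clone = new Schema(uri);
            clone.language = language;
            return clone;
        }

        void setLanguage(String language) {
            this.language = language;
        }

        String message(String key) {
            return language + ":" + key;
        }
    }

    private static final Map<String, Schema> CACHE = new ConcurrentHashMap<>();

    /** The cached original is never configured; only the returned copy is. */
    static Schema load(String uri, String language) {
        Schema copy = CACHE.computeIfAbsent(uri, Schema::new).copy();
        copy.setLanguage(language);
        return copy;
    }

    /** Bob and Alice load the same cached schema with different languages. */
    static List<String> demo() {
        Schema bob = load("schema.json", "EN");
        Schema alice = load("schema.json", "FR");
        return List.of(bob.message("type"), alice.message("type"));
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints [EN:type, FR:type]
    }
}
```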

Of the two options above, using a ThreadLocal would be the simplest, and it is also the approach used elsewhere for similar concerns (see CollectorContext).
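
A minimal sketch of the ThreadLocal approach follows, assuming one validation runs entirely on one thread. `Config`, `CURRENT` and `CachedSchema` are hypothetical names chosen for illustration and do not reflect how the library would actually wire this in. Two latches force the same interleaving as in the Bob and Alice example, so the outcome does not depend on scheduling.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class ThreadLocalConfigSketch {

    /** Hypothetical per-validation configuration; only the language matters here. */
    static final class Config {
        final String language;

        Config(String language) {
            this.language = language;
        }
    }

    /** Hypothetical holder: each thread reads back only the Config it set itself. */
    static final ThreadLocal<Config> CURRENT = new ThreadLocal<>();

    /** Hypothetical shared schema that keeps no per-validation state of its own. */
    static final class CachedSchema {
        String message(String key) {
            return CURRENT.get().language + ":" + key;
        }
    }

    static final CachedSchema SHARED = new CachedSchema();

    /** Runs Bob (EN) and Alice (FR) on separate threads and returns Bob's messages. */
    static List<String> bobReportWithConcurrentAlice() {
        List<String> bobReport = new ArrayList<>();
        CountDownLatch bobStarted = new CountDownLatch(1);
        CountDownLatch aliceConfigured = new CountDownLatch(1);

        Thread bob = new Thread(() -> {
            CURRENT.set(new Config("EN"));
            try {
                bobReport.add(SHARED.message("required"));
                bobStarted.countDown();
                aliceConfigured.await(); // wait until Alice has set FR on her thread
                bobReport.add(SHARED.message("type"));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                CURRENT.remove(); // avoid leaking the value into pooled threads
            }
        });

        Thread alice = new Thread(() -> {
            try {
                bobStarted.await(); // start only after Bob's validation is under way
                CURRENT.set(new Config("FR"));
                aliceConfigured.countDown();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            } finally {
                CURRENT.remove();
            }
        });

        bob.start();
        alice.start();
        try {
            bob.join();
            alice.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for validations", e);
        }
        return bobReport;
    }

    public static void main(String[] args) {
        System.out.println(bobReportWithConcurrentAlice()); // prints [EN:required, EN:type]
    }
}
```

Both of Bob's messages stay in EN because Alice's value lives in her own thread's slot. One caveat with this design is that the value must be cleared after each validation (the `finally` blocks above), since web applications typically reuse pooled threads.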
