Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Mappings editor] Use Avj schema validator instead of io-ts #56272

Open
sebelga opened this issue Jan 29, 2020 · 4 comments
Open

[Mappings editor] Use Avj schema validator instead of io-ts #56272

sebelga opened this issue Jan 29, 2020 · 4 comments
Labels
enhancement New value added to drive a business result Feature:Mappings Editor Index mappings editor UI Team:Kibana Management Dev Tools, Index Management, Upgrade Assistant, ILM, Ingest Node Pipelines, and more

Comments

@sebelga
Copy link
Contributor

sebelga commented Jan 29, 2020

Currently, we use the io-ts lib to validate the configuration of the mapping (the complete object without the properties and dynamic_templates parameters).

As I was working on the fix to support mappings type in index template I struggled to make the io-ts schema strict, not allowing unknown properties to be declared. I realized it was not possible and had to use a hack to make it work.

If we validate the following mappings

js
const mappings = {
  dynamic: true,
  date_detection: true,
  numeric_detection: true,
  dynamic_date_formats: ['abc'],
  _source: {
    enabled: true,
    includes: ['abc'],
    excludes: ['abc'],
    unknownParam: 'hello', // should fail as it's unknown
  },
  properties: { title: { type: 'text' } },
  dynamic_templates: [],
  unknown: 123, // should fail as it's unknown
};

To make it fail with io-ts we need to

// Declare a schema

const mappingsConfigurationSchema = t.exact(
  t.partial({
    dynamic: t.union([t.literal(true), t.literal(false), t.literal('strict')]),
    date_detection: t.boolean,
    numeric_detection: t.boolean,
    dynamic_date_formats: t.array(t.string),
    _source: t.exact(
      t.partial({
        enabled: t.boolean,
        includes: t.array(t.string),
        excludes: t.array(t.string),
      })
    ),
    _meta: t.UnknownRecord,
    _routing: t.interface({
      required: t.boolean,
    }),
  })
);


// Create an array for _each_ (!) object we don't want unknown properties declared

const mappingsConfigurationSchemaKeys = Object.keys(mappingsConfigurationSchema.type.props);
const sourceConfigurationSchemaKeys = Object.keys(
  mappingsConfigurationSchema.type.props._source.type.props
);

// Validate the schema
const result = mappingsConfigurationSchema.decode(mappingsConfiguration);
const isSchemavalid = isLeft(result) === false;

// Manually validate if there are unknown properties
const unknownConfigurationParameters = Object.keys(mappingsConfiguration).filter(
  key => mappingsConfigurationSchemaKeys.includes(key) === false
);

const unknownSourceConfigurationParameters =
  mappingsConfiguration._source !== undefined
    ? Object.keys(mappingsConfiguration._source).filter(
        key => sourceConfigurationSchemaKeys.includes(key) === false
      )
    : [];

// Check each unknown property array one by one
if (unknownConfigurationParameters.length > 0) {
  // .... some logic
}

if (unknownSourceConfigurationParameters.length > 0) {
  // ... some logic
}

If we compare with avj, this is how we would do it (if this looks like ES mappings declaration... it is pure coincidence ! 😄 )

import Avj from 'ajv';
const ajv = new Ajv();

// Declare a schema
const schema = {
  properties: {
    date_detection: { type: 'boolean' },
    numeric_detection: { type: 'boolean' },
    _source: {
      type: 'object',
      properties: {
        enabled: { type: 'boolean' },
        includes: { type: 'array', items: { type: 'string' } },
        excludes: { type: 'array', items: { type: 'string' } },
      },
      additionalProperties: false, // Don't allow unknown
    },
    // ...rest of schema
  },
  additionalProperties: false, // Don't allow unknown
};

// Validate
const valid = ajv.validate(schema, data);

// Grab a coffee as we are already done :)
if (!valid) {
  // ... some logic
}

avj goes much further when it comes to schema validation, I let you look at its features. The strength of io-ts lies in the fact that we can retrieve Typescript Types out of its schema, but in practice, we don't need that feature.

We do need a robust validation schema declaration (like Joi gave us) that runs in the browser and allows us to validate our JSON simply.

cc @cjcenizal @jloleysens

@sebelga sebelga added enhancement New value added to drive a business result Feature:Mappings Editor Index mappings editor UI Team:Kibana Management Dev Tools, Index Management, Upgrade Assistant, ILM, Ingest Node Pipelines, and more labels Jan 29, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/es-ui (Team:Elasticsearch UI)

@jloleysens
Copy link
Contributor

jloleysens commented Jan 29, 2020

@sebelga Thanks for creating this issue for discussion! I really appreciate it 🙂. Some of my thoughts:

  1. I totally agree that io-ts is a more low-level library (helper libraries have been built on top of it like: https://github.com/gcanti/io-ts-types to provide more high level types). In that sense, it also doesn't ship with a ton of configurability. With io-ts one has atoms with no higher level settings like include/exclude types. Also I'm not a fan of their documentation.

  2. I actually think this may be a bug and have opened an issue against io-ts about this particular behaviour (allowing unknown properties in t.types or t.partial). io-ts claims to implement runtime types based on typescript. If const a: { b: number } = { b: 1, c: 2 } is invalid TS it should be invalid at runtime too. [UPDATE] It appears as though the current behaviour in io-ts is intentional since typescript implements structural type checks. There is an ongoing discussion at io-ts about how best to handle this.

  3. Are there any other behaviours we may want out of a schema validation library that you have identified? Besides disallowing unknown properties at runtime.

  4. What is the level of urgency here?

Overall I think we should go with the tool that doesn't incur too much tech debt in using it while not hampering productivity as you've illustrated. If ajv gives us 1, 2 and 3 I think we should add it as a Kibana dependency for sure and if 4 is high then sooner rather than later :).

[UPDATE]

One other thing to consider is that we do want to keep the list of dependencies as short/secure/clean as possible so auditing the dependencies of ajv would also be a good step to take!

@sebelga
Copy link
Contributor Author

sebelga commented Jan 31, 2020

With io-ts one has atoms with no higher-level settings like include/exclude types. Also, I'm not a fan of their documentation.

Yes indeed io-ts is very low level and it seems that apart from defining types and interfaces it does not go further.
If we ever need to validate that an array is not empty, a number is smaller than a threshold, a string does not start with a character, we would need to manually write those validations, which IMO should be the role of a validation lib.

About documentation: it is indeed very complex and poorly written for people not so familiar with advanced Typescript concepts and functional programming. We finally "get" it more by trial and error than understanding the docs.

Are there any other behaviors we may want out of a schema validation library

See my answer above but I could think of many. Like I said we not only need to validate that a value is a certain type, we should be able to validate that the value is also valid according to our business rules. My opinion is that we should have a tool that does not limit ourselves. Looking at AVJ doc it seems that string can be validated as ipv4, ipv6, uri, time ...

Also, I am proposing ajv, but I am opened to hear about other validation libraries.

we do want to keep the list of dependencies as short/secure/clean as possible

Great point, I also think we should keep the list of deps short. This is why the first thing I asked you when the Joi issue came up is: what is the recommendation from the Kibana op team? In the sense, what is the alternative (lib available in the repo) that we should be using from now on instead of Joi.
I agree that, if we find ajv valuable for our use cases, that it works both in the browser and in the server, that the bundle size is correct, that it is fast, that it covers a wide range of validation scenarios, then we should advocate for it to the Kibana team and have all the teams use that library instead of each team finding the "best schema validation lib" 😊

To answer your question, it has 4 dependencies

  • fast-deep-equal (0 deps)
  • fast-json-stable-stringify (0 deps)
  • json-schema-traverse (0 deps)
  • uri-js (1 dep)
    • punycode (0 deps)

What is the level of urgency here?

None as we have a fix (hack 😊 ) for the mappings editor validation in place. But as I said above, if we do find value in ajv and want to start using it, then we should try to make it the default (recommended) lib for schema validation in the Kibana repo.

@rudolf
Copy link
Contributor

rudolf commented Sep 24, 2020

I think the closest we have to a default validation library is @kbn/config-schema. We are aware that it has several short-comings like no documentation, no browser support and bad performance.

There seems to be better typescript support coming for AJV ajv-validator/ajv#736 (comment)

One benefit of io-ts is that so many teams are already using it, so before we would make a recommendation for a new default we also need to weight up the effort to convert any existing validation vs the potential benefits.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New value added to drive a business result Feature:Mappings Editor Index mappings editor UI Team:Kibana Management Dev Tools, Index Management, Upgrade Assistant, ILM, Ingest Node Pipelines, and more
Projects
None yet
Development

No branches or pull requests

4 participants