Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plugin-System for Validation of formats? #266

Closed
steffkes opened this issue May 17, 2016 · 15 comments
Closed

Plugin-System for Validation of formats? #266

steffkes opened this issue May 17, 2016 · 15 comments

Comments

@steffkes
Copy link
Contributor

i'm using this library again for one of my projects, never had to extend it in any way since the included features were always enough - until this time ;>

there is already a part of FormatConstraint that validates format=phone but it only does a very, very simple check on US phone numbers. i did need to perform a check on a larger range including internal numbers. libphonenumber-for-php is a perfect fit for it, integrating that into json-schema and extending the FormatConstraint feels a bit ugly.

just for reference, one could or would probably:

class FormatConstraint extends \JsonSchema\Constraints\FormatConstraint
{
  public function check($element, $schema = null, $path = null, $i = null)
  {
    if($schema->format && $schema->format === 'phone') {
      try {
        PhoneNumberUtil::getInstance()->parse($element, NULL);
      } catch(NumberParseException $e) {
        $this->addError(
          $path,
          sprintf('Invalid phone number: %s', $e->getMessage()),
          'format',
          ['format' => $schema->format]
        );
      }
    }
  }

  protected function validatePhone($phoneNumber) {
    return true;
  }
}

since we extend FormatConstraint we'd have to give validatePhone a pass since it will be called from the original FormatConstraint#check method which is executed anyway since we have to ensure that all the other/existing format constraints still work.

the other thing is, since i'd like to include additional information in the response i need access to the exception - not a thing that all the existing validate* methods do right now. obviously a personal choice, but nevertheless.

before i start digging in such a plugin-system, i'm curious WDYT? is it even worth it? am i the only one that needed something like this? is it probably the only extension one could ever want?

let me know!

@mirfilip
Copy link
Contributor

mirfilip commented May 17, 2016

@steffkes Since 2.0.0 there is a possibility to override constraints by setting your own FormatConstraint. It's not exactly what you suggest, but theoretically it allows anyone to extend FormatConstraint and change check method so that it uses your new fancy way of phone format validation and passes all the other format types to original implementation.

If you ask me, I think this big switch in FormatConstraint should be configurable in way similar to Constraints/Factory, so that you can either add your custom format support or overwrite e.g. phone format by creating your own FormatConstraints\PhoneConstraint and registering it. The whole notion of format constraints is mentioned in json-schema standard and we need to verify compliance first.

@steffkes
Copy link
Contributor Author

@mirfilip probably should have been clearer in my initial description - that is exactly what i did, hence the sample FormatConstraint i've posted above.

it's kinda okay, but especially for existing formats you are going to overwrite you have to ensure that they pass the initial implementation since it's still executed - at least in my case when calling parent::check because i do need the existing checks on date, email and so on.

given that phone number validation is not necessarily a crazy custom task for each project .. i think it's a bit cumbersome when you have to come up with something like this, isn't it?

basically it's a hash, mapping various formats to their validation function - could be more flexible than what we have right now. but that's quick a thought i had when i was implementing my own constraints - having doubts that i'm the only one ...

so it's more a proposal / question about a proposal, to come up with something a bit more flexible for (additional) formats.

@mirfilip
Copy link
Contributor

mirfilip commented May 17, 2016

@steffkes I double read your initial post now. Correct me if I'm wrong, your proposal is the same thing I described in my second paragraph - you create your own FormatConstraints\PhoneConstraint and register it under FormatConstraint::phone (pseudocode here). I support the idea very much.

Just to clarify what we stand on now. Looking at FormatConstraint it's obvious some formats are very naive. Filtering emails, uris, hostnames are no as easy as using filter_var (unfortunately). Of course, there are some low hanging fruits (color). Others needs a dedicated tooling. Otherwise, json-schema would have to depend on external libraries like mentioned libphonenumber-for-php. Some of them would even depend on C libraries underneath (think intl and co). As most people will probably not need all of them, it's good to have a dedicated validators laying beside so that we could suggest them in composer.

I think I could play around with format validation - we need it for draft4 compliance anyway. @bighappyface @jojo1981 thoughts?

@bighappyface
Copy link
Collaborator

I think the best compromise to achieve more flexible customization, as well as an overall design improvement, is to refactor FormatConstraint to be composed of format validation classes that have a common interface and can be set at runtime.

That being said, I am not sure that I see an issue. The facility to define custom constraints is available but from what I am reading it looks like there is even more customization desired, which IMO is beyond the scope of this package as @mirfilip described. This JSON schema package can't do/validate everything and we can't make code changes every time a consumer wants something very custom (no offense @steffkes).

@steffkes
Copy link
Contributor Author

@mirfilip i should have read your message once again before replying, you're spot on. pretty sure i've missed your second paragraph, my bad.

@bighappyface i should definitely try to right better issue descriptions - having an additional dependency for a part of the library that is probably not used that often is obviously a bad choice. making it optional only would leave as with runtime checks, which isn't the best design either.

no offense taken David, i was able to implement whatever crazy requirement we had - no reason to complain. the idea was more to propose an idea for further development. if we could refactor the FormatConstraint Class to allow for a pluggable system that is exactly what we would do .. move out all the crazy (probably custom) validation w/o a scratch:

feeling the need for full fledged phone validation, go with json-schema-format-phone (hopefully someone will come up with a better name than that) ..

creditcard numbers, uuid's, vat numbers, proper base64 strings .. are just a bunch of things that come to mind. and i absolutely don't expect one library (neither this one nor an other one) to solve all of them in one shot.

to say it again - it's not about what i just did to get it working. it's more about a generic/general idea on how to make this library more flexible in terms of validating various formats. right now, i think it's a bit more complicated than what's needed.

i hope that clarified a few things - otherwise just say so :)

@mirfilip
Copy link
Contributor

mirfilip commented May 18, 2016

@bighappyface

I think the best compromise to achieve more flexible customization, as well as an overall design improvement, is to refactor FormatConstraint to be composed of format validation classes that have a common interface and can be set at runtime.

That's sums up the longish discussion. Please have a look at ajv json-schema validator. That's exactly the @steffkes idea and the public interface is pretty much nailed.

Let me know if you like the idea, I will tackle the problem during the weekend.

btw. @bighappyface what do you think about adding "feature request" label?

@bighappyface
Copy link
Collaborator

@mirfilip label created.

I am all in favor of a more extensible system for custom constraints or implementations.

@steffkes
Copy link
Contributor Author

steffkes commented Jun 3, 2016

@mirfilip just curious, did you already take a stab? i'd like to help out if & where i can. i've already added two additional format constraints (language- & country-codes based on ISO-Codes) and the list keeps growing .. ;>

@narcoticfresh
Copy link
Contributor

narcoticfresh commented Jun 8, 2016

the idea of (quoting from @bighappyface)

to refactor FormatConstraint to be composed of format validation classes that have a common interface and can be set at runtime.

is certainly a good one. but it's often seen with those things, that quite some performance gets lost in 'discovery' of the classes that want to validate and what they want to validate. Typically, somewhere all the schema arrays are passed to a bunch of classes where they can either react or not.

So, if the simple switch/case gets rewritten to that, it would mean to put all existing format validators (which now are in 1 class), into several classes and then iterate them, passing the $schema to every single one. Then include a register/override facitility.

Needless to say, that's nice - but it's slower by factors (even if currently low ones) then the current simple approach.

Maybe keep the current stuff as-is, but allow to just register new formats with classes? We can still make the "core" type validations simple and let them be overridable by checking it in the current construct..

My point is to keep the current "core" validation as fast as possible..

@mirfilip @steffkes
The purpose of a schema gets undermined if you create a format myownformat, as nobody consuming the schema can validate it. it's then only you holding the correct validation logic somewhere in your code. the idea of a json schema is to be universally usable. Now, of course, that's theory. But whatever can be done with a regex pattern, should be specified in the schema as a regex pattern ;-) And most things can.. ;-)

@steffkes
Copy link
Contributor Author

The purpose of a schema gets undermined if you create a format myownformat, as nobody consuming the schema can validate it. it's then only you holding the correct validation logic somewhere in your code. the idea of a json schema is to be universally usable. Now, of course, that's theory. But whatever can be done with a regex pattern, should be specified in the schema as a regex pattern ;-) And most things can.. ;-)

While i absolute agree on that - it's sometimes just not feasible. and there is still the possibility to add some kind of description to your schema which provides a hint about what is required.

Maybe keep the current stuff as-is, but allow to just register new formats with classes? We can still make the "core" type validations simple and let them be overridable by checking it in the current construct..

for me, that's just another way to phrase it. i do not really care (or rather need) to make everything super flexible - my pain came from an existing phone-type validation which was harder to overwrite than i initially expected.

Needless to say, that's nice - but it's slower by factors (even if currently low ones) then the current simple approach.

well, if we're going to argue about something like that .. i'm not sure we're going places. if that would really be the reason why we're not doing it, then we should have a serious conversation.

TL;DR: i'd be okay if we decide to remove those formats (namely regex, color, style and phone) because as soon as you try to overwrite one of those - you're almost out of luck. as there is no real possibility to actually overwrite only them w/o having to re-implement the needed rest.

@narcoticfresh
Copy link
Contributor

@steffkes

i totally agree with ur statements.. i'd also like to have more flexibility and I'm currently struggling with stuff in FormatConstraint as well..

well, if we're going to argue about something like that .. i'm not sure we're going places. if that would really be the reason why we're not doing it, then we should have a serious conversation.

Again +1 - I was just voting for an approach that doesn't loose performance out of sight. This library mostly ends up in complex apps and is expected to be snappy - I hope that doesn't change ;-)

Maybe do something using symfony/event-dispatcher, kinda what jms serializer does with it's events?

@steffkes
Copy link
Contributor Author

so then, what about the following - at least as short term goal: we refactor the rather large switch/case statement. we do so by moving regex, color, style and phone into a Factory-like map. That would avoid any breaking changes (assuming we do it right) but still allow everyone to go with custom formats if needed. assuming that, what people actually want, is either overwrite one of those and/or adding more custom ones - but not trying to overwrite other format definitions like date-time, email and such. sounds okay to me, since those are described in their corresponding RFCs and therefore pretty obvious - compared to the others that are subjective to the current implementation/context.

for the long run, we can still refactor the really large statement - which doesn't seem needed to me anymore once we've done the refactoring just described above.

@steffkes
Copy link
Contributor Author

just a quick hack over at #280, to get it out of my head. pretty sure it does not work for the given set of requirements (php 5.3 and what not) but it should be possible to get the idea @narcoticfresh

@DannyvdSluijs
Copy link
Collaborator

@steffkes in an attempt to cleanup this repo we are trying to filter the issues and see which ones might be closed. Is it safe to assume this is a rather old feature request, which sadly was left unanswered, and can be closed? Feel free to close it yourself with some comments if helpful.

@steffkes
Copy link
Contributor Author

@DannyvdSluijs absolutely. I'm not even doing PHP anymore .. time is flying!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants