-
Notifications
You must be signed in to change notification settings - Fork 354
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements to type coercion #384
Improvements to type coercion #384
Conversation
@@ -185,6 +185,10 @@ protected function validateType(&$value, $type, $coerce = false) | |||
} | |||
|
|||
if ('array' === $type) { | |||
if ($coerce) { | |||
$value = $this->toArray($value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, but I notice that ajv only does this conversion if coerceTypes
is set to "array". Should this be controlled by another flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, good point. I'm happy either way - would you like a flag?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly don't know. We can try without for a while, and see if the rationale for having one ever emerges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion - let's do that then. And if someone asks for it, then we can add it at that point.
I don't supposed you know whether @epoberezkin has stated his rationale for doing it that way in ajv?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I think this is the issue that spawned the feature: ajv-validator/ajv#158
They discuss it a bit, then come up with the coerceTypes:'array'
idea
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm... will have to think on this a bit. At a minimum I should write some tests to check that post-coercion validation is happening properly.
@@ -255,10 +269,27 @@ protected function toNumber($value) | |||
|
|||
protected function toInteger($value) | |||
{ | |||
if (is_numeric($value) && (int) $value == $value) { | |||
$numberValue = $this->toNumber($value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Little risky, isn't it?
echo toInteger("5f"); // 5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ooh, good catch - this is why we have code reviews! Thanks :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should be sorted now.
return (int) $value; | ||
} | ||
if ($this->getTypeCheck()->isArray($value) && count($value) === 1) { | ||
reset($value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reset
returns the first element in the array, so you can skip a step and pass its results directly on line 272
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - thanks :-).
protected function toInteger($value) | ||
{ | ||
if (is_numeric($value) && (int) $value == $value) { | ||
return (int) $value; // cast to number | ||
$numberValue = $this->toNumber($value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what does the call to toNumber
get you?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All the other type coersion cases. It means no copy pasta, because other than integer-ness, the two methods are identical.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there anything actually functionally different? Because now we're calling is_numeric
twice whereas before it was only called once
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes - the second is_numeric
call catches cases that fall through toNumber
without being successfully coerced.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's what I mean, though--if you put the code back the way it was then there is only one call to is_numeric
. So I guess I'm still not clear on what is functionally different about the new code. Are additional values now being coerced to integer (eg. boolean) with this code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The new code also coerces from bool, null, and array, via the call to toNumber
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically the logic is to apply all the coercion rules for number, and then see if the result can be further coerced to an int. If it is, then return the cast int - if not, return the original value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
okie-doke
* | ||
* @param mixed $value | ||
* | ||
* @return array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe the return type should be array|mixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the version of PHPDoc in this project OK with that? I've had issues with types like that in the past.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(If it's not OK, it should be mixed)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Never mind, apparently this project doesn't use PHPDoc. I'll change to array|mixed
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just mixed
is fine, too. I suggested array|mixed
to emphasize that array
is primarily what we're interested in
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like your original suggestion better. If we're not using PHPdoc, then there's no reason to avoid it IMO. If such a reason crops up later, it's a pretty quick / easy change to downgrade the return type to mixed
.
584d05f
to
c9e1276
Compare
@shmax I think this PR now contains everything we intended it to contain. Is there anything you'd like me to add / change / remove / hide-in-the-corner-and-cry-from-shame-about? |
Wow, very impressive, as always! I'll take another look tomorrow when I'm fresh but for now I say LGTM |
Thanks :-). |
Hmm, just thought of one other thing - I need to ensure that the double-coercion for array cases is covered. |
Double-coercion tests done. |
c67f9fd
to
42411d8
Compare
@shmax How's this? |
README.md
Outdated
`CHECK_MODE_EARLY_COERCE` has no effect unless used in combination with `CHECK_MODE_COERCE_TYPES`. If | ||
enabled, type coercion will occur as soon as possible, even if the value may already be of another valid | ||
type. | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you maybe expand on this a bit to make it more clear? I know what you're talking about, but I'm not sure that someone else coming to the project for the first time would.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Certainly.
tests/Constraints/CoerciveTest.php
Outdated
// toType | ||
'string' => array( | ||
// fromType fromValue toValue valid Test Number | ||
array('string', '"string"', 'string', true), // #0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wow, I love the new grid, much easier to read!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually find it confusing when a code sample uses values that mimic or mirror the name of the type they are meant to be demonstrating. Instead of 'string', can we use something else?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure :-).
tests/Constraints/CoerciveTest.php
Outdated
if ($valid) { | ||
$this->assertTrue( | ||
$validator->isValid(), | ||
'Validation failed: ' . json_encode($validator->getErrors(), \JSON_PRETTY_PRINT) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
tests/Constraints/CoerciveTest.php
Outdated
*/ | ||
public function testInvalidCoerceCasesUsingAssoc($input, $schema, $errors = array()) | ||
/** @dataProvider dataCoerceCases **/ | ||
public function testCoerceCases($schema, $data, $startType, $endType, $endValue, $valid, $early = false, $assoc = false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the 7th argument was just something like $flags
? Then instead of passing in true
you could pass in CHECK_MODE_EARLY_COERCE
, which would be a little easier to read, and you don't have to keep adding more arguments to this method signature as more flags are added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea - will do.
tests/Constraints/CoerciveTest.php
Outdated
array('boolean', 'true', true, true), // #17 | ||
array('NULL', 'null', false, true), // #18 | ||
array('array', '["true"]', true, true), // #19 | ||
array('object', '{"a":"b"}', null, false), // #20 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
extra space?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup - well spotted!
tests/Constraints/CoerciveTest.php
Outdated
array('integer', '45', '45', true), // #1 | ||
array('boolean', 'true', 'true', true), // #2 | ||
array('boolean', 'false', 'false', true), // #3 | ||
array('NULL', 'null', '', true), // #4 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What's with the counting numbers in the comments? They look very nice, but it seems like the kind of thing that would be a pain to maintain going forward. What are they for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They are the test number, starting from zero. PHPUnit reports those numbers when anything fails - it makes tracking down a failing test case much faster / easier when you don't have to manually count them all every time, particularly when there are this many tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, that's what I was originally thinking about when I suggested breaking things up into smaller tests. testEarlyCoerce
, testArray
, testNoCoerceNeeded
, etc. Maybe not for the vanilla stuff, but at least the complex ones. This is okay, too, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you prefer them as separate methods? I like them better all in one place like this, because it's easier to compare them - but I can certainly break to one method per case if you'd rather.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Up to you. I can live with the numbers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then I'd like to keep them the way they are please - I find it much easier to debug this way :-).
tests/Constraints/CoerciveTest.php
Outdated
$this->assertTrue(gettype($value->negativeInteger) == 'integer'); | ||
$this->assertTrue(gettype($value->boolean) == 'boolean'); | ||
// #38 check post-coercion validation (to array) | ||
$tests[] = array( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's okay to break out these more complicated ones like this, but since even the complex ones have only a single property, can they be setup before you iterate over them on 83?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They can. The reason I did it this way is because complex tests are more likely to be added than simple ones, and adding things to the beginning or midway through requires manually renumbering all the test cases - whereas adding something to the end is easy, and doesn't require renumbering anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't suggesting adding them to the big grid, only adding them after the grid but before the loop. That way you don't have to do the brain work of trying to figure out what the json string will look like.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No problem - done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, I get it--you can't handle the complex tests the same way you do the others because they have arrays of types, so you can't slot them in the same way as the others.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or at least a few of them do. It seems like some of them could still be handled the same way as the ones in the big grid, eg:
before
// #39 check post-coercion validation (from array)
$tests[] = array(
'{"properties":{"propertyOne":{"type":"number"}}}',
'{"propertyOne":["ABC"]}',
'array', null, null, false
);
after
// #39 check post-coercion validation (from array)
$tests['number'][] = array(
'array',
["ABC"]',
null, null, false
);
I'm sure I have all that in the wrong order, but hopefully you know what I mean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point - have moved test 39 to the grid. I don't think any of the others can move though - if you spot any, please let me know, but I think they all have reasons for being where they are.
LGTM Fantastic work! Take a victory lap |
Thanks :-). Really appreciate the work you've done reviewing all this too, and the various suggestions you've made - the result is better because of that. |
ab501e8
to
7df164c
Compare
Rebased. |
@erayd conflict? |
7df164c
to
98b6516
Compare
@bighappyface I think we overlapped, and I rebased before you merged the conflict. I have rebased it again. |
98b6516
to
e64fd9e
Compare
Rebased. |
Once more, please :) |
* Add all remaining coercion cases from ajv matrix * Rewrite the coercion tests to tidy things up a bit
If set, falls back to the old behavior of coercing to the first compatible type, regardless of whether another already-valid type might be available.
* Fix whitespace * Turn $early into $extraFlags * Change "string" to "ABC" in string test * Update README.md description of CHECK_MODE_EARLY_COERCE
e64fd9e
to
a21741d
Compare
Done. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
* Improve performance - don't loop over everything if already valid * Don't coerce already-valid types (fixes jsonrainbow#379) * Add remaining coercion cases & rewrite tests * Add all remaining coercion cases from ajv matrix * Rewrite the coercion tests to tidy things up a bit * Add CHECK_MODE_EARLY_COERCE If set, falls back to the old behavior of coercing to the first compatible type, regardless of whether another already-valid type might be available. * Add multiple-type test that requires coercion * \JSON_PRETTY_PRINT doesn't exist in PHP-5.3, so work around this * Various PR cleanup stuff * Fix whitespace * Turn $early into $extraFlags * Change "string" to "ABC" in string test * Update README.md description of CHECK_MODE_EARLY_COERCE * Move loop after complex tests definition * Move test jsonrainbow#39 to grid jsonrainbow#15
What
CHECK_MODE_EARLY_COERCE
to enable previous coercion behaviorWhy
See discussion in #379 regarding coercion of already-valid types and
CHECK_MODE_EARLY_COERCE
. Other items are performance / bugfix, or are to improve testing.