PHPLIB-1250 Split encoders and fix psalm issues #46

Merged
merged 17 commits into from
Jan 29, 2024

Member

@alcaeus alcaeus commented Jan 23, 2024

PHPLIB-1250

This PR splits the main BuilderEncoder into multiple files. This allows for more isolation of separate code (which I'm sure can be improved further), but also allows for easy introduction of custom codecs (not tested yet).

In addition, this PR cleans up a number of psalm issues and adds other issues to the baseline. I'll comment on those as necessary.

- <files psalm-version="5.15.0@5c774aca4746caf3d239d9c8cadb9f882ca29352">
+ <files psalm-version="5.20.0@3f284e96c9d9be6fe6b15c79416e1d1903dcfef4">
<file src="src/Builder/Encoder/AbstractExpressionEncoder.php">
<MixedAssignment>
Member Author

A lot of MixedAssignment issues are the result of incomplete type annotations throughout the generated files.

</MixedAssignment>
</file>
<file src="src/Builder/Encoder/FieldPathEncoder.php">
<NoInterfaceProperties>
Member Author

Fixing this involves creating a new generator for field path classes so we can add a getName interface method. It irks me that public properties cannot be part of the interface contract, but here we are.
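
Editorial sketch of the idea above, not code from this PR: moving the name behind an interface method lets static analysis verify the access. `FieldPathInterface`, `ExampleFieldPath`, `encodeFieldPath()` and the `$` prefix are hypothetical and chosen for illustration only.

```php
<?php

declare(strict_types=1);

// Hypothetical interface: a method can be part of the contract,
// whereas the public $name property could not be declared here.
interface FieldPathInterface
{
    public function getName(): string;
}

// Hypothetical implementing class that still exposes the public property.
final class ExampleFieldPath implements FieldPathInterface
{
    public function __construct(public readonly string $name)
    {
    }

    public function getName(): string
    {
        return $this->name;
    }
}

// Assumed encoding rule for illustration: prefix the name with "$".
function encodeFieldPath(FieldPathInterface $fieldPath): string
{
    // Psalm can verify this call; $fieldPath->name would raise NoInterfaceProperties.
    return '$' . $fieldPath->getName();
}

echo encodeFieldPath(new ExampleFieldPath('address.city')), PHP_EOL;
```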

<code>$val</code>
<code>$val</code>
</MixedAssignment>
<UndefinedConstant>
Member Author

Similar to the interface property above, this undefined constant stems from us checking for an interface, then accessing a class constant.

Member

According to Object Interfaces: Constants, PHP 8.1 allows overriding constants. Is there any reason Psalm cannot support this?

Or does it stem from OperatorInterface not actually defining the constant, and it just being a convention in this library for implementing classes to do so? If so, perhaps we need to incorporate a new method for the encoding directive?

Member Author

Removed this entry and added a constant to OperatorInterface.
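
Editorial sketch of what this thread discusses, assuming PHP 8.1 or later: once the interface declares the constant, a class may override it and `$operator::ENCODE` is well defined for any `OperatorInterface`. The `Multiple` case, `ExampleMultipleOperator` and `encodingFor()` are hypothetical names, not part of the library.

```php
<?php

declare(strict_types=1);

// Reduced, hypothetical version of the enum; only Single appears in the PR.
enum Encode
{
    case Single;
    case Multiple;
}

interface OperatorInterface
{
    /** To be overridden by implementing classes */
    public const ENCODE = Encode::Single;
}

final class ExampleMultipleOperator implements OperatorInterface
{
    // Since PHP 8.1 a class may override a constant inherited from an interface.
    public const ENCODE = Encode::Multiple;
}

function encodingFor(OperatorInterface $operator): Encode
{
    // Resolved against the runtime class, so the override wins.
    return $operator::ENCODE;
}

var_dump(encodingFor(new ExampleMultipleOperator()) === Encode::Multiple);
```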

</MixedAssignment>
</file>
<file src="src/Builder/Projection/ElemMatchOperator.php">
<MixedArgumentTypeCoercion>
Member Author

MixedArgumentTypeCoercion errors are due to the query expression not using the correct array element types but instead using the array type. I tried to fix this, but the code generator doesn't accept a type like array<int|string>. This will have to be looked at separately.

<PropertyTypeCoercion>
<code>$expression</code>
</PropertyTypeCoercion>
<TooManyTemplateParams>
Member Author

We use stdClass<foo|bar> to indicate an object with an unknown number of properties of a specific type. However, stdClass and object cannot be templated this way, which is why we're seeing this error. The stdClass{foo: int} syntax doesn't work for us here as we don't know the keys of the object, but only the expected value types. I've asked in the Symfony Slack to get more info about this and will update once I know more.

</TooManyTemplateParams>
</file>
<file src="src/Builder/Type/OutputWindow.php">
<DocblockTypeContradiction>
Member Author

This and the RedundantConditionGivenDocblockType checks are psalm telling us that the condition is impossible due to the docblock. However, since we're being extra cautious here and any user not using psalm is able to send any types they want, I think it's fair to keep this around unless we want to accept potential fatal errors when users abuse this.
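
Editorial sketch of the kind of defensive check described above. `ExampleWindow` and its `$documents` parameter are hypothetical, not the actual `OutputWindow` code. The docblock promises a list, but PHP only enforces the native `array` type, so the runtime check still protects callers who do not run Psalm.

```php
<?php

declare(strict_types=1);

final class ExampleWindow
{
    /** @param list<int|string> $documents Hypothetical parameter, for illustration only */
    public function __construct(public readonly array $documents)
    {
        // Given the docblock, a static analyser considers this branch impossible.
        // At runtime nothing stops a caller from passing a keyed array.
        if (! array_is_list($documents)) {
            throw new InvalidArgumentException('Expected $documents to be a list');
        }
    }
}

try {
    new ExampleWindow(['start' => 1, 'end' => 5]);
} catch (InvalidArgumentException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```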

return $value instanceof Pipeline;
}

public function encode(mixed $value): stdClass|array|string
Member

This always returns a list.

Suggested change
- public function encode(mixed $value): stdClass|array|string
+ public function encode(mixed $value): array

Member

In this case, would list be preferable?

Member Author

I changed the return type to array, and the resulting type to list<array|stdClass|string> as list itself wasn't sufficient. Since we don't have BuilderEncoder::encode typed any more strictly, this is the best we can do.

Member Author

Correction: it's list<mixed>, as encodeIfSupported is used throughout the encoders, which results in an effective return type of mixed whenever we defer to the BuilderEncoder class.
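
Editorial sketch of the typing described above: the native return type stays `array`, and the docblock narrows it to `list<mixed>` because each element comes from a callback whose result is effectively `mixed`. `encodeStages()` and its parameters are hypothetical stand-ins for the pipeline encoder.

```php
<?php

declare(strict_types=1);

/**
 * @param iterable<mixed>        $stages
 * @param callable(mixed): mixed $encodeIfSupported stand-in for the deferred encoder call
 *
 * @return list<mixed>
 */
function encodeStages(iterable $stages, callable $encodeIfSupported): array
{
    $encoded = [];
    foreach ($stages as $stage) {
        // Appending with [] discards the input keys, so the result is always a list.
        $encoded[] = $encodeIfSupported($stage);
    }

    return $encoded;
}

$result = encodeStages(['first' => 'a', 'second' => 'b'], strtoupper(...));
var_dump(array_is_list($result));
```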


public function encode(mixed $value): stdClass|array|string
{
if (! $this->canEncode($value)) {
Member

We need a method encodeIfSupported to avoid repeating these 3 lines in all encoders.

Member

Isn't encodeIfSupported() already provided via the EncodeIfSupported trait above? The trait implementation utilizes the canEncode() and encode() methods.

I understand that doesn't address the repetition of checking canEncode() in each encode() method, though.
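
Editorial sketch of the relationship discussed here, written from the description in this thread rather than copied from the library: the trait builds `encodeIfSupported()` on top of `canEncode()` and `encode()`, while `encode()` keeps its own guard because callers may invoke it directly. `UppercaseEncoder` is a hypothetical encoder.

```php
<?php

declare(strict_types=1);

// Assumed shape of the trait, based on the comment above.
trait EncodeIfSupported
{
    abstract public function canEncode(mixed $value): bool;

    abstract public function encode(mixed $value): mixed;

    public function encodeIfSupported(mixed $value): mixed
    {
        // Unsupported values are passed through unchanged.
        return $this->canEncode($value) ? $this->encode($value) : $value;
    }
}

final class UppercaseEncoder
{
    use EncodeIfSupported;

    public function canEncode(mixed $value): bool
    {
        return is_string($value);
    }

    public function encode(mixed $value): string
    {
        // The guard is repeated because nothing forces callers to check canEncode() first.
        if (! $this->canEncode($value)) {
            throw new InvalidArgumentException('Unsupported value');
        }

        return strtoupper($value);
    }
}

$encoder = new UppercaseEncoder();
var_dump($encoder->encodeIfSupported('abc'), $encoder->encodeIfSupported(42));
```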

Comment on lines 63 to 65
$encoder = $this->getEncoderFor($value);

return $result;
return $encoder !== null && $encoder->canEncode($value);
Member

There is syntax for that:

return (bool) $this->getEncoderFor($value)?->canEncode($value);

Member Author

Updated, along with a later check.
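
Editorial sketch comparing the two forms from this thread. With the nullsafe operator, a missing encoder short-circuits the call to `null`, and the `(bool)` cast turns that into `false`. `ExampleEncoder` and the free function `getEncoderFor()` are hypothetical stand-ins for the real lookup.

```php
<?php

declare(strict_types=1);

final class ExampleEncoder
{
    public function canEncode(mixed $value): bool
    {
        return is_string($value);
    }
}

// Hypothetical lookup: an encoder exists for scalars only.
function getEncoderFor(mixed $value): ?ExampleEncoder
{
    return is_scalar($value) ? new ExampleEncoder() : null;
}

function canEncodeExplicit(mixed $value): bool
{
    $encoder = getEncoderFor($value);

    return $encoder !== null && $encoder->canEncode($value);
}

function canEncodeNullsafe(mixed $value): bool
{
    // ?-> yields null when there is no encoder; (bool) null is false.
    return (bool) getEncoderFor($value)?->canEncode($value);
}

foreach (['text', 42, ['an', 'array']] as $value) {
    var_dump(canEncodeExplicit($value) === canEncodeNullsafe($value));
}
```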

$result->{$key} = $this->encodeIfSupported($value);
}
if (! $encoder || ! $encoder->canEncode($value)) {
throw UnsupportedValueException::invalidEncodableValue($value);
Member

The $encoder->canEncode($value) part is already checked inside $encoder->encode($value).

Member Author

Yes, however there is no guarantee that somebody will never call encode with an invalid value, as the contract does not require calling canEncode for a value beforehand.

We could solve this with the introduction of a null type for the $value argument in the Encoder::encode definition, but that would require us to always accept null values in any encoder, which might still necessitate calling canEncode. never, PHP's true bottom type, is unfortunately not available as an argument type, so there's no proper typing way around this.

Comment on lines +156 to +159
$object = new stdClass();
$object->{$value->getOperator()} = $result;

return $object;
Member

Do we want a one-liner for this?

Suggested change
- $object = new stdClass();
- $object->{$value->getOperator()} = $result;
- return $object;
+ return (object) [$value->getOperator() => $result];

Member

I'd like to microbench both implementations.

Member Author

Deferring to @GromNaN for this

Member Author

Since I was already microbenching array_key_exists, I also checked this. There's no significant performance difference (30 ms when testing 1M object creations), so there's no performance benefit in using either.
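
Editorial sketch of how the two variants could be compared; this is not the benchmark that was actually run, and timings will vary by machine and PHP version. `viaStdClass()`, `viaCast()` and `timeIt()` are hypothetical helper names.

```php
<?php

declare(strict_types=1);

function viaStdClass(string $operator, mixed $result): stdClass
{
    $object = new stdClass();
    $object->{$operator} = $result;

    return $object;
}

function viaCast(string $operator, mixed $result): stdClass
{
    // Casting an array to object produces a stdClass with the same keys as properties.
    return (object) [$operator => $result];
}

// Both variants build equal objects.
var_dump(viaStdClass('$sum', 1) == viaCast('$sum', 1));

/** Returns the elapsed time in milliseconds for $iterations calls. */
function timeIt(callable $fn, int $iterations): float
{
    $start = hrtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn('$sum', $i);
    }

    return (hrtime(true) - $start) / 1e6;
}

foreach (['viaStdClass', 'viaCast'] as $name) {
    printf("%s: %.1f ms\n", $name, timeIt($name, 100_000));
}
```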

src/Builder/Encoder/QueryEncoder.php Show resolved Hide resolved
throw UnsupportedValueException::invalidEncodableValue($value);
}

return '$$' . $value->name;
Member

In FieldPathEncoder you have the following:

// TODO: needs method because of interface

Is this also affected? If so, can we cover both in a single JIRA ticket?

Member

@alcaeus: ☝️

Member Author

@alcaeus alcaeus Jan 29, 2024

Tracked in PHPLIB-1379.

@alcaeus alcaeus force-pushed the phplib-1250-split-encoders branch 2 times, most recently from b5dba67 to f65ba10 on January 26, 2024 08:46
Member

@jmikola jmikola left a comment

LGTM, but note my outstanding question about TODO comments.

@@ -9,5 +9,8 @@
*/
interface OperatorInterface
{
/** To be overridden by implementing classes */
public const ENCODE = Encode::Single;
Member

Noted that you're defaulting to Single here instead of introducing an invalid case. My thinking was the invalid case would make it easier to catch mistakes when failing to override this, since Single would just happen to work (at least for encoding) provided the class only had a single property.

Member

@GromNaN GromNaN Jan 26, 2024

You can add an Undefined case to the enum. We could also use null, but the const may be typed one day.

Member

@jmikola jmikola Jan 27, 2024

I'm personally a fan of Undefined but won't push for it.

Member Author

Introduced an Undefined case to the enum. Agree with not abusing null for this to keep the constant strictly typed.
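
Editorial sketch of why an `Undefined` default helps, under the assumption that the encoder rejects it: a class that forgets to override the constant fails loudly instead of silently being treated as `Single`. The enum is reduced to two cases, and `ForgotToOverride`, `ExampleSingleOperator` and `resolveEncoding()` are hypothetical.

```php
<?php

declare(strict_types=1);

enum Encode
{
    case Single;
    case Undefined;
}

interface OperatorInterface
{
    /** To be overridden by implementing classes */
    public const ENCODE = Encode::Undefined;
}

// Hypothetical operator that forgets to override the constant.
final class ForgotToOverride implements OperatorInterface
{
}

final class ExampleSingleOperator implements OperatorInterface
{
    public const ENCODE = Encode::Single;
}

function resolveEncoding(OperatorInterface $operator): Encode
{
    if ($operator::ENCODE === Encode::Undefined) {
        throw new LogicException(sprintf('%s must override the ENCODE constant', $operator::class));
    }

    return $operator::ENCODE;
}

try {
    resolveEncoding(new ForgotToOverride());
} catch (LogicException $e) {
    echo $e->getMessage(), PHP_EOL;
}
```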

@alcaeus alcaeus merged commit 397ff5d into mongodb:0.1 Jan 29, 2024
5 checks passed
@alcaeus alcaeus deleted the phplib-1250-split-encoders branch January 29, 2024 07:59
3 participants