Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse types for validity, and smarter checking #120

Closed
wants to merge 28 commits into from

Conversation

james-cnz
Copy link
Contributor

This change parses PHPDoc types to check for validity, and also does (slightly) smarter checking between PHP types and PHPDoc types.

Type checking is based loosely on the PHPStan rule that PHPDoc types should be the same as or narrower than than PHP types (since PHPDoc types are more comprehensive, so can specify types more precisely). It recognises this as correct:

    /**
     * Test function
     * @param stdClass&object{id:int} $x
     */
    function test_fun(stdClass $x): void {
    }

And this as not:

    /**
     * Test function
     * @param stdClass|object{id:int} $x
     */
    function test_fun(stdClass $x): void {
    }

It's not very smart. Although it can parse object details, it doesn't understand them, and just treats all objects as plain objects. It's no PHPStan, but it's a step up from current type checking.

I tried to figure out the failing CI check, but don't understand what it means.

@james-cnz
Copy link
Contributor Author

james-cnz commented Aug 27, 2023

I think it's pretty complete now. It parses all standard PHP and PHPDoc types, and additionally most PHPStan types, aside from conditional return types, which would require look ahead, and global constants and a weird rule about collections, which would require looking up information about classes. It's not very thorough, it'll pass a bunch of stuff that's not valid, but, aside from the previously mentioned omissions, it should give fewer false positives than the checker currently does.
EDIT: There'll be false positives when comparing types with key-of, value-of, or class constants, because I just assume these to be wide, but the checker as it is will give false positives for these anyway, so it's no worse. I guess I'd have to figure out handling an unknown type to avoid this, but I think I'll leave it there for now.
EDIT: Actually, since these are only PHPDoc types, not native PHP types, I could go narrow instead of wide, and set them to "never". That would prevent type checking them, and avoid these false positives. What do you think?

@james-cnz
Copy link
Contributor Author

I did a diff of the output between the current checker and my change on the course directory, and looked at the first few. There were a lot of cases of parameters declared in PHPDoc as nullable, and in PHP itself not as nullable, but given a null default value. This is accepted by the checker currently, but not by my change. Apparently it's valid, but not recommended. Should it be accepted?

@james-cnz
Copy link
Contributor Author

Sorry for the spam, I was keen to get this out the door, but it wasn't really ready. To see how well it's working, I've categorised the differences between what the current checker reports and my change as it is now, when run on the Moodle code base (current master). I've used PHPStan to decide what is and isn't an error. It looks like there's still a few things to fix. Handling non-DNF types is beyond me, though. That's going to have to go in the too-hard basket. I'd like to get feedback on whether the code smell checks are still wanted, and if so, which ones (capitalisation mismatch in type would be difficult for me to add, though). If so, I think it would be good if these could be reported as warnings rather than errors, but I'm not sure how to go about that, any ideas?

False Positives No Longer Reported
PHPDoc advanced type: 102

  • array<>: 62
  • array-key: 21
  • callable: 15
  • boolean (alias of bool): 13
  • positive-int: 7
  • numeric literal: 4
  • array{}: 4
  • integer (alias of int): 2
  • Collection<>: 1
  • iterable<>: 1
  • (These don't add up, because sometimes multiple advanced types are used.)

PHPDoc type includes pass by reference &: 68
Comment after native parameter: 2
Types match in a non-trivial way: 2
Total: 174

Code Smells No Longer Reported
Native type is explicitly nullable, PHPDoc type is not: 139
Native argument splat ... not included in PHPDoc type: 49
Capitalisation mismatch in type: 47
PHPDoc type is specified as "type": 4
PHPDoc argument specifies default value (i.e. var name is immediately followed by = ): 3
No space between PHPDoc type and var: 1
Total: 243

Errors No Longer Reported
Capitalisation mismatch in var name: 4
Int considered subtype of float: 1
Total: 5

Errors Newly Reported
Missing PHPDoc type: 82
Malformed PHPDoc type: 43 (Actually less, see below.)
Variable name specified before PHPDoc type: 18
PHPDoc type is nullable, native type is not: 9
Total: 152

Code Smells Newly Reported
Array details specified in square brackets: 3
(Actually more, I probably miscategorised many as malformed PHPDoc type, because I didn't reallise this wasn't an error until quite some way through categorisation.)

False Positives Newly Reported
Non-DNF PHPDoc type: 7

  • @var array{(callable(callable(RequestInterface, array): PromiseInterface): callable), (string|null)}[]
  • @param (array|object)[]
  • @return (bool|float|int|string)[]
  • @param bool|float|int|string|(bool|float|int|string)[]
  • @var (bool|float|int|string)[]
  • @param (object|array|null)[]
  • @return (question_classified_response|array)[]

@james-cnz
Copy link
Contributor Author

james-cnz commented Sep 6, 2023

I've handled non-DNF types by checking that they parse correctly, and just giving up on the type checking if it gets too complicated. I think it's good to go now. Here's an analysis of the differences between the current checker and my change, as it is now:

False Positives No Longer Reported
Advanced types: 102

  • array<>: 60
  • array-key: 20
  • callable(): 14
  • boolean (alias of bool): 13
  • positive-int: 7
  • integer literal: 4
  • array{}: 4
  • integer (alias of int): 3
  • Collection<>: 1
  • iterable<>: 1
  • (Type|Type)[]: 1
  • (These don't add up, because sometimes multiple advanced types are used.)

PHPDoc type includes pass by reference &: 69
Comment in native argument list: 4
Types match in a non-trivial way: 2
Total: 177

Code Smells No Longer Reported
Var name ends in punctuation: 60

  • , 53
  • . 7

Case mismatch in type: 47
No space between type and var: 1
Likely malformed type: 5

  • type specified as "type": 4
  • @param array<array|Number> $args

Total: 113
I think most of these aren't indicitive of a mistake. Specifying the type as "type" or "Number" probably does indicate an error, but there could be a class named like this, so I'm not sure it should be reported. [EDIT: Apparently "number" is a valid type.]

Errors Newly Reported
Missing PHPDoc type (var name instead): 92
Malformed type: 33
Var name before PHPDoc type: 11
PHPDoc type is nullable, native type not: 9
@return CONTEXT_*
Total: 146
I can see what is meant by "CONTEXT_*"--a value of any global constant starting with "CONTEXT_", and it is valid to do this with class constants, but not with global ones, apparently.

Code Smells Newly Reported
Likely malformed type: 43

  • array(Type): 21
  • array[Type]: 12
  • array(): 4
  • Type/Type: 3
  • Type[Type]: 1
  • list(Type): 1
  • @var null@var bool

Technically I don't think these are errors--they're just a valid type with something else immediately after them, but it's indicitive of a mistake. These are caught by checking that there's a space (or punctuation ,;:.) after the type.

False Positives Newly Reported
@return Returns <code>true</code> on success and <code>false</code> on error
"Returns" is specified as the type to return here. Probably a mistake. However the parser interprets the following <code> to indicate that "Returns" is a Collection holding objects of type "code", then reports an error because there's no space after that.

@james-cnz
Copy link
Contributor Author

I've added in type checking for return values and class variables (previously only function parameters were type checked). To prevent too many new false positives, I've also added in scanning class hierarchies (within the current file), checking types aliases in use ... as ... directives, and knowledge of PHP predefined and SPL classes. This has kept new false positives down to 3. There are CI warnings due to a change in CI rules, but most don't relate to code I changed.

@andrewnicols
Copy link
Contributor

Hi @james-cnz ,

We are currently actively trying to wind down this plugin and project in favour of our existing moodlehq/moodle-cs PHP_CodeSniffer ruleset. Details of these changes can be found at moodlehq/moodle-cs#30.

This whole ruleset is the biggest blocker, but we're hoping to be able to use some off-the-shelf rules to replace it. I don't know what this means for this patch but it will have to change for the phpcs rulesets -- I'm just not sure of details yet.

Sorry to be the bearer of bad news,

Andrew

@james-cnz
Copy link
Contributor Author

Hi @andrewnicols,
OK, thanks for letting me know.

@james-cnz
Copy link
Contributor Author

Hi @andrewnicols,
If you've got time to check it out, I've done an initial port here: https://github.com/james-cnz/moodle-cs/tree/phpdoc-types-wip
It still needs some work, but the basics at least seem to be working.

@james-cnz
Copy link
Contributor Author

Replaced by moodlehq/moodle-cs#123

@james-cnz james-cnz closed this Mar 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants