Skip to content
This repository has been archived by the owner on Sep 20, 2021. It is now read-only.

Fix error handling for non-UTF-8 string in Lexer #93

Closed
wants to merge 3 commits into from

Conversation

unkind
Copy link

@unkind unkind commented Nov 7, 2018

So far Lexer fails into infinite recursion with enabled unicode.mode if input data is not properly encoded.

@unkind
Copy link
Author

unkind commented Nov 7, 2018

It looks like duplicate of #88, but this fix is more accurate.

@coveralls
Copy link

coveralls commented Nov 7, 2018

Coverage Status

Coverage remained the same at ?% when pulling b0c8e3f on unkind:bugfix-utf8-lexer into c620f44 on hoaproject:master.

* Regardless source of the bug, try to report about this exception to the library maintainers.
* Even if bug is yours, this exception must not happen.
*/
final class InternalError extends LogicException
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final is not required here. I think you can remove it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All exceptions must extend Hoa\Exception, see sibling exception in the same directory.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also name the class RegularExpression or PCRE, something less generic than InternalError.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

final is not required here. I think you can remove it.

Let's try to see it from another point of view: there is no final keyword. There is open one. You suggest to open for inheritance. Why? I mean, it's known idea that composition is preferable over inheritance for some objective reasons. So it could be just a language design mistake to make classes open to inheritance by default. Some modern languages, for example, fixed it. So what's the real reason to remove final open class for inheritance?

I would also name the class RegularExpression or PCRE, something less generic than InternalError.

I don't get it either: RegularExpression is a special language or pattern, depends on context, but not an error. Kinda weird for exception class name, IMO. You probably rely on namespaces, but I

I understand and respect your own coding style, but I'm curious how do you explain it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't use final in the existing code base. I prefer to address that as another PR if you don't mind :-). You are pointing to an interesting approach, and I like it, but yeah, keep things seperate a little bit :-).

About the class/exception name, maybe REgularExpression is not appropriate. But I don't find InternalError more appropriate, it's too much abstract.

The exception represents an error in the lexer. We can (i) reuse the Hoa\Compiler\Exception\Lexer exception class (https://github.com/hoaproject/Compiler/blob/master/Exception/Lexer.php), or (ii) create another one to reflect more precisely what is the origin of the exception, hence a better name than InternalError.

If you decide to re-use Exception\Lexer, I'm fine with that. I would even suggest to do that.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hoa\Compiler\Exception\Lexer doesn't look like internal error of the Lexer. In other words, internal error is library's failure, user did nothing wrong (at least how I interpret it; that's why I reused it for Text is not valid utf-8 string, you probably need to switch "lexer.unicode" setting off., obvious misconfiguration by user). It's like comparing 4xx HTTP error codes to 5xx.

I don't like idea to mix up semantically different errors into one exception.

InternalError is definitely generic name, I consider this exception as unchecked one; users shouldn't be aware of this exception at all. Thus, they shouldn't catch this type of exception directly (only like \Exception or \Throwable in specific parts of application where failure of subprogram can be handled).

The thing either works or it is broken. Classifying hypothetical errors is a waste of time for me as most of them are really hypothetical: you don't expect them, otherwise you'd just fixed the; you just make some "save point" for yourself to make sure that it works internally like you expected without surprises. It is similar to Assertion::blahBlah(): you don't care about exception class. Assertion just must not fail. If it does, your program is bugged like if you passed object as argument to integer parameter.

And regarding names. Practice says that verbs are way more important in programming than nouns. Stack trace says Lexer is broken and Lexer throws Lexer exception. I want to know what happened wrong. LexerEncounteredInvalidUtf8Sequence, TokenIsEmptyString, etc.

I also was confused when I found class Lexer in unit-tests which tried to mimic test of real Lexer. :)

But I can reuse Hoa\Compiler\Exception\Lexer.

Copy link
Author

@unkind unkind Nov 13, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgot to add: that's why there is no unit test for InternalError: it's impossible to achieve via public API of the class unless the code is actually bugged.

PhpStorm also doesn't inspect tag @throws if exception was inherited from \LogicException and \RuntimeException by default, that's why it was inherited from \LogicException.

But in the current moment PhpStorm just goes nuts with exceptions.

Exception/InternalError.php Show resolved Hide resolved
Llk/Lexer.php Outdated Show resolved Hide resolved
composer.json Outdated
@@ -28,6 +28,8 @@
},
"require": {
"php" : ">=5.5.0",
"ext-ctype" : "*",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ext/ctype?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PhpStorm thinks that it is implicit dependency: you use some function from that extension, but didn't declare this dependency. I don't remember exact function name, unfortunately.

Copy link
Member

@Hywan Hywan Nov 12, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, we are using ctype_digit here:

if (true === ctype_digit($matches[2])) {

Requiring an extension just for one call is… too much.

I'm OK to add ext-ctype here. However I'm going to open an issue to remove this dependency (done: #96).

@Hywan
Copy link
Member

Hywan commented Nov 12, 2018

Thanks for the PR! It shows the error and the patch is well-written. I've noted few things to update here and there, but they are all minors. Once they are fixed, it's ready to be merged.

Llk/Lexer.php Outdated Show resolved Hide resolved
@Majkl578
Copy link

Majkl578 commented Jan 9, 2019

Sorry, I didn't read the condition properly - it will need to be !==, not ===.

@flip111
Copy link

flip111 commented Jan 25, 2019

What's preventing this PR from getting merged now?

sanmai added a commit to sanmai/hoa-compiler that referenced this pull request Sep 17, 2019
@unkind
Copy link
Author

unkind commented Aug 8, 2020

I don't know what's going on with this one, sorry.

I still can't make git install:

Loading composer repositories with package information
Updating dependencies (including require-dev)
Your requirements could not be resolved to an installable set of packages.

  Problem 1
    - Installation request for hoa/file dev-master -> satisfiable by hoa/file[dev-master].
    - hoa/file dev-master requires hoa/iterator dev-master -> satisfiable by hoa/iterator[dev-master] but these conflict with your requirements or minimum-stability.
  Problem 2
    - Installation request for hoa/test dev-master -> satisfiable by hoa/test[dev-master].
    - hoa/test dev-master requires hoa/cli dev-master -> satisfiable by hoa/cli[dev-master] but these conflict with your requirements or minimum-stability.

And I still don't understand this mess with Consistency::flexEntity(Autocompleter::class);. If you need to keep BC, you can use class_alias, IDE understands it well. Probably, I missed something in this logic.

Closing this one.

@unkind unkind closed this Aug 8, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Development

Successfully merging this pull request may close these issues.

5 participants