#432 is highly incompatible => add additional flag #574
Comments
PCRE2 aims to be compatible with Perl, so technically that change was a bugfix. I think preserving the old behaviour might be possible by using an EXTRA flag, but could you elaborate on why that behaviour was preferred for your use case?
We are the authors of a programming language. We don't write the regexes or the subjects; the users of the programming language do, and we don't know what regular expressions they write. The language is compatible across releases, so the same program should give the same result with a new compiler. This is the standard in all programming languages as well (with minor exceptions like Python 2 => 3). Do you know how other programming languages like PHP consume PCRE2? I guess they also try to stay compatible across releases and not invalidate user code. Do they fork PCRE, or do they stay on the same version forever, ignoring all bugfixes? The best and most modern approach would be to use feature flags throughout. Is that an option? The same problem of course also holds for new features, as they also change the behavior of existing programs. We would be very unhappy to have to actively monitor all PCRE changelogs and check whether each change is incompatible, introduces a new feature, or is just a bugfix. We would then have to fork our own version of PCRE2, cherry-picking or feature-flagging all changes. Over time, this version would become drastically different from upstream. I'm really unsure what the correct approach would be here. In this particular case, I also feel that the old behavior was more consistent. But that's just personal taste.
I have seen that PHP doesn't seem to mind invalidating user code, see: php/php-src#13413
Just curious, but who is "we" and which language? PCRE2 is used in several, and some have their own per-language compatibility flag that might come in handy for that upcoming patch of yours.
Philip seemed to agree with you, so I am sure something implementing
Indeed, a new option seems like the best way out of this, and that seems like the appropriate name (similar to other option names). As you can see from the comments in #432, the change was made in response to user complaints. We try to please everybody, but sometimes an option is the only way to do it. Awaiting your patch....
Note that the change also fixes a bug in POSIX classes. Previously, [:upper:] (for example) matched only upper case when PCRE2_UCP was set, but matched both cases when PCRE2_UCP was not set. This is because [:upper:] is translated to \p{Lu} when PCRE2_UCP is set. I suppose for consistency [:upper:] should always behave in the same way as \p{Lu}.
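To make the two behaviours under discussion concrete, here is a minimal model in Python. It is not PCRE2 code and does not call PCRE2; the function name `matches_lu` and its parameters are hypothetical. It assumes that the old behaviour tested \p{Lu} against the Lu category regardless of the caseless option, and that the new, Perl-aligned behaviour lets \p{Lu} match any cased letter (Lu, Ll or Lt) when caseless matching is on.

```python
import unicodedata

# General categories that count as "cased letters" in this model (assumption:
# this mirrors Perl's treatment of \p{Lu} under /i).
CASED = {"Lu", "Ll", "Lt"}


def matches_lu(ch: str, caseless: bool, perl_compatible: bool) -> bool:
    r"""Model whether \p{Lu} matches the single character `ch`.

    perl_compatible=False models the behaviour before the change:
    the property test ignores the caseless option.
    perl_compatible=True models the behaviour after the change:
    with caseless matching on, any cased letter matches.
    """
    category = unicodedata.category(ch)
    if caseless and perl_compatible:
        return category in CASED
    return category == "Lu"


if __name__ == "__main__":
    # "\u01c5" is a titlecase letter (category Lt).
    for ch in ["A", "a", "\u01c5", "1"]:
        old = matches_lu(ch, caseless=True, perl_compatible=False)
        new = matches_lu(ch, caseless=True, perl_compatible=True)
        print(f"U+{ord(ch):04X} old={old} new={new}")
```

In this model the only inputs whose result differs between the two modes are lower case and titlecase letters matched with the caseless option set, which is the combination a compatibility flag would need to cover.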
So, what you actually have is a hypothetical problem: you're worried that some user's regex might be broken when you release the new behaviour. That's quite a sensible worry, but we should still calibrate our mental "risk model" to see this as a fairly low-impact bugfix (bringing PCRE2 into alignment with Perl and other regex engines). Keeping the old idiosyncratic behaviour has its own compatibility risks for new applications (whereas moving to the new behaviour has compatibility risks for existing applications).
I believe most downstream consumers just move to the latest releases, and rely on PCRE2 to be sufficiently stable that they don't need to worry. In this case... it's borderline whether the
That's out of scope for PCRE2's backwards compatibility.
(Philip and Zoltan, correct me if I'm wrong!)
Please understand that PCRE2 is not funded in any way, so the developers spend their free time on it. There are things we do, but spending a lot of effort on minor things is not considered worthwhile (at least from my side). Adding flags for everything increases the maintenance costs, and I would avoid that. I know people often consider regex to be something simple and assume that requests like this are simple tasks, but I would be curious how Nick's opinion of the project changed between before and after he joined it (from a newcomer's perspective).
@carenas The ABAP programming language. It is used to run large enterprise software and has very strict compatibility requirements. We are still discussing whether we can also change the behavior in ABAP. Searching for \p{Lu} with the "i" flag is quite obscure. Probably no one did that.
Don't forget the effect it also has on other related properties and even potential aliases, as pointed out by Philip in the case of
Hi,
You introduced an incompatible change in PCRE2 with #432, which caused many of our unit tests to fail.
Would it be possible to introduce a compile flag to restore the old behavior? If so, I could supply a patch as well.
Couldn't this also affect many other software products and trigger subtle bugs?
Thanks
Kilian