Skip to content

Conversation

ju1ius
Copy link
Contributor

@ju1ius ju1ius commented Jul 29, 2016

mb_ereg, mb_ereg_search_regs & mb_ereg_search_getregs returned
only numbered capturing groups.
Now they return both numbered and named capturing groups.
Fixes Bug #72704.

`mb_ereg`, `mb_ereg_search_regs` & `mb_ereg_search_getregs` returned
only numbered capturing groups.
Now they return both numbered and named capturing groups.
Fixes Bug #72704.
@ju1ius ju1ius force-pushed the onig-named-captures branch from 50770bd to 1eb4e4c Compare July 29, 2016 07:06
@cmb69
Copy link
Member

cmb69 commented Jul 29, 2016

I'm not sure that the current behavior can be regarded as bug (the documentation isn't particular clear wrt. named subpatterns) . Anyway, adding the named groups would cause a BC break (some existing code might use count($regs), for instance).

It might be better to treat this as enhancement, and to apply to master (or maybe PHP 7.1) only.

@ju1ius
Copy link
Contributor Author

ju1ius commented Jul 29, 2016

I'm not sure that the current behavior can be regarded as bug

Does this look right to you ?

mb_ereg_search_init('foobaroo');
mb_ereg_search_regs('f(?<a>oo)(bar)(\k<a>)');
// => ['foobaroo', 'oo']

If named groups are officially not supported, then shouldn't ONIG_OPTION_CAPTURE_GROUP be set by default so that named subpatterns do not prevent numbered ones from being captured?

the documentation isn't particular clear

😆

It might be better to treat this as enhancement, and to apply to master (or maybe PHP 7.1) only.

Fair enough. Which one do you advise?

@cmb69
Copy link
Member

cmb69 commented Jul 30, 2016

Does this look right to you ?

No, that looks like a bug to me! I have to admit that I have basically no experience with mb_ereg*, and that I didn't carefully read your bug report and this PR, and that I assumed that named capturing groups would have been already returned as elements with consecutive (non-overlapping) numeric keys.

Fair enough. Which one do you advise?

In hindsight, at the very least PHP 7.1 (which is in beta, so no new features, but certainly even BC-breaking bug fixes are okay), but maybe even PHP 5.6. Not sure, but we might only break already broken code.

Wrt. the mb_ereg* docs: indeed, they need to be improved…

@ju1ius
Copy link
Contributor Author

ju1ius commented Jul 30, 2016

I have to admit that I have basically no experience with mb_ereg*

Yeah judging by the number of comments, it is not widely used, which is a shame because Oniguruma is awesome.

I assumed that named capturing groups would have been already returned as elements with consecutive (non-overlapping) numeric keys.

The thing is that by default, Oniguruma turns numbered subpatterns to non-capturing when any named subpattern is present. To change this behaviour you must use the ONIG_OPTION_CAPTURE_GROUP (numbered & named are always capturing) or ONIG_OPTION_DONT_CAPTURE_GROUP (numbered is never capturing) options, as stated in the docs.
Unfortunately the mbstring extension does not expose these and does not implement named captures, so you end up with this weird behavior.

Anyway, I ended up submitting #2044 to master, in which I also intend to implement named captures in mb_ereg_replace.

@yohgaki
Copy link
Contributor

yohgaki commented Aug 27, 2016

Close this one and keep open #2044 ?

@cmb69
Copy link
Member

cmb69 commented Aug 28, 2016

@yohgaki The basic question is whether we consider the current behavior as a bug, see https://3v4l.org/78onQ, for instance. If it is a bug, and we can do the respective fix without breaking BC (mb_regex_set_options() appear to allow that; thanks!), it should go into PHP 5.6, in my opinion.

@php-pulls
Copy link

Comment on behalf of krakjoe at php.net:

Since this PR targets a security only branch, and since there is another PR targeting 7, I'm closing this PR.

@php-pulls php-pulls closed this Jan 1, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants