
Use the same name of a capture group in different alternatives of disjunction (| operator) #1128


Closed
fedy-cz opened this issue Nov 16, 2023 · 1 comment


fedy-cz commented Nov 16, 2023

It looks like the regex crate requires every named capture group to have a unique name. Maybe it could allow the same name to be reused in different branches of the disjunction operator (|).

For example, the JavaScript regex engine allows this (quote):

All names must be unique within the same pattern. Multiple named capturing groups with the same name result in a syntax error.
This restriction is relaxed if the duplicate named capturing groups are not in the same disjunction alternative, so for any string input, only one named capturing group can actually be matched. This is a much newer feature, so check browser compatibility before using it.

/(?<year>\d{4})-\d{2}|\d{2}-(?<year>\d{4})/;
// Works; "year" can either come before or after the hyphen

This could make some "simple parsers" a little bit nicer.

@BurntSushi
Member

Duplicate of #492.

Note that regex-automata lets you search for multiple regexes with duplicative capture names. That likely achieves the effect you want here, but you'll need to drop down to regex-automata which exists at a bit of a lower level of abstraction. The easiest path will be to use the meta regex engine.

@BurntSushi closed this as not planned on Nov 16, 2023