Negation in nested character classes doesn’t work for case-insensitive regexes #24615

Open
piotrrzysko opened this issue Jan 2, 2025 · 0 comments
It seems that when using Joni as a regex library, negation in nested character classes is not respected in case-insensitive mode. The following examples demonstrate the issue:

-- Returns true
SELECT regexp_like('q', '(?i)[[^Q]]');
-- Returns false
SELECT regexp_like('q', '(?i)[^Q]');
-- Returns true
SELECT regexp_like('q', '(?i)[A-Z&&[^q]]');
-- Returns false
SELECT regexp_like('q', '(?i)[A-Z&&[^Q]]');

By comparison, Java's regex engine (whose syntax Trino follows, according to the documentation) returns false for all four of the cases above.
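The Java side of this comparison can be checked with a short program against `java.util.regex`. This is a minimal sketch: the class name `NestedNegationCheck` and the helper `found` are made up for illustration, and it assumes that `Matcher.find()` is a fair stand-in for how `regexp_like` searches for a match.

```java
import java.util.regex.Pattern;

// Hypothetical helper class; only java.util.regex is a real API here.
class NestedNegationCheck {
    // True if the regex matches anywhere in the input (find semantics,
    // assumed here to correspond to regexp_like).
    static boolean found(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        // The four patterns from the SQL examples, all applied to "q".
        String[] patterns = {
            "(?i)[[^Q]]",
            "(?i)[^Q]",
            "(?i)[A-Z&&[^q]]",
            "(?i)[A-Z&&[^Q]]",
        };
        for (String p : patterns) {
            System.out.println(p + " -> " + found(p, "q"));
        }
    }
}
```

Each line printed should end in `false` if Java behaves as described above.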

This discrepancy appears to occur because Joni applies case folding only to the top-level character class: link. It doesn’t do so when it encounters a nested character class while recursively parsing the top-level class: link.
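To illustrate why the order of case folding and negation would matter, here is a toy model over ASCII sets. It is not Joni's code: `FoldOrderModel` and all of its methods are hypothetical, and the model simply assumes that the top-level class is folded once at the end, while a nested negated class is either left unfolded (`foldNested = false`) or folded before it is negated (`foldNested = true`).

```java
import java.util.BitSet;

// Toy model over ASCII (0-127). All names are hypothetical, not Joni's API.
class FoldOrderModel {
    static final int N = 128;

    static BitSet single(char c) {
        BitSet b = new BitSet(N);
        b.set(c);
        return b;
    }

    static BitSet range(char lo, char hi) {
        BitSet b = new BitSet(N);
        b.set(lo, hi + 1);
        return b;
    }

    // Add the other-case ASCII letter for every letter in the set.
    static BitSet fold(BitSet in) {
        BitSet out = (BitSet) in.clone();
        for (int c = in.nextSetBit(0); c >= 0; c = in.nextSetBit(c + 1)) {
            if (Character.isUpperCase(c)) out.set(Character.toLowerCase(c));
            else if (Character.isLowerCase(c)) out.set(Character.toUpperCase(c));
        }
        return out;
    }

    static BitSet negate(BitSet in) {
        BitSet out = (BitSet) in.clone();
        out.flip(0, N);
        return out;
    }

    // Models the nested class [^x]; folding before negation is optional.
    static BitSet nestedNegated(char x, boolean foldNested) {
        BitSet inner = single(x);
        if (foldNested) inner = fold(inner);
        return negate(inner);
    }

    // Models (?i)[[^x]] applied to 'q'; the top level is always folded.
    static boolean matchesNested(char x, boolean foldNested) {
        return fold(nestedNegated(x, foldNested)).get('q');
    }

    // Models (?i)[A-Z&&[^x]] applied to 'q'.
    static boolean matchesIntersection(char x, boolean foldNested) {
        BitSet s = range('A', 'Z');
        s.and(nestedNegated(x, foldNested));
        return fold(s).get('q');
    }

    public static void main(String[] args) {
        for (boolean f : new boolean[] {false, true}) {
            System.out.println("foldNested=" + f
                + " [[^Q]]=" + matchesNested('Q', f)
                + " [A-Z&&[^q]]=" + matchesIntersection('q', f)
                + " [A-Z&&[^Q]]=" + matchesIntersection('Q', f));
        }
    }
}
```

Under these assumptions, `foldNested = false` gives true, true, false for the three nested patterns, matching the results reported above, and `foldNested = true` gives false for all three, matching Java.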
