<regex>: c++ regex character class case insensitive search problem #993

@AlexGuteniev

Describe the bug
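regex::icase is not handled correctly for character classes: in both test cases below, a bracket expression such as [a-z], [Z], or [z] compiled with icase fails to match an uppercase character in the target string.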

Command-line test case

d:\Temp2>type repro.cpp
#include <iostream>
#include <regex>

using namespace std;

int main()
{
        wstring target(L" Copyright");
        wsmatch match;
        wregex regexp(L"[a-z][a-z]", regex::icase);
        if (regex_search(target.cbegin(), target.cend(), match, regexp)) {
                wcout << L"Matched: \"" << match.str() << L"\"" << endl;
        }
        return (0);
}
d:\Temp2>cl /EHsc /W4 /WX .\repro.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29009.1 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

repro.cpp
Microsoft (R) Incremental Linker Version 14.27.29009.1
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:repro.exe
repro.obj

d:\Temp2>.\repro.exe
Matched: "op"

Expected behavior

Matched: "Co"
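
The expectation above can be restated as a small helper that returns the first match, which makes the difference between "Co" and "op" easy to check programmatically. This is a minimal sketch and not part of the original report: first_match is a hypothetical helper name, and the expected result assumes a standard library whose icase handling of bracket expressions is correct.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the report): returns the first substring of
// `target` matched by `pattern`, or an empty string when nothing matches.
std::wstring first_match(const std::wstring& pattern,
                         const std::wstring& target,
                         std::regex_constants::syntax_option_type flags) {
    const std::wregex re(pattern, flags);
    std::wsmatch match;
    // regex_search reports the leftmost match, so a correct icase
    // implementation must stop at 'C' rather than skip ahead to 'o'.
    if (std::regex_search(target, match, re)) {
        return match.str();
    }
    return std::wstring();
}
```

On a conforming implementation, first_match(L"[a-z][a-z]", L" Copyright", std::regex_constants::icase) should yield L"Co"; the behavior reported here corresponds to L"op", which is what a case-sensitive search returns.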

Command-line test case 2

d:\Temp2>type repro.cpp
#include <regex>
#include <iostream>

int main()
{
        const wchar_t* test_raw_string = L"blahZblah";
        const wchar_t* test_patterns[] = { L"[Z]", L"[z]" };

        for (const auto* test_pattern : test_patterns)
        {
                const std::wstring test_string = test_raw_string;
                std::wregex case_regex(test_pattern, std::regex_constants::ECMAScript);
                std::wcmatch match1;

                std::wcout << test_pattern << L" search " << test_string << L" case sensitive = " << std::regex_search(test_string, case_regex) << L'\n';
                std::wcout << test_pattern << L" search " << test_string << L" case sensitive with match = " << std::regex_search(test_raw_string, match1, case_regex) << L'\n';

                std::wregex icase_regex(test_pattern, std::regex_constants::icase | std::regex_constants::ECMAScript);
                std::wcmatch match2;

                std::wcout << test_pattern << L" search " << test_string << L" case insensitive = " << std::regex_search(test_string, icase_regex) << L'\n';
                std::wcout << test_pattern << L" search " << test_string << L" case insensitive with match = " << std::regex_search(test_raw_string, match2, icase_regex) << L'\n';

                std::wcout << L'\n';
        }
        return 0;
}


d:\Temp2>cl /EHsc /W4 /WX .\repro.cpp
Microsoft (R) C/C++ Optimizing Compiler Version 19.27.29009.1 for x86
Copyright (C) Microsoft Corporation.  All rights reserved.

repro.cpp
Microsoft (R) Incremental Linker Version 14.27.29009.1
Copyright (C) Microsoft Corporation.  All rights reserved.

/out:repro.exe
repro.obj

d:\Temp2>.\repro.exe
[Z] search blahZblah case sensitive = 1
[Z] search blahZblah case sensitive with match = 1
[Z] search blahZblah case insensitive = 0
[Z] search blahZblah case insensitive with match = 0

[z] search blahZblah case sensitive = 0
[z] search blahZblah case sensitive with match = 0
[z] search blahZblah case insensitive = 0
[z] search blahZblah case insensitive with match = 0

Expected behavior 2

[Z] search blahZblah case sensitive = 1
[Z] search blahZblah case sensitive with match = 1
[Z] search blahZblah case insensitive = 1
[Z] search blahZblah case insensitive with match = 1

[z] search blahZblah case sensitive = 0
[z] search blahZblah case sensitive with match = 0
[z] search blahZblah case insensitive = 1
[z] search blahZblah case insensitive with match = 1
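
The expected results above reduce to a four-entry truth table over the pattern and the icase flag. The sketch below expresses one cell of that table as a boolean helper; found is a hypothetical name introduced here for illustration, and the expected values assume a standard library with correct icase support.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the report): true when `pattern`, compiled as
// an ECMAScript wide-character regex, matches anywhere in `target`.
// `ignore_case` adds std::regex_constants::icase to the compile flags.
bool found(const wchar_t* pattern, const std::wstring& target, bool ignore_case) {
    std::regex_constants::syntax_option_type flags = std::regex_constants::ECMAScript;
    if (ignore_case) {
        flags = flags | std::regex_constants::icase;
    }
    const std::wregex re(pattern, flags);
    return std::regex_search(target, re);
}
```

For the target L"blahZblah", the expected table is: found(L"[Z]", target, false) is true, found(L"[z]", target, false) is false, and both patterns are true with ignore_case set. The output reported in this issue differs in the two ignore_case rows, where both patterns fail to match.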

STL version

Microsoft Visual Studio Professional 2019 Preview
Version 16.7.0 Preview 3.1

Additional context
This item is also tracked on Developer Community as DevCom-230267 and DevCom-246258, and internally at Microsoft as VSO-287844 / AB#287844.

See also #405

Labels: bug, fixed
