Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix some sign conversion warnings emitted by clang in cpp2regex.h #1198

Closed

Conversation

bluetarpmedia
Copy link
Contributor

This PR adds some size_t casts when accessing the match_context groups std::array to prevent some clang sign-conversion warnings.

This Cpp2:

matcher: @regex type = {
    regex_number:= R"([0-9]*)";
}

main: () -> int = {
    m: matcher = ();
    res:= m.regex_number.match("0123");
    std::print("matched: {} {}", res.matched, res.group(0));
    return 0;
}

causes clang to emit some warnings & notes (edited for brevity, the repro has the full details):

warning: implicit conversion changes signedness: 'const int' to 'size_type' (aka 'unsigned long') [-Wsign-conversion]

Repro

The diff for the regenerated cpp2regex.h is very large due to new lines inserted into the h2.

CC: @MaxSagebaum

I've left this as a draft for now. I noticed that the GCC regression tests had some failures (unrelated to regex) in my repo because GCC has used different auto identifiers:

-mixed-bounds-safety-with-assert.cpp2(11) void print_subrange(const auto:257&, cpp2::impl::in<int>, cpp2::impl::in<int>) [with auto:257 = std::vector<int>; cpp2::impl::in<int> = const int]: Bounds safety violation
+mixed-bounds-safety-with-assert.cpp2(11) void print_subrange(const auto:259&, cpp2::impl::in<int>, cpp2::impl::in<int>) [with auto:259 = std::vector<int>; cpp2::impl::in<int> = const int]: Bounds safety violation

So I'm waiting to see if that changes again.

@bluetarpmedia
Copy link
Contributor Author

@jarzec Have you seen this before with GCC? The 3 GCC jobs are showing different numbers for the auto parameters.

E.g. here's the diff from the g++-14 job:

-mixed-bounds-safety-with-assert.cpp2(11) void print_subrange(const auto:257&, cpp2::impl::in<int>, cpp2::impl::in<int>) [with auto:257 = std::vector<int>; cpp2::impl::in<int> = const int]: Bounds safety violation
+mixed-bounds-safety-with-assert.cpp2(11) void print_subrange(const auto:259&, cpp2::impl::in<int>, cpp2::impl::in<int>) [with auto:259 = std::vector<int>; cpp2::impl::in<int> = const int]: Bounds safety violation

I'm not sure what those numbers are or why GCC puts them in. This is the equivalent line for the clang output:

mixed-bounds-safety-with-assert.cpp2(11) void print_subrange(const auto &, cpp2::impl::in<int>, cpp2::impl::in<int>) [rng:auto = std::vector<int>]: Bounds safety violation


// Private functions
//
private get_group_idx: (in this, group) -> size_t = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought "since this is always used to access an element of groups, can we make a function that handles that". I ended up with this:

    private get_group: (inout this, group) -> forward match_group<Iter> = {
        assert(group >= 0);
        return groups[group as size_t];
    }

and then used like this:

    get_group(group).start = pos;

I discovered that in order to get a modifiable match_group back, the function needed inout this and a return type of forward match_group<Iter>.

@hsutter That's a little ugly. I realize that you want to eliminate references as anything but input parameters to functions, but I hadn't really thought about what that would mean for trying to write functions like this. Is your expectation that we'd do this?

    private get_group: (inout this, group) -> *match_group<Iter> = {
        assert(group >= 0);
        return groups[group as size_t]&;
    }
        get_group(group)*.start = pos;

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to get a modifiable match_group back, the function needed inout this and a return type of forward match_group<Iter>.

In general, if you're going to return a forwarded modifiable parameter, that parameter needs to be modifiable == inout:

f: (inout x: int) -> forward int = x;

main: () = {
    xx := 42;
    f(xx) = 43;  // ok, changes xx to 43
    std::cout << xx;  // prints 43
}

Does this example answer what you were asking?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this example answer what you were asking?

No, I'm talking about writing the Cpp2 equivalent of T& vector<T>::operator[](size_t index);.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought "since this is always used to access an element of groups, can we make a function that handles that".

Yes, that was exactly my thinking at first too! 😃

Because the match_context type has both const and non-const member functions that would need to call this new helper function, it meant I had to write two overloads of the same getter function that returns either a const-ref or a mutable-ref to the element in the groups array, with duplicate implementations.

private get_group: (in this, group_idx) -> forward match_group<Iter> = {
    assert(group_idx >= 0);
    return groups[group_idx as size_t];
}

private get_group: (inout this, group_idx) -> forward match_group<Iter> = {
    assert(group_idx >= 0);
    return groups[group_idx as size_t];
}

These lower to (slightly edited for clarity):

auto get_group(auto const& group_idx) const& -> match_group<Iter> const&;
auto get_group(auto const& group_idx) &      -> match_group<Iter>&;

I'm not sure if this is the intended way to write functions that return const-ref or mutable-ref but, just like you found Greg, that was the only way I could achieve it.

(In the real version of the above, there's already a public get_group function that returns the element by value, so I chose a different name.)

I didn't like this duplication so I went with the get_group_idx helper function instead.

The duplication also made me realise I really want C++23's "deducing this" in Cpp2, so I've created #1197 for that.

@jarzec
Copy link
Contributor

jarzec commented Aug 1, 2024

@jarzec Have you seen this before with GCC? The 3 GCC jobs are showing different numbers for the auto parameters.

E.g. here's the diff from the g++-14 job:

-mixed-bounds-safety-with-assert.cpp2(11) void print_subrange(const auto:257&, cpp2::impl::in<int>, cpp2::impl::in<int>) [with auto:257 = std::vector<int>; cpp2::impl::in<int> = const int]: Bounds safety violation
+mixed-bounds-safety-with-assert.cpp2(11) void print_subrange(const auto:259&, cpp2::impl::in<int>, cpp2::impl::in<int>) [with auto:259 = std::vector<int>; cpp2::impl::in<int> = const int]: Bounds safety violation

I'm not sure what those numbers are or why GCC puts them in. This is the equivalent line for the clang output:

mixed-bounds-safety-with-assert.cpp2(11) void print_subrange(const auto &, cpp2::impl::in<int>, cpp2::impl::in<int>) [rng:auto = std::vector<int>]: Bounds safety violation

This is new to me as well. I might have some time soon to have a look at it.

@bluetarpmedia bluetarpmedia deleted the cpp2regex-group-warnings branch August 7, 2024 01:15
@hsutter
Copy link
Owner

hsutter commented Aug 7, 2024

I wasn't expecting the PR to be closed -- was this resolved some other way, or are you waiting this to be resolved via supporting forward this parameters or another issue/PR?

@gregmarr
Copy link
Contributor

gregmarr commented Aug 7, 2024

@hsutter Replaced by #1212

@bluetarpmedia
Copy link
Contributor Author

Yes, #1212 covers this and future cases like it, but I forgot to add a note here to explain why I closed this one!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants