Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

exhaustiveness checking hangs forever when compiling pulsar-rs #118437

Closed
xxchan opened this issue Nov 29, 2023 · 21 comments · Fixed by #118796
Closed

exhaustiveness checking hangs forever when compiling pulsar-rs #118437

xxchan opened this issue Nov 29, 2023 · 21 comments · Fixed by #118796
Assignees
Labels
A-exhaustiveness-checking Relating to exhaustiveness / usefulness checking of patterns C-bug Category: This is a bug. I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly.

Comments

@xxchan
Copy link
Contributor

xxchan commented Nov 29, 2023

git clone https://github.com/streamnative/pulsar-rs
cd pulsar-rs
cargo +nightly check

Then it hangs forever

bisect leads to ee80c8d
cc @Nadrieril @cjgillot

searched nightlies: from nightly-2023-11-26 to nightly-2023-11-29
regressed nightly: nightly-2023-11-27
searched commit range: f5dc265...6cf0888
regressed commit: ee80c8d

bisected with cargo-bisect-rustc v0.6.7

Host triple: aarch64-apple-darwin
Reproduce with:

cargo bisect-rustc --start 2023-11-26 --script ./test.sh 

bisect script:

#!/bin/sh

set -ex

timeout 40s cargo check
@xxchan xxchan added C-bug Category: This is a bug. regression-untriaged Untriaged performance or correctness regression. labels Nov 29, 2023
@rustbot rustbot added I-prioritize Issue: Indicates that prioritization has been requested for this issue. needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. labels Nov 29, 2023
@apiraino
Copy link
Contributor

@xxchan thanks for the report! Do you think you can help in producing a bit smaller reproducible of this issue? That would be great to build a test case and help the debugging.

Note: compiling the entire project will eat all the memory available

@rustbot label +E-needs-mcve +I-compilemem

@rustbot rustbot added E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. labels Nov 29, 2023
@Nadrieril
Copy link
Member

I minimized it to this giant match statement, which still depends on the generated protobuf types. I'm working on reducing further

@Nadrieril
Copy link
Member

Nadrieril commented Nov 29, 2023

Minimized to this:

struct BaseCommand {
    ack: bool,
    ack_response: bool,
    active_consumer_change: bool,
    add_partition_to_txn: bool,
    add_partition_to_txn_response: bool,
    add_subscription_to_txn: bool,
    add_subscription_to_txn_response: bool,
    auth_challenge: bool,
    auth_response: bool,
    close_consumer: bool,
    close_producer: bool,
    connect: bool,
    connected: bool,
    consumer_stats: bool,
    consumer_stats_response: bool,
    end_txn: bool,
    end_txn_on_partition: bool,
    end_txn_on_partition_response: bool,
    end_txn_on_subscription: bool,
    end_txn_on_subscription_response: bool,
    end_txn_response: bool,
    error: bool,
    flow: bool,
    get_last_message_id: bool,
    get_last_message_id_response: bool,
    get_or_create_schema: bool,
    get_or_create_schema_response: bool,
    get_schema: bool,
    get_schema_response: bool,
    get_topics_of_namespace: bool,
    get_topics_of_namespace_response: bool,
    lookup_topic: bool,
    lookup_topic_response: bool,
    message: bool,
    new_txn: bool,
    new_txn_response: bool,
    partition_metadata: bool,
    partition_metadata_response: bool,
    ping: bool,
    pong: bool,
    producer: bool,
    producer_success: bool,
    reached_end_of_topic: bool,
    redeliver_unacknowledged_messages: bool,
    seek: bool,
    send: bool,
    send_error: bool,
    send_receipt: bool,
    subscribe: bool,
    success: bool,
    tc_client_connect_request: bool,
    tc_client_connect_response: bool,
    unsubscribe: bool,
}

fn request_key(command: BaseCommand) {
    match command {
        BaseCommand {
            subscribe: true, ..
        } => {}
        BaseCommand {
            partition_metadata: true,
            ..
        } => {}
        BaseCommand {
            partition_metadata_response: true,
            ..
        } => {}
        BaseCommand {
            lookup_topic: true, ..
        } => {}
        BaseCommand {
            lookup_topic_response: true,
            ..
        } => {}
        BaseCommand { producer: true, .. } => {}
        BaseCommand {
            producer_success: true,
            ..
        } => {}
        BaseCommand {
            unsubscribe: true, ..
        } => {}
        BaseCommand { seek: true, .. } => {}
        BaseCommand {
            close_producer: true,
            ..
        } => {}
        BaseCommand { success: true, .. } => {}
        BaseCommand { error: true, .. } => {}
        BaseCommand {
            consumer_stats: true,
            ..
        } => {}
        BaseCommand {
            consumer_stats_response: true,
            ..
        } => {}
        BaseCommand {
            get_last_message_id: true,
            ..
        } => {}
        BaseCommand {
            get_last_message_id_response: true,
            ..
        } => {}
        BaseCommand {
            get_topics_of_namespace: true,
            ..
        } => {}
        BaseCommand {
            get_topics_of_namespace_response: true,
            ..
        } => {}
        BaseCommand {
            get_schema: true, ..
        } => {}
        BaseCommand {
            get_schema_response: true,
            ..
        } => {}
        BaseCommand { send: true, .. } => {}
        BaseCommand {
            send_error: true, ..
        } => {}
        BaseCommand {
            send_receipt: true, ..
        } => {}
        BaseCommand {
            active_consumer_change: true,
            ..
        } => {}
        BaseCommand { message: true, .. } => {}
        BaseCommand { flow: true, .. } => {}
        BaseCommand {
            redeliver_unacknowledged_messages: true,
            ..
        } => {}
        BaseCommand {
            reached_end_of_topic: true,
            ..
        } => {}
        BaseCommand { ack: true, .. } => {}
        BaseCommand {
            close_consumer: true,
            ..
        } => {}
        BaseCommand {
            auth_challenge: true,
            ..
        } => {}
        BaseCommand { connect: true, .. } => {}
        BaseCommand {
            connected: true, ..
        } => {}
        BaseCommand { ping: true, .. } => {}
        BaseCommand { pong: true, .. } => {}

        _ => {}
    }
}

@Nadrieril
Copy link
Member

What surprises me is not that this hangs exhaustiveness checking, it's that it worked before #117611 x)

@Nadrieril
Copy link
Member

Nadrieril commented Nov 29, 2023

Ah no I get why. Well this is literally the worst case scenario for exhaustiveness. It's going through the 2^number_of_fields possibilities to determine which arms are reachable

@Nadrieril
Copy link
Member

This is terrible, I think the premise of #117611 might be fully broken :'(

@Nadrieril
Copy link
Member

Nadrieril commented Nov 29, 2023

Well, a related case hangs on stable too: playground

At that point we're hitting the fact that exhaustiveness is NP-hard, so occasional exponential behavior is unavoidable.

@xxchan
Copy link
Contributor Author

xxchan commented Nov 29, 2023

Do you think practically it's better to rewrite such code to workaround this issue?

@Nadrieril
Copy link
Member

It's definitely a case that stretches the limits of what we can handle. If the struct only had 20 fields it would compile fine.

I do think this is a reasonable case though. I intend to revert my PR if I can't find a better solution

@apiraino
Copy link
Contributor

apiraino commented Nov 29, 2023

hey folks, thanks for digging deep into this. Let me know if it would be useful to loop in the compiler team. I can nominate this issue and have them have a look at the next meeting (tomorrow Thu. 30th)

@Nadrieril
Copy link
Member

I think we'd only need their help if this requires an emergency revert/backport. Do you happen to know how long we have before this makes it into beta?

@apiraino
Copy link
Contributor

If I read this page correctly, the next beta cut will be on Dec, 20th

@Nadrieril
Copy link
Member

Ok that's perfect, thanks for figuring this out. Gives me enough time to figure out a proper solution

@Nadrieril Nadrieril added the A-exhaustiveness-checking Relating to exhaustiveness / usefulness checking of patterns label Nov 29, 2023
@estebank
Copy link
Contributor

@Nadrieril regardless of whether the current algorithm is kept (tweaked) or reverted, it'd be nice if we could somehow detect "the number of match arm conditions is high too high" and have an eager warning for it to gently nudge people away from it.

@Nadrieril
Copy link
Member

Oh yeah, good idea! That would make everyone's life much easier. I have an idea of how to detect that

@Nadrieril
Copy link
Member

I found a fix! I'll let #118308 get reviewed first because it would conflict; I'll submit a PR after that

@saethlin saethlin removed the needs-triage This issue may need triage. Remove it if it has been sufficiently triaged. label Nov 30, 2023
@apiraino
Copy link
Contributor

apiraino commented Nov 30, 2023

WG-prioritization assigning priority (Zulip discussion).

Also, it does not reproduce on the latest beta, so it's a nightly regression.

@rustbot label -I-prioritize +P-high -E-needs-mcve

@rustbot rustbot added P-high High priority and removed I-prioritize Issue: Indicates that prioritization has been requested for this issue. E-needs-mcve Call for participation: This issue has a repro, but needs a Minimal Complete and Verifiable Example labels Nov 30, 2023
@apiraino apiraino removed the regression-untriaged Untriaged performance or correctness regression. label Nov 30, 2023
@apiraino apiraino added the regression-from-stable-to-nightly Performance or correctness regression from stable to nightly. label Nov 30, 2023
@Nadrieril
Copy link
Member

Nadrieril commented Dec 1, 2023

To avoid cases like this in the future I'm thinking of adding a resource counter to exhaustiveness checking, so we can abort with a clear message instead of hanging. I know rust-analyzer wants that too.

@xxchan xxchan changed the title rustc hangs forever when compiling pulsar-rs exhaustiveness checking hangs forever when compiling pulsar-rs Dec 2, 2023
bors added a commit to rust-lang-ci/rust that referenced this issue Dec 10, 2023
…, r=<try>

Exhaustiveness: Improve performance on wide matches

rust-lang#118437 revealed an exponential case in exhaustiveness checking. While [exponential cases are unavoidable](https://compilercrim.es/rust-np/), this one only showed up after my rust-lang#117611 rewrite of the algorithm. I remember anticipating a case like this and dismissing it as unrealistic, but here we are :').

The tricky match is as follows:
```rust
match command {
    BaseCommand { field01: true, .. } => {}
    BaseCommand { field02: true, .. } => {}
    BaseCommand { field03: true, .. } => {}
    BaseCommand { field04: true, .. } => {}
    BaseCommand { field05: true, .. } => {}
    BaseCommand { field06: true, .. } => {}
    BaseCommand { field07: true, .. } => {}
    BaseCommand { field08: true, .. } => {}
    BaseCommand { field09: true, .. } => {}
    BaseCommand { field10: true, .. } => {}
    // ...20 more of the same

    _ => {}
}
```

To fix this, this PR formalizes a concept of "relevancy" (naming is hard) that was already used to decide what patterns to report. Now we track it for every row, which in wide matches like the above can drastically cut on the number of cases we explore. After this fix, the above match is checked with linear-many cases instead of exponentially-many.

Fixes rust-lang#118437

r? `@cjgillot`
@Nadrieril
Copy link
Member

Fix is up, if anyone can merge it before the beta cutoff: #118796

@Nadrieril
Copy link
Member

@apiraino the beta cutoff date is approaching, do you know the procedure in a case like this? #118796 is not trivial, it could be introducing new bugs. Would it make sense to wait for beta cutoff, merge #118796, and if no bug is revealed after a week or two we backport it?

@apiraino
Copy link
Contributor

@Nadrieril yeah I agree to first ensure #118796 is reviewed, then we can always backport it.

thanks for that patch!

@bors bors closed this as completed in 1a086e4 Dec 24, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-exhaustiveness-checking Relating to exhaustiveness / usefulness checking of patterns C-bug Category: This is a bug. I-compilemem Issue: Problems and improvements with respect to memory usage during compilation. P-high High priority regression-from-stable-to-nightly Performance or correctness regression from stable to nightly.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants