Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Attempting to compile unicode-general-category #181

Open
HollayHorvath opened this issue Nov 1, 2021 · 2 comments
Open

Attempting to compile unicode-general-category #181

HollayHorvath opened this issue Nov 1, 2021 · 2 comments

Comments

@HollayHorvath
Copy link

Hey folks,

I have tried to compile the unicode-general-category crate with polonius, but I wasn't able to do so in an acceptable timeframe (I killed the process after 1h, it was using around 40Gb of ram).

I do know that polonius is under development and this issue will not be solved for a while, but since the compile times are so different (2.5s vs at over 1h/unknown) I felt it justifies to file an issue about this.

@ecstatic-morse
Copy link

ecstatic-morse commented Nov 4, 2021

Thanks @HollayHorvath.

unicode-general-category generates a constant initializer (CATEGORY_BLOCKS) which contains a single basic block that initializes many thousands of locals one after another. Our current implementation of move/init analysis generates a single fact for each initialized variable at each CFG node (MIR statement). ~100k nodes * ~10k locals = ~1G facts.

This is a known issue, and I have a solution in mind. However, it will be difficult to be competitive with Rust's implementation due to limitations in the Datalog engines we're using (Soufflé and Datafrog).

@ecstatic-morse
Copy link

ecstatic-morse commented Nov 5, 2021

I described a pure datalog solution to this type of problem in #182, although I'm not sure whether we'll actually go that way. It might be better to integrate rustcs dataflow library into Polonius, perhaps by allowing constraints that are backed by dataflow cursors. That said, I think we might want to explore basic-block aware variants of core borrowck relations (subset, origin_contains_loan_at, etc.), and doing it for move/inits is a good first step.

Note that, in the Naive and Optimized variants, we would still need liveness data for each variable at all points, which has the same problems as initializedness. Luckily, the LocationInsensitive invariant should be able to rule out enough potential borrow errors that it won't be a problem for many examples, including this one. Pathological cases will still exist, though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants