Macro expansion is quadratic in the number of let statements. #10607

Closed
huonw opened this issue Nov 22, 2013 · 13 comments · Fixed by #34570
Labels
I-compiletime Issue: Problems and improvements with respect to compile times.

Comments

@huonw
Member

huonw commented Nov 22, 2013

The following (bash) shell snippet prints `let _x: int; /* <number> */` 5000 times, which makes the compiler very unhappy.

$ echo 'fn main() { ' 'let _x: int; /*'{1..5000}'*/' '}'  |  rustc - -Z time-passes
time: 0.056 s   parsing
time: 0.001 s   gated feature checking
time: 0.002 s   std macros injection
time: 0.010 s   configuration 1
time: 9.157 s   expansion
[...]

(the total time is 10.3s.)

First few lines of perf report:

    32.15%     rustc  libsyntax-2bb2d559d93ae8f0-0.9-pre.so  [.] hash::Writer$SipState::write::hadb05778635375a2fhaT::v0.9$x2dpre
    22.41%     rustc  libsyntax-2bb2d559d93ae8f0-0.9-pre.so  [.] hashmap::HashMap::bucket_for_key_with_hash::h5b953adb9726637Rjab::v0.9$x2dpre
     9.02%     rustc  libsyntax-2bb2d559d93ae8f0-0.9-pre.so  [.] hash::Streaming$SipState::result_u64::h1e5a7083bc4c3f6bAga7::v0.9$x2dpre
     8.46%     rustc  libsyntax-2bb2d559d93ae8f0-0.9-pre.so  [.] hash::Hash$A::hash_keyed::h463e3d8e059f531Zha5::v0.9$x2dpre
     4.20%     rustc  libsyntax-2bb2d559d93ae8f0-0.9-pre.so  [.] ast_util::get_sctable::he0d4d721b1287a6aH::v0.9$x2dpre
     3.08%     rustc  libsyntax-2bb2d559d93ae8f0-0.9-pre.so  [.] hashmap::Map$HashMap::find::hf199eda9a92cb3f3CAa3::v0.9$x2dpre
     1.72%     rustc  libsyntax-2bb2d559d93ae8f0-0.9-pre.so  [.] tuple::inner::Eq$__extensions__::eq::h4384d9b129d29aa7Gjas::v0.9$x2dpre

i.e. ~70% of the time is spent on hashing/hashmap look-ups. (cc #10586, although that's not the full problem.)

@eddyb
Member

eddyb commented Nov 22, 2013

From my own tests, I can see 6 * N * N calls to hash::Hash$A::hash_keyed::anon::expr_fn::_9Cacah, where N is the number of let _x: int; statements in main.

4 * N * N from to_bytes::IterBytes$__extensions__::iter_bytes::anon::expr_fn::Uzadaf
2 * N * N from hash::Hash$A::hash_keyed::h463e3d8e059f5319Cac::v0.9$x2dpre

I have two callgrind dumps, one for N = 1000 and another for N = 500, if anyone else wants to take a look at them - I used kcachegrind myself, giving:

[kcachegrind screenshot]

@eddyb
Member

eddyb commented Nov 22, 2013

It seems the problem starts with N * N calls to:

ast_util::new_rename::h39e8665ceaa551c1aF::v0.9$x2dpre

originating from 3 * N calls to:

ext::expand::ast_fold$IdentRenamer::fold_ident::hf98b2d77f07645a0v1a9::v0.9$x2dpre

EDIT: IdentRenamer::fold_ident is O(N), and it's called approx. N times, which results in quadratic behavior.

@huonw
Member Author

huonw commented Nov 22, 2013

The relevant part of expand makes it clear what's going wrong: it's performing an O(number of idents (in scope?)) fold for every ident in the AST.

cc @jbclements, looks like hygiene comes at a cost and/or needs more caching.

@jbclements
Contributor

Yep, everything here makes sense. I'm not surprised that compilation time here is n^2 in the depth of 'let' nesting. It appears to me that compilation of programs that nest identifiers to a depth of 5000 is likely to take an extra ten seconds. There are additional forms of laziness that could be applied to address this problem, but I think that there are other more pressing issues. Agree/disagree?

@pnkfelix
Member

@jbclements More pressing issues with respect to efficiency of rustc?

Or in getting macro-expansion/hygiene fixed up and/or finalized?

My thinking: As long as the experts are confident that the slowdowns here can be addressed without changing the semantics of macro expansion, then I agree we don't need to fix this immediately. (Except if not fixing it is going to make it difficult to identify other sources of inefficiency within rustc, i.e. by making it difficult to construct similar micro-benchmark inputs to rustc...)

@jbclements
Contributor

I see nothing that would prevent (additional) laziness here, though it would be a bit of a headache; I think the most natural solution would be to introduce a "delayed-rename" AST node, so that you can be sure you don't forget to apply the rename everywhere. I take your micro-benchmark comment, though. In this case, I'm guessing that you wouldn't really want a local stack frame with 5K variables, though you might be checking to make sure that they all get optimized away. In that case, I'm guessing that 50 variables would work as well as 5000.

@paulstansifer
Contributor

(A bit of context here: the "laziness" is the faster-but-more-complex algorithm described in the classic paper on hygiene in Scheme: http://portal.acm.org/citation.cfm?id=173617.173620 )

@pnkfelix
Member

I was thinking of Clinger and Rees "Macros that Work", but it's been a while since I looked at either carefully; maybe Dybvig "Syntactic Abstraction in Scheme" is the better reference to use at this point.

My memory is that the differences between the papers are most relevant once you start having procedural macros. Rust's macros are still much like Scheme's syntax-rules, so I hypothesize that one could start from either document and get useful info for attacking this problem.

(If nothing else, at least these links are not behind a paywall.)

@edwardw
Contributor

edwardw commented Feb 28, 2014

The smoke test time has just been halved; see the comment on #12563. Most likely, it has to do with the improved mtwt::new_rename and/or mtwt::new_mark.

@edwardw
Contributor

edwardw commented Mar 1, 2014

Another 2x speedup by specializing the hasher for the mtwt lookup tables, for a total 4x speed gain so far. Still, this is just an implementation improvement, not an algorithmic one.

@eddyb
Member

eddyb commented Mar 1, 2014

It matters when the code in question spends minutes in hygiene, thanks @edwardw :D.

@steveklabnik
Member

Triage: don't think anything has happened with this in a while.

@huonw huonw changed the title Macro expansion is particularly slow on let statements. Macro expansion is quadratic in the number of let statements. Oct 12, 2015
@jseyfried
Contributor

I fixed this in #34570.

bors added a commit that referenced this issue Jul 14, 2016
Simplify the macro hygiene algorithm

This PR removes renaming from the hygiene algorithm and treats differently marked identifiers as unequal.

This change makes the scope of identifiers in `macro_rules!` items empty. That is, identifiers in `macro_rules!` definitions do not inherit any semantics from the `macro_rules!`'s scope.

Since `macro_rules!` macros are items, the scope of their identifiers "should" be the same as that of other items; in particular, the scope should contain only items. Since all items are unhygienic today, this would mean the scope should be empty.

However, the scope of an identifier in a `macro_rules!` statement today is the scope that the identifier would have if it replaced the `macro_rules!` (excluding anything unhygienic, i.e. locals only).

To continue to support this, this PR tracks the scope of each `macro_rules!` and uses it in `resolve` to ensure that an identifier expanded from a `macro_rules!` gets a chance to resolve to the locals in the `macro_rules!`'s scope.

This PR is a pure refactoring. After this PR,
 - `syntax::ext::expand` is much simpler.
 - We can expand macros in any order without causing problems for hygiene (needed for macro modularization).
 - We can deprecate or remove today's `macro_rules!` scope easily.
 - Expansion performance improves by 25%, post-expansion memory usage decreases by ~5%.
 - Expanding a block is no longer quadratic in the number of `let` statements (fixes #10607).

r? @nrc
bors added a commit that referenced this issue Jul 15, 2016
Simplify the macro hygiene algorithm

This PR removes renaming from the hygiene algorithm and treats differently marked identifiers as unequal.

This change makes the scope of identifiers in `macro_rules!` items empty. That is, identifiers in `macro_rules!` definitions do not inherit any semantics from the `macro_rules!`'s scope.

Since `macro_rules!` macros are items, the scope of their identifiers "should" be the same as that of other items; in particular, the scope should contain only items. Since all items are unhygienic today, this would mean the scope should be empty.

However, the scope of an identifier in a `macro_rules!` statement today is the scope that the identifier would have if it replaced the `macro_rules!` (excluding anything unhygienic, i.e. locals only).

To continue to support this, this PR tracks the scope of each `macro_rules!` and uses it in `resolve` to ensure that an identifier expanded from a `macro_rules!` gets a chance to resolve to the locals in the `macro_rules!`'s scope.

This PR is a pure refactoring. After this PR,
 - `syntax::ext::expand` is much simpler.
 - We can expand macros in any order without causing problems for hygiene (needed for macro modularization).
 - We can deprecate or remove today's `macro_rules!` scope easily.
 - Expansion performance improves by 25%, post-expansion memory usage decreases by ~5%.
 - Expanding a block is no longer quadratic in the number of `let` statements (fixes #10607).

r? @nrc
flip1995 pushed a commit to flip1995/rust that referenced this issue Jun 2, 2023
Add spans to `clippy.toml` error messages

Adds spans to errors and warnings encountered when parsing `clippy.toml`.

changelog: Errors and warnings generated when parsing `clippy.toml` now point to the location in the TOML file the error/warning occurred.