-
Notifications
You must be signed in to change notification settings - Fork 12.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Bug: a macro never hit recursion limit, cargo build
run... forever!
#51754
Comments
|
... I think I could just partially understand what you say! My impression is that the macro-expansion go I tried with |
cargo build
run... forever!
So, expansion does go depth first. What's blowing up is the argument list:
In order to emit a recursion error here without first blowing the heap, I guess it would have to produce the inner token stream lazily. I'm not sure how much this could impact performance?
Arrhhh! :) The old |
Oh, thank you very much! What is still remain as a question for me is that: why the memory usage increase so slow!? When run |
This is an interesting question, so I tried taking a look into it using amateur hour theatre: `perf` edition
3.3% cache misses sounds like maybe a lot for what is mostly a bunch of similar code running without allocating memory at a fast rate; but I'm not sure. The three hottest functions identified by a basic
What are the Aside: I see calls to a function called What's calling those three functions so often? Unfortunately, impl<'a> MatcherPos<'a> {
/// Add `m` as a named match for the `idx`-th metavar.
fn push_match(&mut self, idx: usize, m: NamedMatch) {
let matches = Rc::make_mut(&mut self.matches[idx]);
matches.push(m);
}
} which potentially clones a |
By the way, when I mentioned "the old
If your crate begins with
then it will do the tracing to stdout, even though |
One last thing: Here's a flamegraph generated from a stage 1 debug compiler. This is a non-optimized compiler since the optimized debug compiler was producing callstacks that provided zero context for any of the functions of intrest (the second item in the callstack was Flamegraph: perf.svg.gz It suggests the following:
What I do not know, however, is whether these are performance bugs that can be fixed, or if they are concessions that were made in favor of optimizing hotter codepaths. |
Thanks for the info! Maybe the current |
I'm going to create a separate issue for the lack of |
(I think this should be tagged with |
This seems like a perfect fit for immutable datastructures. Only cloning if something is actually modified, otherwise just keep adding stuff to the end |
Sounds like a good idea at least from a textbook perspective; though I wonder about cache locality, and whether it really is necessary to have all of these intermediate values hanging around with one less item in the first place. The multiple levels of recursively-called drops in the flame-graph indicate to me that refcounts drop to zero on a regular basis. (that said, I myself don't really understand what the method is doing or why it is written this way) |
I still have a perverse fascination with why there's so much wasted sharing here, but still don't really understand what cc @mark-i-m who added much of this documentation recently and probably has a much better grasp of what the code is doing. Any thoughts? |
@ExpHP I assume you mean this? rust/src/libsyntax/ext/tt/macro_parser.rs Line 171 in afaa406
This field contains token trees that match against metavariables. Consider the following example:
In this example Does that make sense? Regarding the inefficiencies of the code, it's been a while since I looked at it, so I'm not sure off the top of my head if/why such things might be necessary. I don't think the original authors made too much effort to make it efficient. Recently, @nnethercote when through and reduced the number of allocations in #50855. |
Thanks for responding! I think my greater confusion is with respect to when the The comments suggest[^1] there is sharing between the subparser state in a repetition and the top level state, yet it seems that everything in [^1] I say this due to the phrase
though the next paragraph seems to contradict it:
|
Ah, I see what you mean. I'm not 100% sure, but I think there are a two different reasons why that exists:
rust/src/libsyntax/ext/tt/macro_parser.rs Line 486 in afaa406
rust/src/libsyntax/ext/tt/macro_parser.rs Line 539 in afaa406
In particular, consider the following example. We want to match Both of these new matcher positions are added to rust/src/libsyntax/ext/tt/macro_parser.rs Line 511 in afaa406
and rust/src/libsyntax/ext/tt/macro_parser.rs Line 519 in afaa406
In the second case, we clone the rust/src/libsyntax/ext/tt/macro_parser.rs Lines 486 to 498 in afaa406
Does that make sense? |
Thanks @mark-i-m, that was a huge help! I have identified the problem. Furthermore, I now realize that this is a much broader performance issue than recursive macro calls (which is actually unrelated). For instance, it also affects the following: macro_rules! idents {
($($ident:ident)*) => { };
}
idents!{
a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
// ... 65536 copies of "a" in total ...
a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a
} My current understanding of the issue is as follows:
The reason:
With a small hack to drop ...hopefully, some working code compiles faster as well! |
This is both good and bad news! The bad news is that if I do something like this in the future, my system (Fedora) will completely not responding (I really miss Windows went it come to this situation). Maybe it's time to learn how to use When the memory usage increase drastically, maybe we should assume that there is something that is incorrect (then stop/report an error)? |
@limira I personally use the limits built into linux. People call me crazy for doing this, but having a guard rail is nice. (so thankfully, I was joking when I said the code snippet sent my hard drive thrashing!) I don't know if this is the smartest way to do it, but:
When I need to temporarily override it, I switch to root to issue a
The reason this saves your system from locking up is because it limits how much can be moved to swap (and in my (limited) experience it is heavy usage of swap that is usually responsible for the disk thrashing and the apparent system lockup when too much memory is used by a single application in linux) |
OOM killer is a feature of the Linux kernel, not the compiler. When the kernel can't allocate any more physical memory, it begins swapping (which is the reason your system becomes largely unresponsive). (Control-c or killing the rustc process some other way would help). If the kernel also runs out of swap space, it chooses some process to "OOM kill", usually the one with the most memory usage. I think so far the policy of rustc has been to allow the compiler's memory usage to be unbounded, though there are some efforts to detect infinite loops in const fn evaluation. |
@ExpHP: Thanks for the tip! I will try it. As in this comment (above), my buggy macro
I think this is better than just let the compiler blindly run into the |
Food for the bikeshed: The current |
A mistake like what I did here is rare, but it could happen to anyone. Before the fix by ExpHP, I wait for it more than 10 minutes (on my system), |
For a fuller description of the performance issue fixed by this: #51754 (comment)
Fix macro parser quadratic complexity in small repeating groups Observed in #51754, and more easily demonstrated with the following: ```rust macro_rules! stress { ($($t:tt)+) => { }; } fn main() { stress!{ a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a // ... 65536 copies of "a" total ... a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a } } ``` which takes 50 seconds to compile prior to the fix and <1s after. I hope this has a visible impact on the compile times for real code. (I think it is most likely to affect incremental TT munchers that deal with large inputs, though it depends on how they are written) For a fuller description of the performance issue: #51754 (comment) --- There is no test (yet) because I'm not sure how easily to measure this for regressions.
Note that, even prior to my PR, there are still other macros that experience exponential blowup: (do not compile this!) macro_rules! blowup {
// (the brackets work around the performance bug)
([$($t:tt)+]) => { blowup!{[$($t:tt)+]};
}
fn main() {
blowup!([a]);
} I started a thread on internals for discussion on possible ways the compiler could help with such rogue macros: https://internals.rust-lang.org/t/thought-saving-users-from-themselves-in-macros-that-explode/7907 |
We definitely need help from the compiler in these situation! Thanks for the push! |
Fix macro parser quadratic complexity in small repeating groups Observed in #51754, and more easily demonstrated with the following: ```rust macro_rules! stress { ($($t:tt)+) => { }; } fn main() { stress!{ a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a // ... 65536 copies of "a" total ... a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a a } } ``` which takes 50 seconds to compile prior to the fix and <1s after. I hope this has a visible impact on the compile times for real code. (I think it is most likely to affect incremental TT munchers that deal with large inputs, though it depends on how they are written) For a fuller description of the performance issue: #51754 (comment) --- There is no test (yet) because I'm not sure how easily to measure this for regressions.
I know what is my mistake,
but I think there is a bug in Rust!Edit: We just need a better help from the compiler to easily spot out mistake like this.
Edit: the problem seems just involve only the first rule. So the new minimal buggy code is:
cargo build
will run forever! (well, I just wait for it for about ten minutes!)The text was updated successfully, but these errors were encountered: