-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
large number of obligation produced in async code causing compiler stuck #103423
Comments
Probably also related to RPITIT, as we make a intensive use of zero-overhead async trait with GAT: impl<I: HummockIterator> MergeIteratorNext for OrderedMergeIteratorInner<I> {
type HummockResultFuture<'a> = impl Future<Output = HummockResult<()>> + 'a;
fn next_inner(&mut self) -> Self::HummockResultFuture<'_> {
self.next_inner_inner()
}
} |
I wonder if this was due to #100980. |
I’ll do a bisect tomorrow 🤣 |
It's probably possible to just revert 43119d6 and check if that makes the program no longer slow to compile. |
Yeah no prob, I'll have a try. |
By "worked perfectly", you mean the slowness is gone? In that case, this issue probably needs minimization and I can then look at why the fix in #100980 is incomplete. |
Yes. It compiles within 2 minutes. Previously it seems never end, I wait at least 5 minutes before killing cargo.
Sure, I'll try minimizing it. From the PR description it sounds like a generator issue? I'll try combining futures_async_stream with some complex types, which should reflect how our code work. |
still trying to figure it out -- I've written some programs to generate 5k+ obligations, but it doesn't time out as the original code does. Anyway, still trying to make a MCVE. |
Thanks so much for looking into this. I might also investigate, but I'm quite busy 😓 |
In the original code, the compiler resolves 50 obligations very slowly at each call side, about 1sec per call site. So I guess it's more related to the obligations itself instead of the numbers. |
No prob, just wait until I got a minimum reproducible code, and it would be more efficient for you to look into 🤣 |
WG-prioritization assigning priority (Zulip discussion). @rustbot label -I-prioritize +P-high |
Should we revert before the beta cutoff and try again? The diff isn't very complex after all |
Though might be fixed by #103509, I'll probably still look into this and make a mcve based on the reverted commit. |
It would be very useful if you created an MCVE that demonstrates the regression, for a future attempt to fix it. Thanks @skyzh. |
This took me days. I originally thought the root cause lies in one module of our system, and it turned out that a combination of multiple modules are causing this issue. MCVE here: #![feature(generators)]
#![feature(trait_alias)]
#![feature(type_alias_impl_trait)]
use std::{borrow::Cow, convert::Infallible};
use futures::{executor::block_on, pin_mut, Future, Stream, StreamExt};
use futures_async_stream::stream;
pub trait AsyncIterator: 'static {
type NextFuture<'a>: Future<Output = Result<(), Infallible>> + 'a
where
Self: 'a;
fn next(&mut self) -> Self::NextFuture<'_>;
}
pub struct GoodIterator;
impl AsyncIterator for GoodIterator {
type NextFuture<'a> = impl Future<Output = Result<(), Infallible>> + 'a;
fn next(&mut self) -> Self::NextFuture<'_> {
async move { futures::future::pending().await }
}
}
pub struct BadIterator<const T: usize, I: AsyncIterator> {
children: Vec<I>,
}
impl<const T: usize, I: AsyncIterator> BadIterator<T, I> {
pub fn new(children: Vec<I>) -> Self {
Self { children }
}
}
impl<const T: usize, I: AsyncIterator> AsyncIterator for BadIterator<T, I> {
type NextFuture<'a> = impl Future<Output = Result<(), Infallible>> + 'a;
fn next(&mut self) -> Self::NextFuture<'_> {
async move {
for i in self.children.iter_mut() {
i.next().await?;
}
Ok(())
}
}
}
pub enum IteratorUnion<I1: AsyncIterator, I2: AsyncIterator, I3: AsyncIterator, I4: AsyncIterator> {
I1(I1),
I2(I2),
I3(I3),
I4(I4),
}
impl<I1: AsyncIterator, I2: AsyncIterator, I3: AsyncIterator, I4: AsyncIterator> AsyncIterator
for IteratorUnion<I1, I2, I3, I4>
{
type NextFuture<'a> = impl Future<Output = Result<(), Infallible>> + 'a;
fn next(&mut self) -> Self::NextFuture<'_> {
async move {
match self {
Self::I1(x) => x.next().await,
Self::I2(x) => x.next().await,
Self::I3(x) => x.next().await,
Self::I4(x) => x.next().await,
}
}
}
}
fn create_iterator<const D: usize>() -> impl AsyncIterator {
let iter1 = IteratorUnion::I1(BadIterator::<D, _>::new(vec![GoodIterator, GoodIterator]));
let iter2 = IteratorUnion::I2(BadIterator::<D, _>::new(vec![GoodIterator, GoodIterator]));
let iter3 = IteratorUnion::I3(BadIterator::<D, _>::new(vec![GoodIterator, GoodIterator]));
let iter4 = IteratorUnion::I4(BadIterator::<D, _>::new(vec![GoodIterator, GoodIterator]));
BadIterator::<233, _>::new(vec![iter1, iter2, iter3, iter4])
}
fn create_iterator_2<const D: usize>() -> impl AsyncIterator {
let iter1 = IteratorUnion::I1(create_iterator::<1>());
let iter2 = IteratorUnion::I2(create_iterator::<2>());
let iter3 = IteratorUnion::I3(create_iterator::<3>());
let iter4 = IteratorUnion::I4(create_iterator::<4>());
BadIterator::<D, _>::new(vec![iter1, iter2, iter3, iter4])
}
fn create_iterator_3<const D: usize>() -> impl AsyncIterator {
let iter1 = IteratorUnion::I1(create_iterator_2::<1>());
let iter2 = IteratorUnion::I2(create_iterator_2::<2>());
let iter3 = IteratorUnion::I3(create_iterator_2::<3>());
let iter4 = IteratorUnion::I4(create_iterator_2::<4>());
BadIterator::<D, _>::new(vec![iter1, iter2, iter3, iter4])
}
fn create_iterator_4<const D: usize>() -> impl AsyncIterator {
let iter1 = IteratorUnion::I1(create_iterator_3::<1>());
let iter2 = IteratorUnion::I2(create_iterator_3::<2>());
let iter3 = IteratorUnion::I3(create_iterator_3::<3>());
let iter4 = IteratorUnion::I4(create_iterator_3::<4>());
BadIterator::<D, _>::new(vec![iter1, iter2, iter3, iter4])
}
fn create_iterator_5<const D: usize>() -> impl AsyncIterator {
let iter1 = IteratorUnion::I1(create_iterator_4::<1>());
let iter2 = IteratorUnion::I2(create_iterator_4::<2>());
let iter3 = IteratorUnion::I3(create_iterator_4::<3>());
let iter4 = IteratorUnion::I4(create_iterator_4::<4>());
BadIterator::<D, _>::new(vec![iter1, iter2, iter3, iter4])
}
type MyFavoriteIterator = impl AsyncIterator;
fn create_iterator_6(x: usize) -> MyFavoriteIterator {
create_iterator_4::<23333>()
}
#[stream(item = (Cow<'a, ()>, Cow<'a, ()>))]
async fn next<'a>(mut i: impl AsyncIterator) {
loop {
match i.next().await {
Ok(_) => {}
Err(_) => {}
}
yield (Cow::Owned(()), Cow::Owned(()));
}
}
#[stream(item = (Cow<'a, ()>, Cow<'a, ()>))]
async fn next2<'a, S1, S2>( s1: S1, s2: S2)
where
S1: Stream<Item = (Cow<'a, ()>, Cow<'a, ()>)> + 'a,
S2: Stream<Item = (Cow<'a, ()>, Cow<'a, ()>)>+ 'a,
{
pin_mut!(s1);
pin_mut!(s2);
s1.next().await.unwrap();
s2.next().await.unwrap();
}
macro_rules! repeat {
($x:expr) => ();
($x:expr, $($y:expr),+) => {
block_on(async {
let iter1 = create_iterator_6(1);
let iter1 = next(iter1);
let iter2 = create_iterator_6(1);
let iter2 = next(iter2);
let iter = next2(iter1, iter2);
pin_mut!(iter);
while let Some(_) = iter.next().await {}
});
repeat!($($y),+);
};
}
fn main() {
repeat!(
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1, 1, 1, 1, 1
);
} Basically, we are doing 3 things here:
On 758f196 (some commit on main), compiler will stuck. On nightly-2022-07-29 and #103509, this works perfectly, Interestingly, without
None of them will stuck. Therefore, I guess there's problem with |
will ICE on 758f196 (some commit on main) |
Cargo.toml [package]
name = "xxx"
version = "0.1.0"
edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies]
futures = "0.3"
futures-async-stream = { git = "https://github.com/taiki-e/futures-async-stream" } |
I found a more simplified MCVE when workaround this bug in the codebase, without futures-async-stream. Trying to construct it... |
@skyzh do you have a branch of the full risingwave project that I can test the regression on? |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Okay, I probably did the compile in a wrong way. Testing again, sorry for that 🙈 |
Here's what I did:
Unluckily, https://github.com/compiler-errors/rust/tree/issue-103423 doesn't seem to fix the issue. The rustc process keeps spinning at 100% CPU and seems never to complete when compiling |
The branch for testing -> https://github.com/risingwavelabs/risingwave/tree/skyzh/revert-workrounds |
…-r-hard, r=oli-obk Revert "Normalize opaques with escaping bound vars" This caused a perf regression in rust-lang#103423, cc `@skyzh` this should fix rust-lang#103423. reverts rust-lang#100980 r? `@oli-obk`
…th-late-bound-vars-again, r=jackh726 Normalize opaques with late-bound vars again We have a hack in the compiler where if an opaque has escaping late-bound vars, we skip revealing it even though we *could* reveal it from a technical perspective. First of all, this is weird, since we really should be revealing all opaques in `Reveal::All` mode. Second of all, it causes subtle bugs (linked below). I attempted to fix this in rust-lang#100980, which was unfortunately reverted due to perf regressions on codebases that used really deeply nested futures in some interesting ways. The worst of which was rust-lang#103423, which caused the project to hang on build. Another one was rust-lang#104842, which was just a slow-down, but not a hang. I took some time afterwards to investigate how to rework `normalize_erasing_regions` to take advantage of better caching, but that effort kinda fizzled out (rust-lang#104133). However, recently, I was made aware of more bugs whose root cause is not revealing opaques during codegen. That made me want to fix this again -- in the process, interestingly, I took the the minimized example from rust-lang#103423 (comment), and it doesn't seem to hang any more... Thinking about this harder, there have been some changes to the way we lower and typecheck async futures that may have reduced the pathologically large number of outlives obligations (see description of rust-lang#103423) that we were encountering when normalizing opaques with bound vars the last time around: * rust-lang#104321 (lower `async { .. }` directly as a generator that implements `Future`, removing the `from_generator` shim) * rust-lang#104833 (removing an `identity_future` fn that was wrapping desugared future generators) ... so given that I can see: * No significant regression on rust perf bot (rust-lang#107620 (comment)) * No timeouts in crater run I did (rust-lang#107620 (comment), rechecked failing crates in rust-lang#107620 (comment)) ... and given that this PR: * Fixes rust-lang#104601 * Fixes rust-lang#107557 * Fixes rust-lang#109464 * Allows us to remove a `DefiningAnchor::Bubble` from codegen (75a8f68) I'm inclined to give this another shot at landing this. Best case, it just works -- worst case, we get more examples to study how we need to improve the compiler to make this work. r? types
…ound-vars-again, r=jackh726 Normalize opaques with late-bound vars again We have a hack in the compiler where if an opaque has escaping late-bound vars, we skip revealing it even though we *could* reveal it from a technical perspective. First of all, this is weird, since we really should be revealing all opaques in `Reveal::All` mode. Second of all, it causes subtle bugs (linked below). I attempted to fix this in #100980, which was unfortunately reverted due to perf regressions on codebases that used really deeply nested futures in some interesting ways. The worst of which was #103423, which caused the project to hang on build. Another one was #104842, which was just a slow-down, but not a hang. I took some time afterwards to investigate how to rework `normalize_erasing_regions` to take advantage of better caching, but that effort kinda fizzled out (#104133). However, recently, I was made aware of more bugs whose root cause is not revealing opaques during codegen. That made me want to fix this again -- in the process, interestingly, I took the the minimized example from rust-lang/rust#103423 (comment), and it doesn't seem to hang any more... Thinking about this harder, there have been some changes to the way we lower and typecheck async futures that may have reduced the pathologically large number of outlives obligations (see description of #103423) that we were encountering when normalizing opaques with bound vars the last time around: * #104321 (lower `async { .. }` directly as a generator that implements `Future`, removing the `from_generator` shim) * #104833 (removing an `identity_future` fn that was wrapping desugared future generators) ... so given that I can see: * No significant regression on rust perf bot (rust-lang/rust#107620 (comment)) * No timeouts in crater run I did (rust-lang/rust#107620 (comment), rechecked failing crates in rust-lang/rust#107620 (comment)) ... and given that this PR: * Fixes #104601 * Fixes #107557 * Fixes #109464 * Allows us to remove a `DefiningAnchor::Bubble` from codegen (75a8f681837c70051e0200a14f58ae07dbe58e66) I'm inclined to give this another shot at landing this. Best case, it just works -- worst case, we get more examples to study how we need to improve the compiler to make this work. r? types
…ound-vars-again, r=jackh726 Normalize opaques with late-bound vars again We have a hack in the compiler where if an opaque has escaping late-bound vars, we skip revealing it even though we *could* reveal it from a technical perspective. First of all, this is weird, since we really should be revealing all opaques in `Reveal::All` mode. Second of all, it causes subtle bugs (linked below). I attempted to fix this in #100980, which was unfortunately reverted due to perf regressions on codebases that used really deeply nested futures in some interesting ways. The worst of which was #103423, which caused the project to hang on build. Another one was #104842, which was just a slow-down, but not a hang. I took some time afterwards to investigate how to rework `normalize_erasing_regions` to take advantage of better caching, but that effort kinda fizzled out (#104133). However, recently, I was made aware of more bugs whose root cause is not revealing opaques during codegen. That made me want to fix this again -- in the process, interestingly, I took the the minimized example from rust-lang/rust#103423 (comment), and it doesn't seem to hang any more... Thinking about this harder, there have been some changes to the way we lower and typecheck async futures that may have reduced the pathologically large number of outlives obligations (see description of #103423) that we were encountering when normalizing opaques with bound vars the last time around: * #104321 (lower `async { .. }` directly as a generator that implements `Future`, removing the `from_generator` shim) * #104833 (removing an `identity_future` fn that was wrapping desugared future generators) ... so given that I can see: * No significant regression on rust perf bot (rust-lang/rust#107620 (comment)) * No timeouts in crater run I did (rust-lang/rust#107620 (comment), rechecked failing crates in rust-lang/rust#107620 (comment)) ... and given that this PR: * Fixes #104601 * Fixes #107557 * Fixes #109464 * Allows us to remove a `DefiningAnchor::Bubble` from codegen (75a8f681837c70051e0200a14f58ae07dbe58e66) I'm inclined to give this another shot at landing this. Best case, it just works -- worst case, we get more examples to study how we need to improve the compiler to make this work. r? types
…ound-vars-again, r=jackh726 Normalize opaques with late-bound vars again We have a hack in the compiler where if an opaque has escaping late-bound vars, we skip revealing it even though we *could* reveal it from a technical perspective. First of all, this is weird, since we really should be revealing all opaques in `Reveal::All` mode. Second of all, it causes subtle bugs (linked below). I attempted to fix this in #100980, which was unfortunately reverted due to perf regressions on codebases that used really deeply nested futures in some interesting ways. The worst of which was #103423, which caused the project to hang on build. Another one was #104842, which was just a slow-down, but not a hang. I took some time afterwards to investigate how to rework `normalize_erasing_regions` to take advantage of better caching, but that effort kinda fizzled out (#104133). However, recently, I was made aware of more bugs whose root cause is not revealing opaques during codegen. That made me want to fix this again -- in the process, interestingly, I took the the minimized example from rust-lang/rust#103423 (comment), and it doesn't seem to hang any more... Thinking about this harder, there have been some changes to the way we lower and typecheck async futures that may have reduced the pathologically large number of outlives obligations (see description of #103423) that we were encountering when normalizing opaques with bound vars the last time around: * #104321 (lower `async { .. }` directly as a generator that implements `Future`, removing the `from_generator` shim) * #104833 (removing an `identity_future` fn that was wrapping desugared future generators) ... so given that I can see: * No significant regression on rust perf bot (rust-lang/rust#107620 (comment)) * No timeouts in crater run I did (rust-lang/rust#107620 (comment), rechecked failing crates in rust-lang/rust#107620 (comment)) ... and given that this PR: * Fixes #104601 * Fixes #107557 * Fixes #109464 * Allows us to remove a `DefiningAnchor::Bubble` from codegen (75a8f681837c70051e0200a14f58ae07dbe58e66) I'm inclined to give this another shot at landing this. Best case, it just works -- worst case, we get more examples to study how we need to improve the compiler to make this work. r? types
Code
I tried this code:
risingwavelabs/risingwave#5119
Sorry for not having a very good small reproduce of this issue. Basically we found our code will stuck when compiling after upgrading compiler to nightly-2022-10-16. I suspect this is related to a not-so-recent change of trait normalization -> #102651.
I modified
normalize.rs
in the Rust compiler on the latest master and find that some part of code is producing a large number of obligations. The highest one hits 4052 obligations.Digging it into a little bit, we found that the root cause is the
OrderedMergeIteratorInner
+UnorderedMergeIteratorInner
, which produces ~50 obligations. And after some static dispatching with generic parameters, it exploded to 4052. For example, when we define something likeIteratorUnion<IteratorType1, IteratorType2, IteratorType3, IteratorType4>
, we will find about 4x50 obligations at the call site. If the code is wrapped in yet another enum that specializes some generics to two concrete types, it doubles again.Anyway, one interesting observation that can workaround the bug is here:
https://github.com/risingwavelabs/risingwave/blob/a0b2a119d6b5843bde33f9937b0f1bed3e80e0c8/src/storage/src/hummock/iterator/merge_inner.rs#L306-L308
https://github.com/risingwavelabs/risingwave/blob/a0b2a119d6b5843bde33f9937b0f1bed3e80e0c8/src/storage/src/hummock/iterator/merge_inner.rs#L265-L267
Once we erase the concrete future type and replace it with BoxFuture, the compiler no longer stuck and there won't be a large number of obligations.
Interestingly, any of the below changes will cause obligations to increase again:
moving the function call into the match will increase number of obligations
the original code before bumping toolchain
When the compiler stuck, it is resolving obligations like follows, which might take ~1sec at each call site that awaits a future including a MergeIterator.
Version it worked on
nightly-07-29
Version with regression
Backtrace
Backtrace
The text was updated successfully, but these errors were encountered: