Skip to content

Conversation

@egil
Copy link
Member

@egil egil commented Jun 29, 2023

The second round of changes that should prevent stack overflow reported in #1064.

I will attempt to push a preview release directly from this branch so we can test it before going further.

PR meta checklist

  • Pull request is targeted at main branch for code
    or targeted at stable branch for documentation that is live on bunit.dev.
  • Pull request is linked to all related issues, if any.
  • I have read the CONTRIBUTING.md document.

Code PR specific checklist

  • My code follows the code style of this project and AspNetCore coding guidelines.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.
  • I have updated the appropriate sub section in the CHANGELOG.md.
  • I have added, updated or removed tests to according to my changes.
    • All tests passed.

@egil egil requested a review from linkdotnet June 29, 2023 22:27
@egil egil linked an issue Jun 30, 2023 that may be closed by this pull request
@egil egil force-pushed the protecting-the-render-tree branch from f0e1dfe to e5ac99b Compare June 30, 2023 17:14
Copy link
Collaborator

@linkdotnet linkdotnet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall I have mixed feelings about the PR. It adds quite some complexity and coupling to the code base. This is fine in theory, but I am not sure if we just tackle the symptom or the root cause of the stack overflow - I am leaning much toward the "symptom" site.

Don't get me wrong, I do think that there might be a bug in our code base causing this - given that the user sees no problem when running his code in the browser.

The circumstances of the bug make it awful hard to debug. Just my two cents

@egil
Copy link
Member Author

egil commented Jul 1, 2023

Overall I have mixed feelings about the PR. It adds quite some complexity and coupling to the code base. This is fine in theory, but I am not sure if we just tackle the symptom or the root cause of the stack overflow - I am leaning much toward the "symptom" site.

Don't get me wrong, I do think that there might be a bug in our code base causing this - given that the user sees no problem when running his code in the browser.

The circumstances of the bug make it awful hard to debug. Just my two cents

I'm pretty certain there is a bug. The stack traces that have been shared with us show the bug in two places, 1. when we traverse through the render tree frames of each rendered component from UpdateDisplay and 2. in RenderEvent when deciding if a rendered component or its children have changed during a render.

After fixing it in the first place, the second showed up, and the stack overflow visible from the stack trace pointed at the second position.

My assumption has always been that calls to GetCurrentRenderTreeFrames(componentId) would not return render frames that point to disposed components. I assumed they would be removed before UpdateDisplayAsync is called, but it does seem this is not the case. This is why the UpdateDisplayAsync.LoadChangesIntoRenderEvent is way more complex than previously, since it needs to check whether a component referenced in a parent component's render frames is disposed or not, before proceeding. After adding the checks, the stack overflows have stopped.

The code we had previously did something like this and that overflowed. I do play to ask the Blazor team to clarify my assumptions:

private void CanStackOverflow(int componentId)
{
  var frames = GetCurrentRenderTreeFrames(componentId);  
  for (var i = 0; i < frames.Count; i++)
  {
    ref var frame = ref frames.Array[i];
    if (frame.FrameType == RenderTreeFrameType.Component)
    {
      CanStackOverflow(frame.ComponentId);
    }
  }
}

I agree this is not the prettiest code and I am open to any and all improvements you can suggest. The requirement, at least for the v1 code, is that the RenderEvent type needs to be able to answer whether or not a RenderedCompnent has been disposed or needs to have its Markup re-created. And it should return the render tree frames for each component and child component associated with the RenderedComponent (used by Htmlizer) when the markup is recreated.

Pinged the Blazor team and asked for help understanding the render tree frames: dotnet/aspnetcore#49138

@egil
Copy link
Member Author

egil commented Jul 1, 2023

Btw. I want to rewrite a bunch of this logic in V2, suspect we can drop having RenderEvent entirely, so a lot of the complexity also comes out of being none breaking from an API point of view.

@egil
Copy link
Member Author

egil commented Jul 1, 2023

This is fine in theory, but I am not sure if we just tackle the symptom or the root cause of the stack overflow - I am leaning much toward the "symptom" site.

I am very open to alternatives. Any good idea what other things that could cause this?

@linkdotnet
Copy link
Collaborator

This is fine in theory, but I am not sure if we just tackle the symptom or the root cause of the stack overflow - I am leaning much toward the "symptom" site.

I am very open to alternatives. Any good idea what other things that could cause this?

I dug in the code yesterday for an hour - but I am not really smarter. It might be about shared state between test runs. Without any real and reproducible code, it is very hard to pinpoint the issue.
That said, I also have no strong idea what's going on and I am also fine if we merge the PR.

@egil
Copy link
Member Author

egil commented Jul 2, 2023

Yeah, and it was/is something that was sensitive to him enabling logging.

I am fairly certain that part of this is due to renders happening after the TestContext/TestRenderer's Dispose method has been called, which causes the render tree frames to enter into an invalid state.

We did have a test in the deterministic render branch related to disposing components that were causing issues too, as far as I remember, so there may be a fix in this branch that fixes that.

@linkdotnet
Copy link
Collaborator

linkdotnet commented Jul 2, 2023

Yeah, and it was/is something that was sensitive to him enabling logging.

I am fairly certain that part of this is due to renders happening after the TestContext/TestRenderer's Dispose method has been called, which causes the render tree frames to enter into an invalid state.

We did have a test in the deterministic render branch related to disposing components that were causing issues too, as far as I remember, so there may be a fix in this branch that fixes that.

Fair point - I did have the same picture where I removed all GetAwaiter().GetResult() calls in the renderer and related code pieces and probably something more.

What for sure changed is how we implemented DisposeComponents on this branch (using RemoveRootComponen or so). But I am pretty certain this is not the problematic function for the user.

EDIT: Hopefully :D

@egil
Copy link
Member Author

egil commented Jul 2, 2023

I do want to wait a bit and give the blazor team a chance to respond to my question. It may confirm my assumptions or force us to look for other solutions/causes.

@egil egil force-pushed the protecting-the-render-tree branch from c93e1c8 to 9c5e872 Compare July 3, 2023 08:01
@egil egil force-pushed the protecting-the-render-tree branch 3 times, most recently from b342aa2 to 02f5909 Compare July 11, 2023 16:20
@egil egil force-pushed the protecting-the-render-tree branch from 02f5909 to db411c3 Compare July 11, 2023 16:21
@egil
Copy link
Member Author

egil commented Jul 11, 2023

With Steve's feedback in mind, I still think this is a decent solution. Give it a look over again and let me know what your thoughts are.

@egil egil requested a review from linkdotnet July 11, 2023 16:22
@egil egil merged commit 449309e into main Jul 11, 2023
@egil egil deleted the protecting-the-render-tree branch July 11, 2023 19:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1.19.14 dotnet test crashes when running multiple tests

3 participants