Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Prevent stack overflow in two-way bindings. #17073

Merged
merged 2 commits into from
Sep 25, 2024

Conversation

grokys
Copy link
Member

@grokys grokys commented Sep 20, 2024

What does the pull request do?

Fixes a stack overflow when the target of a two-way binding alters the value of the binding during the change notification.

This is an alternative (simpler) fix than #16819

How was the solution implemented (if it's not obvious)?

When a change is detected on the target object during the target -> source part of syncing the binding, the value must be read from the target object instead of using the value in the event because the value may have changed again between the time the event was raised and when we get the notification.

Fixes #16746

The value must be read from the target object instead of using the value from the event because the value may have changed again between the time the event was raised and now: if that occurs in a two-way binding then we end up with a stack overflow.
@maxkatz6 maxkatz6 added bug backport-candidate-11.1.x Consider this PR for backporting to 11.1 branch labels Sep 20, 2024
@timunie timunie added customer-priority Issue reported by a customer with a support agreement. area-bindings labels Sep 20, 2024
@avaloniaui-bot
Copy link

You can test this PR using the following package version. 11.2.999-cibuild0051979-alpha. (feed url: https://nuget-feed-all.avaloniaui.net/v3/index.json) [PRBUILDID]

@busitech
Copy link

busitech commented Sep 21, 2024

After applying this patch, the obsolete value is still being sent out twice by AvaloniaObject lines 781-782. I could not think of any reasons this would be good to permit.

When the user code triggered by the first execution of line 777 changes the value for the second time (OnPropertyChanged), this same block of code executes again in its entirety while this stack frame sits still. The second pass has sent NPC events that are the first to be received, using the latest value (in e.NewValue) by the time execution moves off of the first execution of line 777. At that point, line 781 is running for the second time, as a result of the first change. The following NPC events are "out of sync", duplicative / redundant, and have been superseded by those already sent.

777         OnPropertyChangedCore(e);    <--- Avalonia property is changed again during this call

            if (isEffectiveValue)
            {
781             property.NotifyChanged(e);            <--- e.NewValue is obsolete
782             _propertyChanged?.Invoke(this, e);    <--- e.NewValue is obsolete
783             _inpcChanged?.Invoke(this, new PropertyChangedEventArgs(property.Name));
            }

Here is the test we wrote to detect this. An easy way to cause this test to be true is to simply add a MaskedTextBox to the UI and bind it to an empty string. This block will execute during initialization of the UI:

if (!EqualityComparer<T>.Equals(GetValue(property), e.NewValue.GetValueOrDefault()))
{
    Console.WriteLine("[AvaloniaObject] Value: {0} obsolete and already replaced by: {1}", e.NewValue.GetValueOrDefault(), GetValue(property));
    return; // value changed again in the event handler. e is now obsolete, why send it?
}

One of my thoughts is that any application containing logic that handles these "out of sync and out of sequence" (with respect to the most current value, and with respect to the order in which the new values were actually set) events sent by lines 781 and 782, and also relies on e.OldValue or e.NewValue in relation to each other, or in relation to the most current value of the property, will experience horrible bugs that are nearly impossible to explain.

Having the old value sent last and the most current value sent first at lines 781 and 782, with the old value seen first and the new value seen second in OnPropertyChanged, seems like a programmer's nightmare.

I am interested to know your thoughts about this.

@grokys
Copy link
Member Author

grokys commented Sep 25, 2024

@busitech I'm afraid this problem is just inherent to raising events with state in them. For example consider the following example:

var b = new Border();
b.PropertyChanged += Listener1;
b.PropertyChanged += Listener2;
b.Opacity = 2;

private void Listener1(object? sender, AvaloniaPropertyChangedEventArgs e)
{
    if (e.Property == Border.OpacityProperty)
    {
        if (e.GetNewValue<double>() == 2)
        {
            ((Border)sender!).Opacity = 3;
        }
    }
}

private void Listener2(object? sender, AvaloniaPropertyChangedEventArgs e)
{
    if (e.Property == Border.OpacityProperty)
    {
        System.Diagnostics.Debug.WriteLine("Opacity change to " + e.NewValue);
    }
}

This gives:

Opacity change to 3
Opacity change to 2

This sample only uses PropertyChanged and the out-of-order problem still occurs.

AvaloniaObject has 4 different types of notifications it sends when a property value changes:

  1. OnPropertyChanged
  2. PropertyChanged
  3. AvaloniaProperty.Changed
  4. INotifyPropertyChanged

But conceptually, these should all be thought of as the same event. For example 1, 3 and 4 could all be implemented in response to adding event handlers to the PropertyChanged event (2).

The above code snippet proves that the problem can occur within in a single event mechanism so we shouldn't be hacking a solution into the other mechanisms: that would just increase the likelihood of confusing things happening.

In this particular case, I think @MrJul is going to fix MaskedTextBox to try to avoid this problem in that control.

@MrJul MrJul added this pull request to the merge queue Sep 25, 2024
Merged via the queue into master with commit 6b78783 Sep 25, 2024
12 checks passed
@MrJul MrJul deleted the fixes/16746-binding-stackoverflow branch September 25, 2024 15:50
@busitech
Copy link

busitech commented Sep 26, 2024

Hi @grokys, thank you reading my comment, and reproducing the issue. I find it an interesting problem, and really good to be aware of, but not one we are necessarily stuck with. An event mechanism that queues dispatched events and sends them to listeners sequentially, where all listeners will have heard the current event before moving on to the next would not be prone to the issue, cf. Apple Grand Central Dispatch.

The problem is inherent to using the same thread to deliver events as was used to dispatch them, and doing so immediately. Put another way, the problem is inherent to the absence of a supervised event processing queue that is non-blocking and defers dispatch to the control of the supervisor.

A single-threaded, unsupervised design:

  • sends every event fired to (at least) the first listener immediately
  • allows that listener to block delivery to remaining listeners, and
  • dispatch a subsequent event (even the same event), thus imposing a higher priority (earlier in sequence).

If the new event is:

  • the same event (recursion, unusual), this can happen to any stack depth. After being released, the "hung" events are unwound in reverse order, LIFO (last in, first out), when what is generally expected is FIFO event sequence (first in, first out).
  • a different event, event sequence is still affected when event handlers fire other events conditionally, which when those are fired, will cause delay for all listeners of prior events, and immediate processing of newer events.

One more wrinkle is that the listener who has the opportunity to make the first interruption and fire additional events is simply the one who happens to receive the first dispatch. If dispatcher call sequence is tied to the order of subscription, even this can change over the lifespan of the application if the same listener were to be removed and added again. If the sequence of dispatch would ever change due to a framework change, or become indeterminate (random iteration), application functionality would change also.

I agree that the group of four notifications coming out of AvaloniaObject represent a single event. They should be atomic, such that either all occur, or none occur, but they should also be internally consistent, rather than split apart or interleaved by conflicting values. With four variants for one event, it seems to exacerbate the issue, but you're right, it's the same no matter how many types of events there are.

@busitech
Copy link

@grokys Do you see any merit to the third and final segment of our PR, which gives the choice to the UI control developer about when to propagate a value back to the source, and when to suppress the write-back, such as if updating the displayed text in a text box which should not be committed? Adding this parameter to AvaloniaObject.SetCurrentValue to allow this option caused the majority of LoC changes in our PR.

MaskedTextBox needs to be able to write an empty mask to the screen without changing the source, and unless I missed something, @MrJul will be faced with the inability to do so practically, as I was. The Text property binding will always fire when the mask is displayed, and the empty string or null originally bound to the control from the source will always be immediately replaced, without user input, as if the user was the cause.

grokys added a commit that referenced this pull request Oct 8, 2024
* Add failing test for #16746

* Always read the value from the target object.

The value must be read from the target object instead of using the value from the event because the value may have changed again between the time the event was raised and now: if that occurs in a two-way binding then we end up with a stack overflow.
@grokys grokys added backported-11.1.x and removed backport-candidate-11.1.x Consider this PR for backporting to 11.1 branch labels Oct 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-bindings backported-11.1.x bug customer-priority Issue reported by a customer with a support agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

stack overflow in binding system
6 participants