Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix first assignment to _min/max_value in Curve bypassing min < max check #78931

Closed
wants to merge 1 commit into from

Conversation

anvilfolk
Copy link
Contributor

@anvilfolk anvilfolk commented Jul 2, 2023

Currently, the very first time one assigns to either _min_value or _max_value in Curve, it bypasses the min < max condition:

image

This PR implements "bulk load" for curve limits by adding a "fake" property called _limits, which loads an array containing all the values. These are assigned to the proper variables without checking for min < value, as it should be during deserialization. The min < max condition is still ensured by the set methods for the min/max values, so that we always serialize min < max values.

This solution was taken from #29959 (comment), and extracted from #67857 as a first step towards reviving it. It also sets the stage for other limits to be added (such as the domain ones) without having to replicate the current, buggy solution.

Long-winded explanation of the problem to serve as reference in the future:

The reason for the current behavior is that the min < max check is disabled the first time each property is written to. The idea is that this happens during deserialization, so that the checks don't conflict with the default values of 0 and 1. Suppose serialized values are min=5, max=10. When deserializing min first, and trying to set it, the condition that min < max(default) is 5 < 1, which fails.

However, if the Curve has default min/max values, those properties aren't serialized in the first place, and, therefore no first time assignment happens deserialization. This makes it so the first time the user attempts to assign to these limits, the checks are not guaranteed.

Downsides of approach

GDScript recognizes the properties tagged as PROPERTY_USAGE_INTERNAL & related methods, which perhaps it shouldn't. In this case, it is unintended since these properties are "fake" and mean to be used just for de/serialization. On the other hand, there might be some GDScript code out there that makes use of internal properties. I can imagine some addons, through GDExtension or GDScript, would do it. See #25651 for discussion on whether PROPERTY_USAGE_INTERNAL properties should be exposed to GDScript.

Warning: This contains a minor compatibility break in that _min/max_value are no longer serialized. However, they are still deserialized because the resource loaders don't check that the property they are reading must have the PROPERTY_USAGE_STORAGE flag.

@Chaosus Chaosus added this to the 4.2 milestone Jul 2, 2023
@MewPurPur
Copy link
Contributor

MewPurPur commented Jul 2, 2023

Can't we just do set_limits like...

  1. Set min to -INF
  2. Set max to the proper value, uninterupted by min as it's -INF
  3. Set min to the proper value

And not have extra variables for this?

@akien-mga
Copy link
Member

Tangentially related: #72250

@anvilfolk
Copy link
Contributor Author

@MewPurPur your logic is correct, but the actual issue isn't so much how to avoid the check, but instead knowing when to avoid it! The current implementation uses extra variables to disable checks the first time a variable is assigned to, which is often (but not always, as in this bug) during deserialization.

This PR's solution is exclusively used during de/serialization, so it addresses the when to use the checks perfectly. If we do move towards having curves with arbitrary domains (the other linked PR), it also scales really well to adding _min/max_domain with _min_domain < _max_domain to this :)

@anvilfolk
Copy link
Contributor Author

anvilfolk commented Jul 2, 2023

Tangentially related: #72250

Actually looks to be the exact same issue. I'll see if I can't replicate this solution there as well. Probably worth standardizing one way or another.

The downside of this approach is that somehow GDScript recognizes the properties & related methods, which perhaps it shouldn't, since they are meant to be PROPERTY_USAGE_NO_EDITOR | PROPERTY_USAGE_INTERNAL. But there might be some GDScript code out there that makes use of them, so we should figure out a solution. See #25651 for discussion.

@YuriSizov
Copy link
Contributor

I think this should be solved, somehow, with a new property hint. That hint would take care of not setting values in a wrong order or not setting them one by one. It's like having linked properties in a sense.

The current implementation technically breaks compatibility for serialized data.

@anvilfolk
Copy link
Contributor Author

anvilfolk commented Oct 30, 2023

Thanks for looking at this! :)

It does break compatibility (it's forward-compatible but not backwards-compatible), but not in a way that affects the curve data. These limits are used exclusively for drawing the curve in the inspector, so it's not necessarily a huge deal. Users could manually change domain values to be able to see the whole curve. We could also update this PR to enforce the domain (when not found) to include min and max points in the curve.

I do think it'd be lovely if the core had a way to handle mutually-dependent properties though!

I'm not immediately seeing a solution with property hints though. A load order hint wouldn't work (e.g. always load min before max): in this case, the order in which the max or the min needs to be loaded depends on whether it's above or below the default values.

Is there a way to bypass setters when loading, instead immediately assigning to a property? That could be the way to go, perhaps.

@YuriSizov
Copy link
Contributor

I'm not immediately seeing a solution with property hints though. A load order hint wouldn't work (e.g. always load min before max): in this case, the order in which the max or the min needs to be loaded depends on whether it's above or below the default values.

The hint could possibly act as a delay for setting values until after all of the values are parsed, then calling a special setter to set them together. As an example, at least.

@ttencate
Copy link
Contributor

ttencate commented Dec 22, 2023

Bypassing setters might be hard, because ClassDB doesn't support reflection on values directly, as far as I know. It only has the name of a property, and function pointers to its getter and setter.

What might work instead is if Resource had a new pair of methods:

private:
    virtual void _before_load() {}
    virtual void _after_load() {}

These are called by ResourceFormatText and ResourceFormatBinary before and after all the setters are called, respectively. These classes would need to be declared friends of Resource, or the methods would need to be public instead of private.

Resources with complex interdependencies between their properties can then do whatever they like in these callbacks, e.g. set a flag to temporarily suppress validity checking in setters.

This particular case doesn't even need a flag though. In _before_load(), min and max could be set to -INF and INF, and in _after_load(), if either of them is still infinity, it's set to its default value, 0 or 1.

@anvilfolk
Copy link
Contributor Author

My initial thought was to create a new macro called ADD_PROPERTY_OVERRIDE_LOAD() or something like that which further specified two setters to be used only when loading a resource. I haven't quite been able to figure out how many changes in core this would entail, but feels more in line with how other types of properties are added?

I feel like I would prefer this to a more stateful solution like the one you suggested (which can introduce threading issues, though I don't think that would happen in this particular case), even though what you suggest is perhaps less intrusive?

@ttencate
Copy link
Contributor

What kind of threading issues are you thinking of? No other references exist to the resource yet, since it's newly being loaded.

Adding new "properties of properties" is more heavyweight, because every property in ClassDB will need to have fields for it, even if they are rarely used. I'm not sure it's an actual problem, but the benefit needs to be higher to outweigh the cost. PropertyInfo has a comment stating something like "talk to Juan before adding more fields here" :)

Also, wouldn't you want to validate the properties after they've all been loaded? You'd need some kind of "loading done" callback for that anyway.

Could someone from the core dev team maybe comment? Maybe there are other cases in the engine that could make use of this API and help inform its design?

@Rindbee
Copy link
Contributor

Rindbee commented Jan 27, 2024

Maybe we can use ADD_LINKED_PROPERTY to indicate their relationship and their setting priority (needs to be implemented, I'm not sure if it's worth it).

In this way, when setting multiple properties (deserialization), all linked properties will be reset according to the setting priority.

@anvilfolk
Copy link
Contributor Author

anvilfolk commented Feb 22, 2024

Some updates on this! I love documenting stuff so bear with me or skip to TL;DR :)

I looked into the approach suggested by @ttencate with pre-load and post-load callbacks.

A post-load hook _finish_loading() can be called from ResourceLoader::_load(), which is agnostic to the specific loader being used. Now the problem is how a Resource knows it's in loading mode. There are two options:

  1. Resource, or Curve more specifically, could have a member _is_loading that is initialized to true. But then every time a Resource/Curve is created manually, the code must also manually call _finish_loading() . This feels like a sure way to get bugs, exactly like in Fix Curve min/max limits being bypassed on first assignment. #88224
  2. Use a pre-load function _start_loading() as suggested. However, there's no loader-agnostic way to call this function, since the loaders themselves instantiate the resources from ClassDB, and that's when you need to call the hook. Thus, if we want this at the Resource level, then every single loader would need to call the preload hook at the right time, or we would get inconsistent behavior across different resources, some supporting the preload/postload hooks and some not ❌
  3. Perhaps it is possible to create a loader that's specific to resources that need these functions called without too much code duplication? ❓

TL;DR all other suggested approaches have pitfalls that would require remembering to call specific functions at specific places in the codebase, which is a sure way to create bugs or inconsistent behavior. I'm currently looking into creating a loader specific to Resources that need these callbacks.

@ttencate
Copy link
Contributor

ttencate commented Feb 22, 2024

Been thinking about this too. There are only two resource loaders in the engine, right? And I don't think writing a custom loader in a module or GDExtension is all that common. So option 2 is not so bad imo. Not perfect, but very simple and non-invasive.

We could wrap the creation of the resource in a helper function on the abstract ResourceLoader base class, which would do the ClassDB thing and call the pre-load hook. That makes it clearer that it should be done for any type of resource loader.

@anvilfolk
Copy link
Contributor Author

anvilfolk commented Feb 22, 2024

So unfortunately there are 18 subclasses of ResourceFormatLoader that I can find, including ResourceFormatLoaderText and ResourceFormatLoaderBinary, but also some for JSON, crypto stuff, images, translations, scripts, textures, etc. There are many ways in which they work: some of them load directly in the load() function, some of them offload the loading work to another class (e.g. ResourceFormatLoaderText offloads to ResourceLoaderText), some use ClassDB::instantiate(), some directly memnew(), or Ref<>().instantiate(), etc.

So I don't believe we want the load hooks to exist at the Resource level, because that means we'd need to change every loader in non-trivial ways to accommodate functionality only one or two resources need.

I've been digging into ResourceLoaderText::load(). It uses ClassDB to instantiate both subresources as well as the main resource itself. I haven't looked very deeply into ResourceLoaderBinary::load(), but I only see a ClassDB::instantiate() once there.

I'm going to try to make a subclass of Resource called LoadHookResource that can be used by resources that need this functionality. This should allow maintainers to add support for these hooks as needed in different loaders, and we could start with just the two you suggested. I'm going to try to update #88224 with this approach.

@ttencate
Copy link
Contributor

I hope that only ResourceLoaderText and ResourceLoaderBinary are relevant. Those are the ones that work by dynamically calling setters, for which it isn't known ahead of time which setters will be called. Others, such as ResourceFormatLoaderGDScript or ResourceFormatLoaderCompressedTexture2D, have more knowledge of the type of resource they're loading, and can collaborate with that type accordingly.

@KoBeWi KoBeWi modified the milestones: 4.3, 4.4 Aug 3, 2024
@anvilfolk anvilfolk requested a review from a team as a code owner September 29, 2024 00:49
@anvilfolk
Copy link
Contributor Author

Rebased.

I do think this is the only sensible solution, and it's on the path towards getting #67857 revived, which I'm interested in doing.

There's a deeper discussion of this in #88224, but all other alternatives seem to require nontrivial changes to ClassDB, properties or all loaders. This general approach should also work for any Object that needs to be deserialized, including Range which appears to have the same problem despite being a Node and not a Resource.

⚠️ this is forward-compatible but not backwards-compatible: min_value and max_value will be lost when going back to previous version ⚠️ However, these are just visual aids and do not affect behavior of curves, so it's very minor ⚠️

@anvilfolk anvilfolk closed this Nov 10, 2024
@anvilfolk
Copy link
Contributor Author

Closed as this is included in #67857.

@AThousandShips AThousandShips removed this from the 4.4 milestone Nov 10, 2024
@anvilfolk anvilfolk deleted the curve-limits branch January 4, 2025 18:42
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants