Implement set detailedDiff without protocol changes #2200

t0yv0 · 2024-07-18T21:20:02Z

Hello!

Vote on this issue by adding a 👍 reaction
If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

This proposal outlines an idea for how set element difference presentation in detailed diff can be improved without making any changes to the protocol buffer layer between Pulumi CLI and the providers.

A quick recap from the protobuf declarations:

DiffRequest {
    string id = 1;                           // the ID of the resource to diff.
    string urn = 2;                          // the Pulumi URN for this resource.
    google.protobuf.Struct olds = 3;         // the old output values of resource to diff.
    google.protobuf.Struct news = 4;         // the new input values of resource to diff.
    repeated string ignoreChanges = 5;       // a set of property paths that should be treated as unchanged.
    google.protobuf.Struct old_inputs = 6;   // the old input values of the resource to diff.
}

DiffResponse {
    map<string, PropertyDiff> detailedDiff = 6; // a detailed diff appropriate for display.
    bool hasDetailedDiff = 7; // true if this response contains a detailed diff.
}

message PropertyDiff {
    Kind kind = 1; // The kind of diff associated with this property.
    bool inputDiff = 2; // The difference is between old and new inputs, not old and new state.

    enum Kind {
        ADD = 0;            // this property was added
        ADD_REPLACE = 1;    // this property was added, and this change requires a replace
        DELETE = 2;         // this property was removed
        DELETE_REPLACE = 3; // this property was removed, and this change requires a replace
        UPDATE = 4;         // this property's value was changed
        UPDATE_REPLACE = 5; // this property's value was changed, and this change requires a replace
    }
}

The difficulty with set elements is how to refer to set elements correctly in the key of map<string, PropertyDiff> detailedDiff = 6; property paths, where Pulumi CLI treats these as numeric array paths, under reordering.

The proposal is as follows.

For Kind={ADD,ADD_REPLACE}, diffs should refer to the indices relative to news. For example:

olds = set(["A", "B", "C"])
news = set(["A", "B", "X", "C"]) # "X" added at index 2
detailedDiff = {"setProp[2]": {"kind": "ADD"}}

For 'Kind={DELETE,DELETE_REPLACE}`, diffs should refer to the indices relative to olds or oldInputs. For example:

olds = set(["A", "B", "C"])  # remove "B" at index 1
news = set(["A", "C"])
detailedDiff = {"setProp[1]": {"kind": "DELETE"}}

As to 'Kind={UPDATE,UPDATE_REPLACE}` diffs, these should never be emitted for set elements.

#2198 outlines some background on Set support in TF. It seems difficult to construct examples where fully known set elements would be equated and updated in place. This would require the elements to return the same identity or compare as Equals, at which point any nested changes probably should be ignored. Hypothesize that this case does not arise in practice.

If any of the config set elements contain unkowns, such elements have undecidable identity and the change should fall under Kind={ADD,ADD_REPLACE} case.

Affected area/feature

Confused index corner-case

As specified, it may happen that the diff will want to clobber the same index for adding and replacing.

olds = set(["A", "B", "C"])  # remove "B" at index 1
news = set(["A", "X", "C"]) # insert "X" at index 1

One possibility is to ignore the deleteDiff and only return the addDiff:

detailedDiff = {"setProp[1]": {"kind": "ADD"}}

Another possibility is to disambiguate the paths in some form, if this still displays reasonably in the CLI.

detailedDiff = {
  "setProp[1].old": {"kind": "DELETE"},
  "setProp[1]": {"kind": "Add"}
}

A small change to Pulumi CLI may be suggested here to permit multiple entries to the detailedDiff map, perhaps:

detailedDiff = {
  "setProp[1]#1": {"kind": "DELETE"},
  "setProp[1]#2": {"kind": "Add"}
}

ignoreChanges

Per #1756 it is difficult to use ignore changes on set elements at the moment. This proposal will not alleviate this, but not make it particularly worse. With this proposal there will be no nested set element changes, all will be either element additions or deletions. An ignoreChanges implementation that copies state to config by path could continue working using set-unaware array indices prior to engaging PlanResourceChange and have the support remain at that level.

Analyzing OpenTofu code Set ignoreChanges is not particularly tested or handled specially, and the support is implemented by index-based copying prior to planning with the provider.

Fixes

This should fix #186

The text was updated successfully, but these errors were encountered:

…ge (#2405) This PR adds a new algorithm for calculating the detailed diff which acts on the pulumi property values for the SDKv2 bridge, comparing the planned state returned by `Diff` to the prior state. This is flagged under the existing `DiffEqualDecisionOverride` feature flag. The results look very promising so far - all the detailed diff integration tests pass and the issues previously reported are almost all fixed by this. ## Why ## The current detailed diff acts on the `InstanceDiff` structure which is internal to the plugin-sdk. This has a few shortcomings: - TF does not actually refer to this for the detailed diff, so it might point to diffs which are not present in TF. - It refers to TF attribute paths, which are tricky to translate back in some cases. - It does not compare the planned state with the prior state but compares the news vs olds - this misses properties added by TF planning. ## Implementation ## The new algorithm is under `pkg/tfbridge/detailed_diff.go` and used in `pkg/tfbridge/provider.go:Diff` only for the SDKv2 and only if the `DiffEqualDecisionOverride` is enabled. The main entrypoint is `makePulumiDetailedDiffV2` - which in turn calls `makePropDiff` on each property. That branches on the type of the property and we have a separate function responsible for calculating the detailed diff for each property type. There's a few interesting bits here: - We always walk the full tree even when comparing against a nil property and simplify the diff after in `simplifyDiff`. This is in order to get replaces correct. More on that later. - When returning the diff to the engine we only return the simplest possible diff which explains the changes. So instead of `prop: Update, prop.subprop: Add`, we only return `prop.subprop: Add`. This seems to work much better in the engine and goes around some weird behaviour in the detailed diff display (see #2234 and #2400). Moreover, the first can be inferred from the second, so there is no reason for the bridge to return the full tree if only a leaf was changed. - We can not correctly identify diffs between nil and empty collections because of how the TF SDKv2 works without additional work. This is studied in `TestNilVsEmptyListProperty` and `TestNilVsEmptyMapProperty` in `pkg/cross-tests/diff_cross_test.go`. This is probably fine for now and a full fix is not possible. We can make a partial fix for non-computed properties by inspecting the pulumi inputs, before the plan. - There's a bit of an edge case with Unknowns and Replaces - we might not have enough information to tell the user they'll get a replace because the property which causes the replaces is nested inside an unknown. There's not much to do here, except to choose which side to err on. The algorithm currently does not say there's a replace. ### On Replaces ### We do not short-circuit detailed diffs when comparing non-nil properties against nil ones. The reason for that is that a replace might be triggered by a `ForceNew` inside a nested property of a non-`ForceNew` property. We instead always walk the full tree even when comparing against a nil property. We then later do a simplification step for the detailed diff in `simplifyDiff` in order to reduce the diff to what the user expects to see. For example: This is a list of objects with two properties, one of which is `ForceNew` ``` schema = { "list_prop": { Type: List, Elem: { "prop": String "force_new_prop": StringWithForceNew } } } ``` We are diffing an unspecified list against a list with a single element ``` olds = {} news = { "list_prop": [ { "prop": "val", "force_new_prop" : "val" } ] ``` The user expects to see: ``` + list_prop ``` or because of how collections work in TF SDKv2 (see #2233) ``` + list_prop[0] ``` An element was added to the list. When calculating the detailed diff we can short-circuit the diff when comparing the two lists, as we can see they have different lengths. This would identify the correct element to be displayed to the user as the reason for the diff but would fail to identify the diff as a replace, since we never saw the `ForceNew` on the nested property `force_new_prop` of the list elements. That is why, instead of short-circuiting the diff, we walk the full tree down and compare every property against a nil if it is not specified on the other side. We then to a simplification pass over the detailed diff, which respects any replaces triggered by nested properties and bubbles these up. There is a full case study of the TF behaviour around replaces in `pkg/cross-tests/diff_cross_test.go` `TestAttributeCollectionForceNew`, `TestBlockCollectionForceNew`, `TestBlockCollectionElementForceNew`. ## Testing ## Unit tests for the detailed diff algorithm in `pkg/tfbridge/detailed_diff_test.go` - this tries to cover all meaningful permutations of schemas: - `TestBasicDetailedDiff` tests all the meaningful pairs between nil values, empty values, non-empty values and computed for each TF type. - `TestDetailedDiffObject`, `TestDetailedDiffList`, `TestDetailedDiffMap`, `TestDetailedDiffSet` covers the cases not covered above for object and collection types. - `TestDetailedDiffTFForceNewPlain`, `TestDetailedDiffTFForceNewAttributeCollection`, `TestDetailedDiffTFForceNewBlockCollection`, `TestDetailedDiffTFForceNewElemBlockCollection`, `TestDetailedDiffTFForceNewObject` cover `ForceNew` behaviour in all TF types. - `TestDetailedDiffPulumiSchemaOverride` covers pulumi schema overrides Integration tests in `pkg/tests/schema_pulumi_test.go`, mostly `TestDetailedDiffPlainTypes` and `TestUnknownBlocks`. Note that most of the edge cases and issues previously discovered here are resolved by this PR. ## Follow-up Work ## Not done but will follow-up in separate PRs: - Non-trivial set diffing - sets are currently diffed the same as lists, which has all the previous issues with set diffs. #2200 - Non-trivial list diffing - we can do something like #2295 here. Note that we still need to investigate how this interacts with ForceNew and how TF preserves or does not preserve list element identity. We likely need to respect that in order not to have confusing unexplained replaces caused by list element changes. ## Related Issues ## fixes: - fixes #2294 - fixes #2296 - fixes #1504 - fixes #1895 - fixes #2141 - fixes #2235 - fixes #2325 - fixes #2400 - fixes #2234 - fixes #2427 does not fix: - #2399 - we must either fix the saved state to not contain redundant nils or fix the display logic in the engine to ignore these. - #2233 - This works the same as TF and seems to be a limitation of the SDKv2.

This change adds improved TF set handling to the detailed diff v2. The main challenge here is that pulumi does not have native sets, so set types are represented as lists. ### Diffing sets using the hash ### To correctly find the diff of two sets we calculate the hash of each element ourselves and do the diffs based on that. What makes this somewhat non-trivial is that due to MaxItemsOne flattening we can't just hash the pulumi `PropertyValue`s given to us by the engine. Instead we use `makeSingleTerraformInput` to transform the values using the schema. We then use the hashes of the elements in the set to calculate the diffs. This allows us to correctly account for shuffling and duplicates, matching the terraform behaviour of sets. When returning the element indices back to the engine, we need to return them in terms of oldState and newInputs because the engine does not have access to the plannedState (see #2281). To do that we keep the newInputs and match plannedState elements to their newInputs index by the set hash. Note that this is not possible if the set element contains any computed properties - these can change during the planning process, so we do not attempt to match and print a warning about possibly inaccurate diff results instead. ### Unknowns in sets ### Note that the terraform planning process does not allow a set to contain any unknowns, because that breaks the hashing. Because of that plan should always return an unknown for a set which contains any unknowns. This accounts for cases where resolving the unknown can result in duplicate elements. Unknown elements in sets - the whole set becomes unknown in the plan, so the algorithm no longer works. Currently we return an update for the whole set to the engine and it does the diff on its side. ### Testing ### This PR also includes tests for the set behaviour, both unit tests for the detailed diff algorithm and integration tests using pulumi programs for: - Single element additions, updates and removals - Shuffling, also with additions, updates and removals - Multi-element additions, updates and removals - Unknowns ### Issues ### Builds on #2405 Stacked on #2515, #2516, #2496 and #2497 fixes #2200 fixes #2300 fixes #1904 fixes #186

pulumi-bot · 2024-10-31T13:23:52Z

This issue has been addressed in PR #2451 and shipped in release v3.94.0.

t0yv0 added needs-triage Needs attention from the triage team kind/enhancement Improvements or new features labels Jul 18, 2024

iwahbe removed the needs-triage Needs attention from the triage team label Jul 19, 2024

t0yv0 mentioned this issue Jul 23, 2024

Detailed diff for set element changes shows more additions/removals than present #2235

Closed

t0yv0 mentioned this issue Aug 9, 2024

Implement schema-aware diffing for detail diff #2294

Closed

VenelinMartinov mentioned this issue Sep 26, 2024

Schema-aware pulumi-level detailed diff calculation in the SDKv2 bridge #2405

Merged

VenelinMartinov mentioned this issue Oct 7, 2024

Improve set detailed diffs #2451

Merged

This was referenced Oct 15, 2024

Non-trivial set detailed diffs #2476

Closed

Arrays representing TF sets are saved in state in the wrong order #2478

Closed

VenelinMartinov closed this as completed in #2451 Oct 30, 2024

pulumi-bot assigned VenelinMartinov Oct 30, 2024

pulumi-bot added the resolution/fixed This issue was fixed label Oct 30, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Implement set detailedDiff without protocol changes #2200

Implement set detailedDiff without protocol changes #2200

t0yv0 commented Jul 18, 2024 •

edited

Loading

pulumi-bot commented Oct 31, 2024

Implement set detailedDiff without protocol changes #2200

Implement set detailedDiff without protocol changes #2200

Comments

t0yv0 commented Jul 18, 2024 • edited Loading

Hello!

Issue details

Affected area/feature

Confused index corner-case

ignoreChanges

Fixes

pulumi-bot commented Oct 31, 2024

t0yv0 commented Jul 18, 2024 •

edited

Loading