Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement set detailedDiff without protocol changes #2200

Closed
t0yv0 opened this issue Jul 18, 2024 · 1 comment · Fixed by #2451
Closed

Implement set detailedDiff without protocol changes #2200

t0yv0 opened this issue Jul 18, 2024 · 1 comment · Fixed by #2451
Assignees
Labels
kind/enhancement Improvements or new features resolution/fixed This issue was fixed

Comments

@t0yv0
Copy link
Member

t0yv0 commented Jul 18, 2024

Hello!

  • Vote on this issue by adding a 👍 reaction
  • If you want to implement this feature, comment to let us know (we'll work with you on design, scheduling, etc.)

Issue details

This proposal outlines an idea for how set element difference presentation in detailed diff can be improved without making any changes to the protocol buffer layer between Pulumi CLI and the providers.

A quick recap from the protobuf declarations:

DiffRequest {
    string id = 1;                           // the ID of the resource to diff.
    string urn = 2;                          // the Pulumi URN for this resource.
    google.protobuf.Struct olds = 3;         // the old output values of resource to diff.
    google.protobuf.Struct news = 4;         // the new input values of resource to diff.
    repeated string ignoreChanges = 5;       // a set of property paths that should be treated as unchanged.
    google.protobuf.Struct old_inputs = 6;   // the old input values of the resource to diff.
}

DiffResponse {
    map<string, PropertyDiff> detailedDiff = 6; // a detailed diff appropriate for display.
    bool hasDetailedDiff = 7; // true if this response contains a detailed diff.
}

message PropertyDiff {
    Kind kind = 1; // The kind of diff associated with this property.
    bool inputDiff = 2; // The difference is between old and new inputs, not old and new state.

    enum Kind {
        ADD = 0;            // this property was added
        ADD_REPLACE = 1;    // this property was added, and this change requires a replace
        DELETE = 2;         // this property was removed
        DELETE_REPLACE = 3; // this property was removed, and this change requires a replace
        UPDATE = 4;         // this property's value was changed
        UPDATE_REPLACE = 5; // this property's value was changed, and this change requires a replace
    }
}

The difficulty with set elements is how to refer to set elements correctly in the key of map<string, PropertyDiff> detailedDiff = 6; property paths, where Pulumi CLI treats these as numeric array paths, under reordering.

The proposal is as follows.

For Kind={ADD,ADD_REPLACE}, diffs should refer to the indices relative to news. For example:

olds = set(["A", "B", "C"])
news = set(["A", "B", "X", "C"]) # "X" added at index 2
detailedDiff = {"setProp[2]": {"kind": "ADD"}}

For 'Kind={DELETE,DELETE_REPLACE}`, diffs should refer to the indices relative to olds or oldInputs. For example:

olds = set(["A", "B", "C"])  # remove "B" at index 1
news = set(["A", "C"])
detailedDiff = {"setProp[1]": {"kind": "DELETE"}}

As to 'Kind={UPDATE,UPDATE_REPLACE}` diffs, these should never be emitted for set elements.

#2198 outlines some background on Set support in TF. It seems difficult to construct examples where fully known set elements would be equated and updated in place. This would require the elements to return the same identity or compare as Equals, at which point any nested changes probably should be ignored. Hypothesize that this case does not arise in practice.

If any of the config set elements contain unkowns, such elements have undecidable identity and the change should fall under Kind={ADD,ADD_REPLACE} case.

Affected area/feature

Confused index corner-case

As specified, it may happen that the diff will want to clobber the same index for adding and replacing.

olds = set(["A", "B", "C"])  # remove "B" at index 1
news = set(["A", "X", "C"]) # insert "X" at index 1

One possibility is to ignore the deleteDiff and only return the addDiff:

detailedDiff = {"setProp[1]": {"kind": "ADD"}}

Another possibility is to disambiguate the paths in some form, if this still displays reasonably in the CLI.

detailedDiff = {
  "setProp[1].old": {"kind": "DELETE"},
  "setProp[1]": {"kind": "Add"}
}

A small change to Pulumi CLI may be suggested here to permit multiple entries to the detailedDiff map, perhaps:

detailedDiff = {
  "setProp[1]#1": {"kind": "DELETE"},
  "setProp[1]#2": {"kind": "Add"}
}

ignoreChanges

Per #1756 it is difficult to use ignore changes on set elements at the moment. This proposal will not alleviate this, but not make it particularly worse. With this proposal there will be no nested set element changes, all will be either element additions or deletions. An ignoreChanges implementation that copies state to config by path could continue working using set-unaware array indices prior to engaging PlanResourceChange and have the support remain at that level.

Analyzing OpenTofu code Set ignoreChanges is not particularly tested or handled specially, and the support is implemented by index-based copying prior to planning with the provider.

Fixes

This should fix #186

@t0yv0 t0yv0 added needs-triage Needs attention from the triage team kind/enhancement Improvements or new features labels Jul 18, 2024
@iwahbe iwahbe removed the needs-triage Needs attention from the triage team label Jul 19, 2024
VenelinMartinov added a commit that referenced this issue Oct 1, 2024
…ge (#2405)

This PR adds a new algorithm for calculating the detailed diff which
acts on the pulumi property values for the SDKv2 bridge, comparing the
planned state returned by `Diff` to the prior state. This is flagged
under the existing `DiffEqualDecisionOverride` feature flag. The results
look very promising so far - all the detailed diff integration tests
pass and the issues previously reported are almost all fixed by this.

## Why ##
The current detailed diff acts on the `InstanceDiff` structure which is
internal to the plugin-sdk. This has a few shortcomings:
- TF does not actually refer to this for the detailed diff, so it might
point to diffs which are not present in TF.
- It refers to TF attribute paths, which are tricky to translate back in
some cases.
- It does not compare the planned state with the prior state but
compares the news vs olds - this misses properties added by TF planning.

## Implementation ##
The new algorithm is under `pkg/tfbridge/detailed_diff.go` and used in
`pkg/tfbridge/provider.go:Diff` only for the SDKv2 and only if the
`DiffEqualDecisionOverride` is enabled.
The main entrypoint is `makePulumiDetailedDiffV2` - which in turn calls
`makePropDiff` on each property. That branches on the type of the
property and we have a separate function responsible for calculating the
detailed diff for each property type.
There's a few interesting bits here:
- We always walk the full tree even when comparing against a nil
property and simplify the diff after in `simplifyDiff`. This is in order
to get replaces correct. More on that later.
- When returning the diff to the engine we only return the simplest
possible diff which explains the changes. So instead of `prop: Update,
prop.subprop: Add`, we only return `prop.subprop: Add`. This seems to
work much better in the engine and goes around some weird behaviour in
the detailed diff display (see
#2234 and
#2400).
Moreover, the first can be inferred from the second, so there is no
reason for the bridge to return the full tree if only a leaf was
changed.
- We can not correctly identify diffs between nil and empty collections
because of how the TF SDKv2 works without additional work. This is
studied in `TestNilVsEmptyListProperty` and `TestNilVsEmptyMapProperty`
in `pkg/cross-tests/diff_cross_test.go`. This is probably fine for now
and a full fix is not possible. We can make a partial fix for
non-computed properties by inspecting the pulumi inputs, before the
plan.
- There's a bit of an edge case with Unknowns and Replaces - we might
not have enough information to tell the user they'll get a replace
because the property which causes the replaces is nested inside an
unknown. There's not much to do here, except to choose which side to err
on. The algorithm currently does not say there's a replace.


### On Replaces ###
We do not short-circuit detailed diffs when comparing non-nil properties
against nil ones. The reason for that is that a replace might be
triggered by a `ForceNew` inside a nested property of a non-`ForceNew`
property. We instead always walk the full tree even when comparing
against a nil property. We then later do a simplification step for the
detailed diff in `simplifyDiff` in order to reduce the diff to what the
user expects to see.

For example:
This is a list of objects with two properties, one of which is
`ForceNew`
```
schema = {
  "list_prop": {
     Type: List,
     Elem: {
         "prop": String
         "force_new_prop": StringWithForceNew
     }
  }
}
```
We are diffing an unspecified list against a list with a single element
```
olds = {}
news = {
"list_prop": [
   {
    "prop": "val",
    "force_new_prop" : "val"
  }
]
```


The user expects to see:
```
+ list_prop
```
or because of how collections work in TF SDKv2 (see
#2233)
```
+ list_prop[0]
```
An element was added to the list. When calculating the detailed diff we
can short-circuit the diff when comparing the two lists, as we can see
they have different lengths. This would identify the correct element to
be displayed to the user as the reason for the diff but would fail to
identify the diff as a replace, since we never saw the `ForceNew` on the
nested property `force_new_prop` of the list elements.

That is why, instead of short-circuiting the diff, we walk the full tree
down and compare every property against a nil if it is not specified on
the other side. We then to a simplification pass over the detailed diff,
which respects any replaces triggered by nested properties and bubbles
these up.

There is a full case study of the TF behaviour around replaces in
`pkg/cross-tests/diff_cross_test.go` `TestAttributeCollectionForceNew`,
`TestBlockCollectionForceNew`, `TestBlockCollectionElementForceNew`.

## Testing ##
Unit tests for the detailed diff algorithm in
`pkg/tfbridge/detailed_diff_test.go` - this tries to cover all
meaningful permutations of schemas:
- `TestBasicDetailedDiff` tests all the meaningful pairs between nil
values, empty values, non-empty values and computed for each TF type.
- `TestDetailedDiffObject`, `TestDetailedDiffList`,
`TestDetailedDiffMap`, `TestDetailedDiffSet` covers the cases not
covered above for object and collection types.
- `TestDetailedDiffTFForceNewPlain`,
`TestDetailedDiffTFForceNewAttributeCollection`,
`TestDetailedDiffTFForceNewBlockCollection`,
`TestDetailedDiffTFForceNewElemBlockCollection`,
`TestDetailedDiffTFForceNewObject` cover `ForceNew` behaviour in all TF
types.
- `TestDetailedDiffPulumiSchemaOverride` covers pulumi schema overrides

Integration tests in `pkg/tests/schema_pulumi_test.go`, mostly
`TestDetailedDiffPlainTypes` and `TestUnknownBlocks`. Note that most of
the edge cases and issues previously discovered here are resolved by
this PR.

## Follow-up Work ##
Not done but will follow-up in separate PRs:
- Non-trivial set diffing - sets are currently diffed the same as lists,
which has all the previous issues with set diffs.
#2200
- Non-trivial list diffing - we can do something like
#2295 here. Note
that we still need to investigate how this interacts with ForceNew and
how TF preserves or does not preserve list element identity. We likely
need to respect that in order not to have confusing unexplained replaces
caused by list element changes.

## Related Issues ##

fixes:
- fixes #2294
- fixes #2296
- fixes #1504
- fixes #1895
- fixes #2141
- fixes #2235
- fixes #2325
- fixes #2400
- fixes #2234
- fixes #2427


does not fix:
- #2399 - we
must either fix the saved state to not contain redundant nils or fix the
display logic in the engine to ignore these.
- #2233 - This
works the same as TF and seems to be a limitation of the SDKv2.
VenelinMartinov added a commit that referenced this issue Oct 30, 2024
This change adds improved TF set handling to the detailed diff v2. The
main challenge here is that pulumi does not have native sets, so set
types are represented as lists.

### Diffing sets using the hash ###

To correctly find the diff of two sets we calculate the hash of each
element ourselves and do the diffs based on that. What makes this
somewhat non-trivial is that due to MaxItemsOne flattening we can't just
hash the pulumi `PropertyValue`s given to us by the engine. Instead we
use `makeSingleTerraformInput` to transform the values using the schema.
We then use the hashes of the elements in the set to calculate the
diffs. This allows us to correctly account for shuffling and duplicates,
matching the terraform behaviour of sets.

When returning the element indices back to the engine, we need to return
them in terms of oldState and newInputs because the engine does not have
access to the plannedState (see
#2281). To do
that we keep the newInputs and match plannedState elements to their
newInputs index by the set hash. Note that this is not possible if the
set element contains any computed properties - these can change during
the planning process, so we do not attempt to match and print a warning
about possibly inaccurate diff results instead.


### Unknowns in sets ###
Note that the terraform planning process does not allow a set to contain
any unknowns, because that breaks the hashing. Because of that plan
should always return an unknown for a set which contains any unknowns.
This accounts for cases where resolving the unknown can result in
duplicate elements.

Unknown elements in sets - the whole set becomes unknown in the plan, so
the algorithm no longer works. Currently we return an update for the
whole set to the engine and it does the diff on its side.

### Testing ###
This PR also includes tests for the set behaviour, both unit tests for
the detailed diff algorithm and integration tests using pulumi programs
for:
- Single element additions, updates and removals
- Shuffling, also with additions, updates and removals
- Multi-element additions, updates and removals
- Unknowns

### Issues ###

Builds on #2405

Stacked on #2515,
#2516,
#2496 and
#2497

fixes #2200
fixes #2300
fixes #1904
fixes #186
@pulumi-bot pulumi-bot added the resolution/fixed This issue was fixed label Oct 30, 2024
@pulumi-bot
Copy link
Contributor

This issue has been addressed in PR #2451 and shipped in release v3.94.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/enhancement Improvements or new features resolution/fixed This issue was fixed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants