-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Vendor and incorporate a minimal edit distance algorithm for diffing lists #2295
Comments
…ge (#2405) This PR adds a new algorithm for calculating the detailed diff which acts on the pulumi property values for the SDKv2 bridge, comparing the planned state returned by `Diff` to the prior state. This is flagged under the existing `DiffEqualDecisionOverride` feature flag. The results look very promising so far - all the detailed diff integration tests pass and the issues previously reported are almost all fixed by this. ## Why ## The current detailed diff acts on the `InstanceDiff` structure which is internal to the plugin-sdk. This has a few shortcomings: - TF does not actually refer to this for the detailed diff, so it might point to diffs which are not present in TF. - It refers to TF attribute paths, which are tricky to translate back in some cases. - It does not compare the planned state with the prior state but compares the news vs olds - this misses properties added by TF planning. ## Implementation ## The new algorithm is under `pkg/tfbridge/detailed_diff.go` and used in `pkg/tfbridge/provider.go:Diff` only for the SDKv2 and only if the `DiffEqualDecisionOverride` is enabled. The main entrypoint is `makePulumiDetailedDiffV2` - which in turn calls `makePropDiff` on each property. That branches on the type of the property and we have a separate function responsible for calculating the detailed diff for each property type. There's a few interesting bits here: - We always walk the full tree even when comparing against a nil property and simplify the diff after in `simplifyDiff`. This is in order to get replaces correct. More on that later. - When returning the diff to the engine we only return the simplest possible diff which explains the changes. So instead of `prop: Update, prop.subprop: Add`, we only return `prop.subprop: Add`. This seems to work much better in the engine and goes around some weird behaviour in the detailed diff display (see #2234 and #2400). Moreover, the first can be inferred from the second, so there is no reason for the bridge to return the full tree if only a leaf was changed. - We can not correctly identify diffs between nil and empty collections because of how the TF SDKv2 works without additional work. This is studied in `TestNilVsEmptyListProperty` and `TestNilVsEmptyMapProperty` in `pkg/cross-tests/diff_cross_test.go`. This is probably fine for now and a full fix is not possible. We can make a partial fix for non-computed properties by inspecting the pulumi inputs, before the plan. - There's a bit of an edge case with Unknowns and Replaces - we might not have enough information to tell the user they'll get a replace because the property which causes the replaces is nested inside an unknown. There's not much to do here, except to choose which side to err on. The algorithm currently does not say there's a replace. ### On Replaces ### We do not short-circuit detailed diffs when comparing non-nil properties against nil ones. The reason for that is that a replace might be triggered by a `ForceNew` inside a nested property of a non-`ForceNew` property. We instead always walk the full tree even when comparing against a nil property. We then later do a simplification step for the detailed diff in `simplifyDiff` in order to reduce the diff to what the user expects to see. For example: This is a list of objects with two properties, one of which is `ForceNew` ``` schema = { "list_prop": { Type: List, Elem: { "prop": String "force_new_prop": StringWithForceNew } } } ``` We are diffing an unspecified list against a list with a single element ``` olds = {} news = { "list_prop": [ { "prop": "val", "force_new_prop" : "val" } ] ``` The user expects to see: ``` + list_prop ``` or because of how collections work in TF SDKv2 (see #2233) ``` + list_prop[0] ``` An element was added to the list. When calculating the detailed diff we can short-circuit the diff when comparing the two lists, as we can see they have different lengths. This would identify the correct element to be displayed to the user as the reason for the diff but would fail to identify the diff as a replace, since we never saw the `ForceNew` on the nested property `force_new_prop` of the list elements. That is why, instead of short-circuiting the diff, we walk the full tree down and compare every property against a nil if it is not specified on the other side. We then to a simplification pass over the detailed diff, which respects any replaces triggered by nested properties and bubbles these up. There is a full case study of the TF behaviour around replaces in `pkg/cross-tests/diff_cross_test.go` `TestAttributeCollectionForceNew`, `TestBlockCollectionForceNew`, `TestBlockCollectionElementForceNew`. ## Testing ## Unit tests for the detailed diff algorithm in `pkg/tfbridge/detailed_diff_test.go` - this tries to cover all meaningful permutations of schemas: - `TestBasicDetailedDiff` tests all the meaningful pairs between nil values, empty values, non-empty values and computed for each TF type. - `TestDetailedDiffObject`, `TestDetailedDiffList`, `TestDetailedDiffMap`, `TestDetailedDiffSet` covers the cases not covered above for object and collection types. - `TestDetailedDiffTFForceNewPlain`, `TestDetailedDiffTFForceNewAttributeCollection`, `TestDetailedDiffTFForceNewBlockCollection`, `TestDetailedDiffTFForceNewElemBlockCollection`, `TestDetailedDiffTFForceNewObject` cover `ForceNew` behaviour in all TF types. - `TestDetailedDiffPulumiSchemaOverride` covers pulumi schema overrides Integration tests in `pkg/tests/schema_pulumi_test.go`, mostly `TestDetailedDiffPlainTypes` and `TestUnknownBlocks`. Note that most of the edge cases and issues previously discovered here are resolved by this PR. ## Follow-up Work ## Not done but will follow-up in separate PRs: - Non-trivial set diffing - sets are currently diffed the same as lists, which has all the previous issues with set diffs. #2200 - Non-trivial list diffing - we can do something like #2295 here. Note that we still need to investigate how this interacts with ForceNew and how TF preserves or does not preserve list element identity. We likely need to respect that in order not to have confusing unexplained replaces caused by list element changes. ## Related Issues ## fixes: - fixes #2294 - fixes #2296 - fixes #1504 - fixes #1895 - fixes #2141 - fixes #2235 - fixes #2325 - fixes #2400 - fixes #2234 - fixes #2427 does not fix: - #2399 - we must either fix the saved state to not contain redundant nils or fix the display logic in the engine to ignore these. - #2233 - This works the same as TF and seems to be a limitation of the SDKv2.
I believe we first need to exhaustively test the TF behaviour on lists here, as we need to match it. The reason for that is replaces - we should trigger a diff against an element only if TF does as that informs the replace/no replace decision. We can in turn use replaces to reverse-engineer the TF behaviour |
@mjeffryes @iwahbe @t0yv0 should we prioritize this before PF accurate previews? I believe without this we could be showing wrong replace decisions in the detailed diff in certain cases |
We already have cross-tests for diff so it should be somewhat straight-forward to test list behaviour. |
It's on the bubble, I think PF support is slightly more important. |
Having some more thoughts here, this might be a little finicky to get exactly right. Vanilla minimal edit distance works well on exact equality when items are removed/added, but when items are deeply modified in-place it will mis-classify this change into a removed/added where an ideal diff would show an in-place deep update. Perhaps some heuristics here could be admissible. |
Did some digging in opentofu and found their code responsible for diffing lists: https://github.com/opentofu/opentofu/blob/cb2e9119aa75eeb8e1fa175e2b7a205a4fef129f/internal/command/jsonformat/collections/slice.go#L39 I believe we might need to match that in order to present the correct diff in the presence of replacements. For example:
We have a schema with a list of objects. One of the properties of the nested object is If we have the following change:
It matters if we present this as: (Current Pulumi behaviour)
or
In the first case that would be a replacement and the second one wouldn't. We can use this technique to easily test and compare this behaviour. I'll write up some tests around this. |
They use this They then mark all the values outside of this LCS which are in Additionally, we already vendor the module this function is in, we can easily vendor this too. |
Seems like I was incorrect about my assumption about the Terraform behaviour. They only use the See https://github.com/pulumi/pulumi-terraform-bridge/blob/413419b267d84aa48ff87144ea954440741ba72b/pkg/tests/diff_test/testdata/TestSDKv2DetailedDiffList/list_attribute/long_list_added_front.golden for attributes Since blocks are unaffected, we can choose to present the diff in another way here instead of matching TF. This should never lead to incorrectly explained replacements. |
There's two alternative implementations here:
The first implementation would match exactly what TF does and the second one should handle more complicated cases better. |
I'd like to understand a bit more detail on what happens to blocks in TF and whether we can do any better. Thanks. |
Build up on #2294 by improving array diffs to show minimal changes. This implies using a minimum edit distance over PropertyValue[] sequences. Consider alternatives here in either coding from scratch or finding a suitable Go library to vendor. Earlier I could not find a suitable generic Go implementation, although there are very good implementations for diffing text specifically, so wrote my own, that could be one candidate to code-review and vendor: https://github.com/t0yv0/godifft
Resolves #2239
The text was updated successfully, but these errors were encountered: