helper/schema: Schema.DiffSuppressOnRefresh #882

apparentlymart · 2022-02-04T03:24:57Z

Unfortunately in the original design of DiffSuppressFunc we made it work only during planning, and so the normalization vs. drift classification it does isn't honored in any other situation.

Unilaterally enabling it for both refresh and apply would be too risky at this late stage where so many existing provider behaviors rely on these subtle details, but here we introduce a more surgical opt-in mechanism whereby specific attributes can be set to also check the Refresh result against DiffSuppressFunc, and so avoid reporting any normalization done by the remote system as if it were drift.

    DiffSuppressFunc: /* as before, unmodified */,
    DiffSuppressOnRefresh: true, // enables the new behavior

This behavior will be primarily useful for strings containing complex data serialized into some particular microsyntax which allows expressing the same information multiple ways. In particular, existing situations where we classify whitespace changes and other such immaterial changes in JSON values as normalization vs. drift would be a good candidate to enable this flag, so that we can get the same results during refreshing and thus avoid churning any downstream resources that refer to the
unexpectedly-updated result.

This deals with only the most visible problem previously discussed over in hashicorp/terraform-plugin-framework#70 -- creating value churn during the refresh phase -- because the risk of a more systematic change to the legacy SDK at this point is too high. Hopefully eventually the new framework will address this more completely (also handling it during the apply step, for example), but I intend this PR as a pragmatic way to help provider developers more easily resolve bug reports related to incorrect classification of normalization by relying on the DiffSuppressFunc implementations already written against this old SDK, so they can be addressed with only minimal schema changes and in particular without switching to the new framework.

My primary goal here is that this should cause absolutely no observable difference in result for any existing schema that doesn't set this new flag, and so the presence of this new code would have no effect on any existing provider unless they also opt in a particular attribute to it. I tried by best to code defensively to avoid e.g. introducing new panics in weird edge cases, but this legacy SDK does have a fair number of weird edge cases and so if you can spot one I didn't think of please let me know!

Although not exactly what was requested there, I think this closes #801 by providing a way to explicitly opt in to the requested behavior, even though it can't be the default behavior.

apparentlymart · 2022-02-04T03:31:13Z

This lint rule about not naming a variable new is catching cases where I'd followed some existing precedent elsewhere, so I'm going to leave it for now but I'd be happy to update my new additions here to use a different name if maintainers consider it more important to pass lint for new code than to be consistent with existing code.

detro · 2022-02-04T09:44:03Z

I have actually experienced this issue (#801) myself, with the same kind of resource: AWS policies.

So, a way for the provider to deal with this would be lovely.

detro

Aside from me asking questions that I then self-answered (and please, let me know if I'm missing something or some nuance), I think this looks good (thank you for the long description - that context helped me a lot).

I guess the only thing to deal with are the linter issues, but they should be very easy to address.

helper/schema/schema.go

detro · 2022-02-04T09:51:39Z

helper/schema/schema.go

+			continue // no schema? weird, but not our responsibility to handle
+		}
+		schema := schemaList[len(schemaList)-1]
+		if !schema.DiffSuppressOnRefresh || schema.DiffSuppressFunc == nil {


Question: why isn't the branching if DiffSuppressOnRefresh handled here instead of, say, at the beginning of this method OR before this method is even called?

Nevermind, I think I can answer my own question: this is to ensure this logic is applied selectively to sub-attributes of a schema. Even if most of it doesn't use it, if there is even just 1 attribute that does, we want to check it for each.

Yes, unfortunately I think we do need to visit everything here to make sure none of them are in need of this work.

An additional resource-level opt-in could potentially mitigate that but it seems like an annoying and unprecedented extra step, and in practice most (though certainly not all) resource types have only tens of attributes and so I don't expect this to be prohibitively expensive in the common case.

detro

LGTM

bflad

I apologize for the delayed review on my part since my attention has been elsewhere the last few days.

In general, I think this is likely the best approach to take here given the current compatibility promises we need to make, while offering the most flexibility. Creating a separate ResourceData "Set with DiffSuppressFunc" method or "Set with Options" method and having this as an available option is likely a non-starter given the level of effort and potential provider developer confusion.

Most of my comments here are more around making the functionality more debuggable and understandable for provider developers, who may not be familiar with certain terminology that the SDK has hidden away.

We should likely setup a separate terraform-provider-corner case that verifies this new functionality in the unlikely event there are missing nuances in the SDK logic that would prevent it from working with real world providers. Worst case, it could be used just to ensure we maintain backwards compatibility should we need to change this area of functionality in the future. That can be done outside this pull request though, since it involves a separate repository and a dance of merges to properly become a usable integration test.

helper/schema/schema.go

bflad · 2022-02-08T19:35:00Z

helper/schema/resource.go

@@ -638,6 +638,7 @@ func (r *Resource) RefreshWithoutUpgrade(
 		state = nil
 	}

+	schemaMap(r.Schema).handleDiffSuppressOnRefresh(s, state)


Aside: It may be beneficial to offer this as a ResourceData method on top of the schema field implementation, so provider developers can opt into setting this once per resource (or potentially across the provider, if we did some other implementation work). Not saying anything different needs to be done here now, unless we think that is something we want to consider upfront in terms of implementation and doing it outside of schemaMap.

I did originally consider making this a whole-resource-type-level settable, but decided to go more surgical so that it would be less risky to enable for a long-standing existing resource type implementation.

I'm not sure if I follow exactly what you mean by a ResourceData method here, but perhaps my comment on the other question above addresses why it's considerably harder to get a good result for this at the ResourceData/FieldWriter layer and why I ultimately elected to make it a postprocessing step. If you're thinking of something different than what I described there then I'm happy to consider another approach!

helper/schema/schema.go

apparentlymart · 2022-02-08T22:50:42Z

(I'm hoisting this design note out of one of the resolved review conversation threads from above, so it's easier to see if we look back on this later to understand why it's implemented this way.)

I originally intended to make ResourceData.Set be the one to handle ignoring writes that don't materially change the existing value.

There were two big challenges with doing this on Set which caused me to abandon that direction in favor of this one:

DiffSuppressFunc's signature is designed to deal with already-flatmapped data, because that's what's in the legacy diff data structure it was originally built to filter. ResourceData is built around the "field reader" abstraction which works with arbitrary interface{} and so I found myself having to essentially reimplement the flatmap functionality at that layer in order to get suitable inputs for the DiffSuppressFunc call.
The ResourceData.Set API allows (and, in a lot of cases, requires) setting an entire nested data structure in a single call, which means visiting every leaf value in that data structure to decide whether to include it or not, and no part of the call stack in that area has all of the context necessary to make that decision.

While doing it after the fact against the InstanceState object is non-ideal, it benefits from more closely matching how the Diff method works with the InstanceDiff object as a post-processing step, and so it's a better fit for the existing DiffSuppressFunc design.

bflad

Thank you so much for working on this, @apparentlymart, looks good to me from a code perspective (just one small note about the terraform-plugin-log usage). Super excited that we can get a opt-in solution in place for this framework. 🚀 If you could also add a changelog entry, that would be great.

e.g. Adding a .changelog/882.txt file with something like:

```release-note:note
helper/schema: The `Schema` type `DiffSuppressOnRefresh` field can be enabled to check `DiffSuppressFunc` for normalization changes during refresh, similar to planning. This can prevent unintended "Values changed outside of Terraform" messages. For many attributes, this is likely desirable and expected behavior when implementing `DiffSuppressFunc`, however it is opt-in for backwards compatibility reasons.
```

```release-note:enhancement
helper/schema: Added the `DiffSuppressOnRefresh` field to the `Schema` type
```

bflad · 2022-02-10T20:42:02Z

helper/schema/schema.go

@@ -985,6 +1024,7 @@ func (m schemaMap) diff(
 					continue
 				}

+				log.Printf("[DEBUG] ignoring change of %q due to DiffSuppressFunc", attrK)


Thanks so much for adding this.

bflad · 2022-02-10T20:58:28Z

helper/schema/schema.go

+		}
+
+		if schema.DiffSuppressFunc(k, oldV, newV, d) {
+			tfsdklog.Debug(ctx, "ignoring change of %q due to DiffSuppressFunc", k)


A quirk of the current terraform-plugin-log implementation and its logging functions such as tfsdklog.Debug() is that the final variadic argument is expected as key-value pairs rather than operating like the f suffix of other string format functions. There is nothing to catch an errant unpaired key implementation at compile-time. 😖 We'll either need to fmt.Sprintf() the value into the message parameter or switch it to a structured key-value pair:

tfsdklog.Debug(ctx, fmt.Sprintf("ignoring change of %q due to DiffSuppressFunc", k)) // or tfsdklog.Debug(ctx, "ignoring change due to DiffSuppressFunc", "tf_attribute", k)

I'll take this as real world feedback to create an issue in terraform-plugin-log to enforce this at compile time (e.g. convert those key-value expectations into a ...map[string]interface{} function parameter instead of the variadic ...interface{}) or otherwise create static analysis tooling to warn of this type of issue. I think we'd shy away from a Debugf() variant as it would discourage the structured key-value pairs.

Ahh, okay! Sorry... should've looked more closely for other examples before jumping in here.

I think I'm going to follow your first example here just because it'll then match with the other log-based one I added in the Diff function, and because it doesn't seem super valuable to be able to filter by this key and thus structured logging is not super important here.

Unfortunately in the original design of DiffSuppressFunc we made it work only during planning, and so the normalization vs. drift classification it does isn't honored in any other situation. Unilaterally enabling it for both refresh and apply would be too risky at this late stage where so many existing provider behaviors rely on these subtle details, but here we introduce a more surgical _opt-in_ mechanism whereby specific attributes can be set to also check the Refresh result against DiffSuppressFunc, and so avoid reporting any normalization done by the remote system as if it were drift. This behavior will be primarily useful for strings containing complex data serialized into some particular microsyntax which allows expressing the same information multiple ways. In particular, existing situations where we classify whitespace changes and other such immaterial changes in JSON values as normalization vs. drift would be a good candidate to enable this flag, so that we can get the same results during refreshing and thus avoid churning any downstream resources that refer to the unexpectedly-updated result.

apparentlymart · 2022-02-16T23:39:14Z

Oh whoops 🤦‍♂️ I just noticed that I didn't actually push up my updated commit after I addressed the previous round of feedback. Sorry about that! I think I've now belatedly addressed everything we discussed so far. 🤞

apparentlymart · 2022-02-17T00:45:57Z

For what it's worth, I have made a small attempt to add a terraform-provider-corner test case that exercises this today, but I can't figure out how to actually describe the situation because the test harness is built for testing only final states and doesn't offer any way to make assertions about the refresh or planning steps alone.

What I've attempted is to add some normalization to the domain part of the email address in the fake "user" object, to observe that normalization being misclassified as drift and then cascading downstream during refresh. I can see it fail before I add the DiffSuppressFunc, but I can't see any way to distinguish between the previous DiffSuppressFunc behavior and the new DiffSuppressOnRefresh: true behavior, because by the time the Check function is running the results from ReadResource and PlanResourceChange are long forgotten and we only have access to the ApplyResourceChange result.

I've run out of time to spend on that for now, so I'm afraid I won't be able to contribute a corner-provider test for this, for now at least. 😖

bflad

Still looks good to me and going to pull this in. Thanks so much, @apparentlymart 🚀

Aside: A design goal of our next generation of provider-level acceptance testing would be to offer multiple layers of abstraction during testing. Theoretically, this would include low level functionality for choosing individual Terraform CLI commands to execute and inspecting any available Terraform CLI data, if necessary. Once that's available, we can convert terraform-provider-corner over to it and hopefully be able to test many more of these types of scenarios that the current acceptance testing framework cannot handle due to its solely high level abstractions.

github-actions · 2022-03-20T02:14:36Z

I'm going to lock this pull request because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

apparentlymart requested a review from a team as a code owner February 4, 2022 03:24

apparentlymart self-assigned this Feb 4, 2022

detro reviewed Feb 4, 2022

View reviewed changes

apparentlymart force-pushed the f-diff-suppress-on-refresh branch from b01d95c to 98e4260 Compare February 8, 2022 16:54

apparentlymart requested a review from detro February 8, 2022 16:57

detro approved these changes Feb 8, 2022

View reviewed changes

bflad reviewed Feb 8, 2022

View reviewed changes

apparentlymart force-pushed the f-diff-suppress-on-refresh branch from 98e4260 to b6d348f Compare February 8, 2022 22:48

apparentlymart requested a review from bflad February 8, 2022 22:55

bflad approved these changes Feb 10, 2022

View reviewed changes

bflad added this to the v2.11.0 milestone Feb 10, 2022

bflad mentioned this pull request Feb 10, 2022

Lack of Feedback about Unpaired Structured Key-Values hashicorp/terraform-plugin-log#31

Closed

apparentlymart force-pushed the f-diff-suppress-on-refresh branch from b6d348f to 84f804f Compare February 16, 2022 23:37

bflad approved these changes Feb 17, 2022

View reviewed changes

bflad added the enhancement New feature or request label Feb 17, 2022

bflad merged commit 6de4ff0 into main Feb 17, 2022

bflad deleted the f-diff-suppress-on-refresh branch February 17, 2022 18:45

ewbankkit mentioned this pull request Feb 19, 2022

Perpetual diffs / Objects have changed outside of Terraform (v3.70.0+) hashicorp/terraform-provider-aws#23288

Closed

fho mentioned this pull request Mar 15, 2022

sdk: use the new DiffSuppressOnRefresh functions simplesurance/terraform-provider-bunny#52

Open

github-actions bot locked as resolved and limited conversation to collaborators Mar 20, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

helper/schema: Schema.DiffSuppressOnRefresh #882

helper/schema: Schema.DiffSuppressOnRefresh #882

apparentlymart commented Feb 4, 2022 •

edited

apparentlymart commented Feb 4, 2022

detro commented Feb 4, 2022

detro left a comment

detro Feb 4, 2022

detro Feb 4, 2022

apparentlymart Feb 4, 2022

detro left a comment

bflad left a comment

bflad Feb 8, 2022

apparentlymart Feb 8, 2022

apparentlymart commented Feb 8, 2022 •

edited

bflad left a comment

bflad Feb 10, 2022

bflad Feb 10, 2022

apparentlymart Feb 10, 2022

apparentlymart commented Feb 16, 2022

apparentlymart commented Feb 17, 2022

bflad left a comment

github-actions bot commented Mar 20, 2022

helper/schema: Schema.DiffSuppressOnRefresh #882

helper/schema: Schema.DiffSuppressOnRefresh #882

Conversation

apparentlymart commented Feb 4, 2022 • edited

apparentlymart commented Feb 4, 2022

detro commented Feb 4, 2022

detro left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

detro left a comment

Choose a reason for hiding this comment

bflad left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

apparentlymart commented Feb 8, 2022 • edited

bflad left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

apparentlymart commented Feb 16, 2022

apparentlymart commented Feb 17, 2022

bflad left a comment

Choose a reason for hiding this comment

github-actions bot commented Mar 20, 2022

apparentlymart commented Feb 4, 2022 •

edited

apparentlymart commented Feb 8, 2022 •

edited