Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Single Nesting Mode Blocks Not Null in PlanResourceChange ProposedNewState #32460

Closed
bflad opened this issue Jan 4, 2023 · 6 comments · Fixed by #32463
Closed

Single Nesting Mode Blocks Not Null in PlanResourceChange ProposedNewState #32460

bflad opened this issue Jan 4, 2023 · 6 comments · Fixed by #32463
Assignees
Labels
bug confirmed a Terraform Core team member has reproduced this issue core

Comments

@bflad
Copy link
Member

bflad commented Jan 4, 2023

Terraform Version

Terraform v1.3.6
on darwin_arm64

Terraform Configuration Files

First apply:

resource "hashicups_order" "test" {
  myblock {
    optional = false
    optional_int = 10
  }
}

Second apply:

resource "hashicups_order" "test" {}

Debug Output

Please reach out if you need this.

Expected Behavior

When applying the second configuration without the single nesting mode block, the proposed new state for the block is null to match the null configuration -- causing the plan succeed without provider-side modification.

Actual Behavior

Terraform returns an error due to the proposed new state not being null:

Error: Provider produced invalid plan
        
Provider "registry.terraform.io/hashicorp/hashicups" planned an invalid value
for hashicups_order.test.myblock: planned for existence but config wants
absence.

This is a bug in the provider, which should be reported in the provider's own
issue tracker

Using the TF_LOG_SDK_PROTO_DATA_DIR environment variable, such as TF_LOG_SDK_PROTO_DATA_DIR=/tmp, will save files containing MessagePack data from the protocol before it reaches terraform-plugin-framework or provider logic. Viewing those files via https://github.com/wader/fq shows the disparity between the configuration and proposed new state data sent during PlanResourceChange.

❯ fq -d msgpack tovalue 1672845198527_PlanResourceChange_Request_Config.msgpack
{
  "length": 2,
  "pairs": [
    {
      "key": {
        "length": 2,
        "type": "fixstr",
        "value": "id"
      },
      "value": {
        "type": "nil",
        "value": null
      }
    },
    {
      "key": {
        "length": 7,
        "type": "fixstr",
        "value": "myblock"
      },
      "value": {
        "type": "nil",
        "value": null
      }
    }
  ],
  "type": "fixmap"
}

❯ fq -d msgpack tovalue 1672845198527_PlanResourceChange_Request_ProposedNewState.msgpack
{
  "length": 2,
  "pairs": [
    {
      "key": {
        "length": 2,
        "type": "fixstr",
        "value": "id"
      },
      "value": {
        "length": 1,
        "type": "fixstr",
        "value": "1"
      }
    },
    {
      "key": {
        "length": 7,
        "type": "fixstr",
        "value": "myblock"
      },
      "value": {
        "length": 2,
        "pairs": [
          {
            "key": {
              "length": 8,
              "type": "fixstr",
              "value": "optional"
            },
            "value": {
              "type": "false",
              "value": false
            }
          },
          {
            "key": {
              "length": 12,
              "type": "fixstr",
              "value": "optional_int"
            },
            "value": {
              "type": "positive_fixint",
              "value": 10
            }
          }
        ],
        "type": "fixmap"
      }
    }
  ],
  "type": "fixmap"
}

Please note if you want to create these files yourself, you likely need hashicorp/terraform-plugin-go#245, to prevent the files from being overwritten across acceptance test steps since the file naming is not time granular enough.

If the provider logic manually modifies the planned new state to match the configuration when its null, then the Terraform error goes away.

// Refer also to the framework issue, which has a schema-defined
// plan modifier workaround in the comments. This is just a little more
// copy-pastable into the reproduction codebase.
func (r *orderResource) ModifyPlan(ctx context.Context, req resource.ModifyPlanRequest, resp *resource.ModifyPlanResponse) {
	if req.State.Raw.IsNull() {
		return
	}

	if req.Plan.Raw.IsNull() {
		return
	}

	var config, plan orderResourceModel

	resp.Diagnostics.Append(req.Config.Get(ctx, &config)...)
	resp.Diagnostics.Append(req.Plan.Get(ctx, &plan)...)

	if resp.Diagnostics.HasError() {
		return
	}

	if config.MyBlock == nil {
		plan.MyBlock = nil
	}

	resp.Diagnostics.Append(resp.Plan.Set(ctx, &plan)...)
}

Steps to Reproduce

  1. gh repo clone mvantellingen/terraform-pf-testcase
  2. cd terraform-pf-testcase
  3. TF_ACC=1 go test -count=1 -v ./...

Additional Context

Schema definition in terraform-plugin-framework:

func (r *orderResource) Schema(_ context.Context, _ resource.SchemaRequest, resp *resource.SchemaResponse) {
	resp.Schema = schema.Schema{
		Description: "Manages an order.",
		Attributes: map[string]schema.Attribute{
			"id": schema.StringAttribute{
				Description: "Numeric identifier of the order.",
				Computed:    true,
				PlanModifiers: []planmodifier.String{
					stringplanmodifier.UseStateForUnknown(),
				},
			},
		},
		Blocks: map[string]schema.Block{
			"myblock": schema.SingleNestedBlock{
				Attributes: map[string]schema.Attribute{
					"optional": schema.BoolAttribute{
						Optional: true,
					},
					"optional_int": schema.Int64Attribute{
						Optional: true,
					},
				},
			},
		},
	}
}

References

@bflad bflad added bug new new issue not yet triaged labels Jan 4, 2023
@bflad
Copy link
Member Author

bflad commented Jan 4, 2023

Adjusting the single nesting mode block proposed new state into null should be safe for terraform-plugin-sdk's timeouts block code. When that SDK converts data, the shims will skip setting map[string]any keys for null values. The timeouts decoding only occurs when the map key is set.

@jbardin jbardin added core confirmed a Terraform Core team member has reproduced this issue and removed new new issue not yet triaged labels Jan 4, 2023
@jbardin jbardin self-assigned this Jan 4, 2023
@apparentlymart
Copy link
Member

By coincidence I found what I think is a different form of exactly the same bug yesterday while working on something entirely unrelated. 🙃

I was trying out the provider mocking part of the prototype over in #32321. The provider mocking prototype there works by wrapping an existing provider to retain its GetSchema and Validate* functions, but it intercepts the ConfigureProvider function and all of the functions that would normally be called after a successful configure to return fake responses that match the underlying provider's schema.

When wrapping SDKv2 providers that have resource types that use the SDK's special timeouts block type, I found that the proposed new state for a configuration without that block came over as an unknown value of the type given in the resource type's schema for that block type, rather than a null value. By default (unless configured otherwise) the mock just echoes back the proposed new state as the plan, and so I was returning back the same unknown timeouts attribute value. Terraform Core's safety checks then flagged that as an invalid response because it was null in the configuration.

In my case I was testing the initial creation of the object rather than removing a block that had existed earlier, so I think this is the same bug but with Terraform core inserting an unknown value as a placeholder rather than retaining the previous value as we saw in the report above.

I don't yet have a minimal reproduction independent of the module testing prototype but I think the minimal parts to reproduce it are:

  • Have a resource type with a nested block type that has whatever nesting mode SDKv2 uses for the timeouts block.
  • Make the provider respond to PlanResourceChange for that resource type by just echoing back the proposed new state verbatim.
  • Plan to create a resource of that type whose configuration does not include the nested block in question.

I was able to work around it for what I was doing by adding an empty timeouts block to make the proposed new state include an empty object instead of an unknown value. I think the correct behavior here would be to send either a null value if this is NestingSingle, or a value of the appropriate object type but with all of the attributes set to null if this is NestingGroup. (I'm not sure which of the two SDKv2 uses for timeouts.)

@jbardin
Copy link
Member

jbardin commented Jan 5, 2023

Thanks @apparentlymart! The original error was from objchange.ProposeNew creating an invalid plan when a prior NestingSingle block changes to null in the config, which is a relatively simple fix.

I'm not sure yet how to replicate the unknown value you saw, because Terraform does not insert new unknown values into the plan 😕. I suspect that is more likely being done in the provider shims, but I have not looked into that codepath yet.

@apparentlymart
Copy link
Member

For my mocking prototype in particular it's notable that it's running the provider's own GetSchema and ValidateResourceConfig but it's not running the provider's own PlanResourceChange, so I don't think the SDKv2 shims can be blamed for that.

But since I see you now have a PR up I'll try incorporating that PR into my branch and see if that has also somehow fixed that problem, or if that's something different going on. (Could well also be a bug in my own prototype code, since it's doing some pretty gross things to get the job done even though we don't really have any facilities in Terraform Core for ergonomic provider development. 🙄 )

@apparentlymart
Copy link
Member

I can't explain exactly why, but #32463 did actually fix the weird quirk in the provider mocking prototype.

My hunch is that the non-null timeouts value in the proposed state was interacting poorly with some of the shonky logic I wrote for merging the mock result with the proposed state and so my code was making the object unknown somehow, but now timeouts is arriving as a real null the mock code is just echoing it back as null too and so it's no longer tripping up the safety check.

@github-actions
Copy link

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Feb 11, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
bug confirmed a Terraform Core team member has reproduced this issue core
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants