Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

azurerm_*_web_app_slot,azurerm_*_web_app - Remove unwanted app settings values during update #25661

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

Apokalypt
Copy link
Contributor

Community Note

  • Please vote on this PR by adding a 馃憤 reaction to the original PR to help the community and maintainers prioritize for review
  • Please do not leave "+1" or "me too" comments, they generate extra noise for PR followers and do not help prioritize for review

Description

During an update, app settings are sent two times :

  1. By the method CreateOrUpdate
  2. By the method UpdateApplicationSettings

The problem is that on the second call, the "WEBSITE_HEALTHCHECK_MAXPINGFAILURES" value is set, regardless of the user's terraform configuration.
In fact, if the user has configured site_config.0.health_check_eviction_time_in_min, the correct value will be set, but if it is not configured, the value 0 will be set (from memory, without warning in the plan).

PR Checklist

  • I have followed the guidelines in our Contributing Documentation.
  • I have checked to ensure there aren't other open Pull Requests for the same update/change.
  • I have checked if my changes close any open issues. If so please include appropriate closing keywords below.
  • I have updated/added Documentation as required written in a helpful and kind way to assist users that may be unfamiliar with the resource / data source.
  • I have used a meaningful PR title to help maintainers and other users understand this change and help prevent duplicate work.

Changes to existing Resource / Data Source

  • I have added an explanation of what my changes do and why I'd like you to include them (This may be covered by linking to an issue above, but may benefit from additional explanation).
  • I have written new tests for my resource or datasource changes & updated any relevent documentation.
  • I have successfully run tests with my changes locally. If not, please provide details on testing challenges that prevented you running the tests.

Testing

Sorry, but I couldn't run acceptance tests because I don't have enough permissions in my team to make them run properly. However, some tests were run manually and some checks/tests were run directly with the API to make sure everything worked despite my technical limitations.

One thing to note is that the acceptance test TestAccWindowsWebAppSlot_completeUpdate may be failing but this fail comes from another change since the test fails without my code!

Change Log

  • azurerm_linux_web_app - Don't set "WEBSITE_HEALTHCHECK_MAXPINGFAILURES" during update if it's not configured by the user
  • azurerm_linux_web_app_slot - Don't set "WEBSITE_HEALTHCHECK_MAXPINGFAILURES" during update if it's not configured by the user
  • azurerm_windows_web_app - Don't set "WEBSITE_HEALTHCHECK_MAXPINGFAILURES" during update if it's not configured by the user
  • azurerm_windows_web_app_slot - Don't set "WEBSITE_HEALTHCHECK_MAXPINGFAILURES" during update if it's not configured by the user

This is a (please select all that apply):

  • Bug Fix
  • New Feature (ie adding a service, resource, or data source)
  • Enhancement
  • Breaking Change

Related Issue(s)

None

@Apokalypt Apokalypt changed the title azurerm_*_web_app_slot,azurerm_*_web_app - Remove unwanted app settings values azurerm_*_web_app_slot,azurerm_*_web_app - Remove unwanted app settings values during update Apr 18, 2024
Copy link
Member

@stephybun stephybun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for splitting this out into a separate PR @Apokalypt.

Having done some digging on this, it appears that this behaviour you're trying to change is intended and changing it would introduce a regression.

Please see the original PR where this was added #19685
As well as these issues that were fixed by it #16302 #17546 #17123

Furthermore the tests that were added here don't appear to be highlighting any issues, e.g. I've copied and run them on main and they pass successfully without any diffs.

As far as I can tell, if health_check_eviction_time_in_min isn't set in the user's config then all we're doing here is setting it to the APIs default value which means there shouldn't be any impact. Would you be able to provide some more detail on why this is causing issues for you?

@Apokalypt
Copy link
Contributor Author

Thanks for splitting this out into a separate PR @Apokalypt.

Having done some digging on this, it appears that this behaviour you're trying to change is intended and changing it would introduce a regression.

Please see the original PR where this was added #19685 As well as these issues that were fixed by it #16302 #17546 #17123

As far as I can tell, if health_check_eviction_time_in_min isn't set in the user's config then all we're doing here is setting it to the APIs default value which means there shouldn't be any impact. Would you be able to provide some more detail on why this is causing issues for you?

Oh, you're right @stephybun . With this such code I've created a regression. I will check in few minutes if I can find a better approach.

Regarding the problem, I'll try to be more precise. When my team tries to update the ip restrictions (or other value in the site config) of a web app (slot or not), they have observed that the web app restarts 2 times when using Terraform whereas it does not restart if they use the Azure interface. What's more, with Terraform, they found themselves with a new environment variable (WEBSITE_HEALTHCHECK_MAXPINGFAILURES) set to 0, even though they hadn't configured it on Terraform and health check was disabled on Azure...
After consulting the code, this is due to the management of app settings when there are managed keys (close to the problem encountered with the deleted node version).

Example

Let's imagine a web app without health check + without keys/values in app settings + with IP restrictions created using Terraform.

resource "azurerm_windows_web_app" "test" {
  name                = "...."
  location            = "...."
  resource_group_name = "...."
  service_plan_id     = "...."

  site_config {
    ip_restriction = [
      {
        name = "For test"
        ip_address = "192.0.1.4/32"
        priority = 100
        action = "Deny"
      }
    ]
  }

  app_settings {}
}

If we follow the code, here's what happens when we want to update an ip restriction (we have the same behavior with any value modified in the site config):

  1. Web app data is retrieved and parsed
  2. During parse, managed keys are deleted (code)
  3. When entering update method, site config is extended to modify ip restrictions (code)

At method exit, "SiteConfig.AppSettings" is empty (as initially)

  1. API call is made (code)

(if "WEBSITE_HEALTHCHECK_MAXPINGFAILURES" wasn't already set by a previous TF update) No change in app settings but only in IP restrictions --> no application restart
(if "WEBSITE_HEALTHCHECK_MAXPINGFAILURES" was already set by a previous TF update) Change detected in app settings ("WEBSITE_HEALTHCHECK_MAXPINGFAILURES" removed) --> application restarted

  1. Since there has been a change in the site config, we enter the if block whose purpose is to update the app settings (once again). (code)
  2. The "WEBSITE_HEALTHCHECK_MAXPINGFAILURES" key is set to "0" because with the conversion nil to int becomes 0. (code)
  3. API call is made (code)

New key/value in app settings --> application restarted

Conclusion

When performing the above changes, the application should not restart and the key "WEBSITE_HEALTHCHECK_MAXPINGFAILURES" should not be set even to 0. Furthermore, when you do a plan before applying changes, the plan will not indicate that this app setting will be added...
Honestly, it's not a huge bug, but in practice it can cause unwanted shutdown periods (of varying length) on several web apps...

Tests

Furthermore the tests that were added here don't appear to be highlighting any issues, e.g. I've copied and run them on main and they pass successfully without any diffs.

I need to check how they works but it may be due to the fact that Web App is parsed and therefore the app settings is always removed (as shown in step 2 of the example)... Will give it another try 馃憤

@Apokalypt
Copy link
Contributor Author

I will wait for your answer @stephybun to know if you have any idea but also to make sure that this example help you to understand the problem (sorry to not have been clearer at the beginning)
In my opinion, beyond the problem of this variable, I think the problem is the way app settings' managed keys (docker, node, health check, ...) are handled...

@stephybun
Copy link
Member

Appreciate the detail @Apokalypt.

Whilst I'm not sure that setting WEBSITE_HEALTHCHECK_MAXPINGFAILURES to the default value on updates can be avoided given the behaviour of the API (it appears that this setting just gets lost on updates), I can understand that unnecessary and potentially redundant restarts to the Web App are undesired.

From conversations with our SME on App Service, the way the app_settings and site_config blocks are handled in the resources have been done with deliberation in order to work around several quirks in the API's behaviour. Given I am only just becoming acquainted with App Service I am going to have to defer to our SME @jackofallops on this one since it will require their expertise to determine what changes would be safe to make without introducing any regressions or unwanted behaviours.

@jackofallops mind taking a look at this one next week?

@Apokalypt
Copy link
Contributor Author

Apokalypt commented Apr 22, 2024

I've just had two ideas about this property to solve the "bug". In my opinion, it would be interesting to implement both (I can do it in this PR) @stephybun

1st Proposal

Set the value only if the "healthCheckPath" is provided since it's required to enable health check (cf documentation). Without this property, the environment variable is useless

2nd Proposal

Wouldn't it be interesting to add a control to allow only authorized values (between 2 and 10, cf documentation)? So, instead of always defining the environment variable, we could define it only when the value is valid. Since 0 is the default value in Go for an int64 when not manually defined, this would avoid setting a value when the user doesn't define it in his Terraform schema.

@jackofallops jackofallops self-assigned this Apr 23, 2024
@jackofallops
Copy link
Member

I've just had two ideas about this property to solve the "bug". In my opinion, it would be interesting to implement both (I can do it in this PR) @stephybun

1st Proposal

Set the value only if the "healthCheckPath" is provided since it's required to enable health check (cf documentation). Without this property, the environment variable is useless

2nd Proposal

Wouldn't it be interesting to add a control to allow only authorized values (between 2 and 10, cf documentation)? So, instead of always defining the environment variable, we could define it only when the value is valid. Since 0 is the default value in Go for an int64 when not manually defined, this would avoid setting a value when the user doesn't define it in his Terraform schema.

Hi @Apokalypt - I'm going to be taking a look over this while @stephybun is looking into another piece of work.

wrt to the 2nd proposal. We have a requirement in Terraform to use the values provided by user configuration and we need to be able to read those back or we will have a diff in the plan/post-apply as the config and state will not match if the value read back doesn't match what is in config, so we unfortunately cannot take this approach.

I'm going to take a look into the code paths around this to see if I can find a route capable of removing the undesired outcomes without introducing any regression. I'll post here or submit a review as soon as I have anything.

Thanks for your efforts so far and for your patience!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants