Not all Dashboards Listed in Dashboard Group when added with Dashboard Resource #290

Open
Collin256 opened this issue Mar 5, 2021 · 9 comments

@Collin256

Issue Description:

When a dashboard group resource is created and dashboard resources are then created outside of it, using the dashboard group ID to add them to the group, not all of the dashboards are associated with the dashboard group in the UI. However, all of the dashboards exist and can be viewed by manually putting their IDs in the SignalFX URL.

Expected Behavior:

A dashboard group is created, 5 dashboards are created and listed in the group, and the group is displayed in the SignalFX UI under Custom Dashboard Groups.

Actual Behavior:

A dashboard group is created and 3 random dashboards (from the 5 in the terraform script) are listed in the group, and it is displayed in the SignalFX UI under Custom Dashboard Groups. All 5 dashboards are created.

Steps to Reproduce:

Run the following commands:

  • terraform init
  • terraform plan (optionally with -out=out.tfplan for consistency)
  • terraform apply -auto-approve (optionally passing the plan file, out.tfplan, if it exists)

Using the following terraform/provider versions:

  • Terraform: v0.13.4
  • splunk-terraform/signalfx: v6.7.1

With a terraform file as such:

resource "signalfx_dashboard_group" "dashboard_group" {
    ...(no dashboard blocks in here)
}

resource "signalfx_dashboard" "test_dashboard_1" {
    dashboard_group = signalfx_dashboard_group.dashboard_group.id
    ...
}

resource "signalfx_dashboard" "test_dashboard_2" {
    dashboard_group = signalfx_dashboard_group.dashboard_group.id
    ...
}

resource "signalfx_dashboard" "test_dashboard_3" {
    dashboard_group = signalfx_dashboard_group.dashboard_group.id
    ...
}

resource "signalfx_dashboard" "test_dashboard_4" {
    dashboard_group = signalfx_dashboard_group.dashboard_group.id
    ...
}

resource "signalfx_dashboard" "test_dashboard_5" {
    dashboard_group = signalfx_dashboard_group.dashboard_group.id
    ...
}

Additional Context

Regenerating the plan without making any changes to the Terraform code produces output indicating that there are no changes to be made to the infrastructure.

This happens with a single Terraform script that creates the dashboard group first, then creates the dashboards and adds them to the group. It also happens with multiple scripts, where the dashboard group is created in one execution plan and the dashboards are created in a separate execution plan.

Workarounds

  1. It is possible to manually fix the broken link, which seems to reside in the dashboard group object. Not all dashboards are listed in the "dashboards" collection in the dashboard group object, although in each dashboard object the "groupId" is set to the correct dashboard group ID. Using the REST API, a PUT can update the dashboard group and add the missing "dashboards" values and the missing "dashboardConfigs" objects.

  2. It is also possible to only add one dashboard to an existing dashboard group at a time through repeated terraform processing.
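
One possible way to script workaround 2 is sketched below. It is an assumption about how the repeated runs could be done, not necessarily what was used here: it relies on Terraform's -target option to apply one dashboard per run, so only one create request reaches the API at a time. The function name and the resource addresses in the example are hypothetical. Terraform itself recommends -target only for exceptional situations, so treat this as a stopgap.

```shell
#!/bin/bash
# Sketch of workaround 2: create the dashboards one apply at a time.
# The resource addresses passed in are assumptions; substitute your own.

apply_dashboards_one_by_one() {
  local address
  for address in "$@"; do
    # -target limits the run to this resource and what it depends on,
    # so the dashboard group is created on the first pass if it is missing.
    terraform apply -auto-approve -target="$address" || return 1
  done
}

# Example (not run here):
# apply_dashboards_one_by_one \
#   signalfx_dashboard.test_dashboard_1 \
#   signalfx_dashboard.test_dashboard_2
```

The loop stops at the first failed apply, so a partial run can be resumed by passing only the remaining addresses.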

@Collin256
Author

On further investigation it appears that this may be an issue in the REST API at https://api.us1.signalfx.com/v2/dashboard. In testing, a POST to this API to create a new dashboard triggered the update to the dashboard group object record. Possible race condition?

@PatrickShaw

Hi @Collin256

Just a heads up. I'm experiencing the same issue. The workaround I use involves terraform taint to force dashboards to be deleted and recreated. It's not great and it comes with a whole bunch of caveats (slows down Terraform scripts, renders dashboard URLs temporary, etc). Here's the bash script:

#!/bin/bash

# The only reason this script exists is the following bug: https://github.com/splunk-terraform/terraform-provider-signalfx/issues/290
#
# It causes SignalFX dashboards to disappear from their respective dashboard groups.
# They're still technically there. Going to the dashboard manually takes you to its respective dashboard group.
# HOWEVER the only way you can navigate to them is manually, for example, via the "Recent" section under the "Dashboards" tab in SignalFX or by entering a URL manually.
#
# How does this script solve this problem?
# This script abuses the fact that dashboards show up properly when they are first added.
# It searches through all the SignalFX dashboards in the terraform state and marks them as "tainted".
# This tells Terraform that the dashboard's state is corrupted and needs to be recreated the next time Terraform is run.
#
# Please delete this script if the bug no longer occurs or a less hacky fix is found.

set -xe

terraform state list | grep "\.signalfx_dashboard\." | while read -r signalfx_dashboard_module_state_path; do
  terraform taint "$signalfx_dashboard_module_state_path"
done

@Collin256
Author

Hi @PatrickShaw,

Thanks for this info, this is very helpful.

In the last couple of days we've also found that we can inject a dependency into the dashboard resources (or even modules if you are on Terraform 0.13 or higher) to force them to be created in serial instead of in parallel, and this seems to avoid the race condition in the REST API. We've just updated our code as such:

resource "signalfx_dashboard" "test_dashboard_1" {
    dashboard_group = signalfx_dashboard_group.dashboard_group.id
    ...
}

resource "signalfx_dashboard" "test_dashboard_2" {
    dashboard_group = signalfx_dashboard_group.dashboard_group.id
    depends_on      = [signalfx_dashboard.test_dashboard_1]
    ...
}

@mcmiv413

From our side, we have also found that depends_on can be used to control the order in which dashboards are created, which determines the order in which they are listed in the dashboard group. Otherwise the order appears to be random, and subsequent runs end up with different orders.

@PatrickShaw

PatrickShaw commented Apr 20, 2021

We just gave depends_on a go.

Works a charm :) it also maintains the dashboard ordering like @mcmiv413 mentioned which is an added bonus for us.

Thanks @Collin256 for the suggestion

@Collin256
Author

I'm thinking I need to close this issue, as it is not really related to the Terraform provider but to the SFX API. However, I think the "depends_on" information is highly valuable. For all who have participated in this thread: should this just be a documentation issue?

@PatrickShaw

Should this just be a documentation issue?

Yeah, I think there'd be benefit in mentioning #290 (comment) in the Terraform provider docs 🤔

@stil4m

stil4m commented May 7, 2021

I'm having issues with this in combination with for_each, which makes the solution @Collin256 provided unsuitable for me (since depends_on cannot be set dynamically).

Any other suggested workarounds?

@bohdanborovskyi-ma

bohdanborovskyi-ma commented Sep 17, 2021

@stil4m I've solved the described issue for the for_each case in 2 steps:

  • moved the relevant TF code (e.g. the dashboard group creation and the invocation of the module which creates new dashboards and adds them to that group) to an isolated TF project with its own state file;
  • tuned the CI/CD pipeline which operates on that TF repo to execute terraform apply with the -parallelism=1 flag specified (i.e. overriding the default concurrency of 10).

This way I've eliminated the API race conditions when creating multiple dashboards at once, while the overall execution time for the whole TF codebase wasn't increased significantly.
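
The pipeline change in the second step comes down to one extra flag on the apply command. A minimal sketch, assuming a bash-based pipeline step; the wrapper function name is hypothetical, while -parallelism is Terraform's own option for limiting concurrent operations:

```shell
#!/bin/bash
# Sketch: run apply with resource operations serialized.
# -parallelism=1 overrides Terraform's default of 10 concurrent operations.

apply_serially() {
  # Any extra arguments are passed through to terraform apply unchanged.
  terraform apply -parallelism=1 "$@"
}

# Example (not run here):
# apply_serially -auto-approve
```

Because the flag applies to the whole run, keeping the dashboards in their own project, as in the first step, limits the slowdown to that project.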
