Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug]: The AWS Redshift Serverless resources created using resource "aws_redshiftdata_statement" are NOT maintaining its state after few days #37301

Open
math77j opened this issue May 6, 2024 · 1 comment
Labels
bug Addresses a defect in current functionality. service/redshiftdata Issues and PRs that pertain to the redshiftdata service.

Comments

@math77j
Copy link

math77j commented May 6, 2024

Terraform Core Version

1.7.1

AWS Provider Version

5.39.0

Affected Resource(s)

aws_redshiftdata_statement

Expected Behavior

We are using the terraform resources "aws_redshiftserverless_namespace" & "aws_redshiftserverless_workgroup" to create a Redshift serverless environment and "aws_redshiftdata_statement" to create the database resources within this serverless environment. These database resources include an external schema, and a few materialized views to load stream data from Kafka topics on the AWS MSK cluster.

When we run the Terraform script, the resources are created successfully. But after a few days, even though there was NO change to the script, the Terraform plan shows that these resources needed to be added again. Also, you can see that the Terraform state shows these resources already exist.

The expected behavior is that it should show that the resources don't change state after a few days.

Actual Behavior

The terraform plan shows that all the redshift resources created using "aws_redshiftdata_statement" are new or needed to be added.

Relevant Error/Panic Output Snippet

(base) mjoseph@MathewJ redshift_poc % terraform plan     
aws_cloudwatch_event_rule.rs_mv_refresh_schedule_rule: Refreshing state... [id=rs-poc-mv-refresh-schedule-rule]
aws_iam_role.rs_iam_role: Refreshing state... [id=rs-poc-iam-role]
aws_redshiftserverless_namespace.rs_poc_ns: Refreshing state... [id=rs-poc-ns]
aws_redshiftserverless_workgroup.rs_poc_wg: Refreshing state... [id=rs-poc-wg]
aws_redshiftdata_statement.create_external_schema: Refreshing state... [id=bf4d488f-5f1b-457a-869f-0f74c08fec3b]
aws_redshiftdata_statement.create_mv_subscriber: Refreshing state... [id=841662f3-bc43-4299-8beb-a4a42003b65b]
aws_redshiftdata_statement.create_view_device_metadata: Refreshing state... [id=fb353278-ec32-48c2-89f4-87e5831404c9]
aws_redshiftdata_statement.create_mv_device_metadata_json: Refreshing state... [id=da90b707-7696-41a9-ae08-fa94866b71dd]
aws_cloudwatch_event_target.rs_mv_refresh_schedule_target: Refreshing state... [id=rs-poc-mv-refresh-schedule-rule-rs_mv_refresh_schedule_target]

Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  # aws_redshiftdata_statement.create_external_schema will be created
  + resource "aws_redshiftdata_statement" "create_external_schema" {
      + database       = "rs_poc_db"
      + id             = (known after apply)
      + secret_arn     = "arn:aws:secretsmanager:us-east-X:XXXXXXXX:secret:redshift!rs-poc-ns-rsadmin-Nbii95"
      + sql            = <<-EOT
            DROP SCHEMA IF EXISTS rs_poc_kafka_schema CASCADE;
            CREATE EXTERNAL SCHEMA rs_poc_kafka_schema 
                FROM MSK 
                    IAM_ROLE default 
                    AUTHENTICATION none 
                    CLUSTER_ARN 'arn:aws:kafka:us-east-X:XXXXXX:cluster/dev-data-hub-Kafka/5b087aa2-54a9-43fa-8185-b33fa1d45d30-7';
        EOT
      + workgroup_name = "rs-poc-wg"
    }


  # aws_redshiftdata_statement.create_mv_subscriber will be created
  + resource "aws_redshiftdata_statement" "create_mv_subscriber" {
      + database       = "rs_poc_db"
      + id             = (known after apply)
      + secret_arn     = "arn:aws:secretsmanager:us-east-2:XXXXXXXX:secret:redshift!rs-poc-ns-rsadmin-Nbii95"
      + sql            = <<-EOT
            DROP MATERIALIZED VIEW IF EXISTS mv_subscriber;
                            CREATE MATERIALIZED VIEW mv_subscriber DISTKEY(1) SORTKEY(2) AS
                                SELECT
                                    JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE(kafka_value, 'utf-8'), 'id', true)::varchar(64) AS id,
                                    NULLIF(JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE(kafka_value, 'utf-8'), 'timestamp', true), ''):: BIGINT as event_ts,
                                    JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE(kafka_value, 'utf-8'), 'identifiers', 'Account Number', true)::varchar(64) as account_number,
                                    JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE(kafka_value, 'utf-8'), 'attributes', 'zipCode', true)::varchar(64) as zip_code
                                FROM
                                    rs_poc_kafka_schema."subscriber"
                                WHERE CAN_JSON_PARSE(kafka_value);
                            REFRESH MATERIALIZED VIEW mv_subscriber;
        EOT
      + workgroup_name = "rs-poc-wg"
    }


Plan: 2 to add, 0 to change, 0 to destroy.

Terraform Configuration Files

1. Enable AWS Provider

provider "aws" {
region = "us-east-2"
}

2. Create Redshift Serverless Namespace

resource "aws_redshiftserverless_namespace" "rs_poc_ns" {
namespace_name = "${var.resource_prefix}-ns"
admin_username = var.rs_admin_username
manage_admin_password = true
db_name = var.rs_db_name
iam_roles = [ aws_iam_role.rs_iam_role.arn ]
default_iam_role_arn = aws_iam_role.rs_iam_role.arn
tags = {
Environment = var.environment
}
}

2.1 Create Redshift Serverless Workgroup

resource "aws_redshiftserverless_workgroup" "rs_poc_wg" {
depends_on = [ aws_redshiftserverless_namespace.rs_poc_ns ]
namespace_name = "${var.resource_prefix}-ns"
workgroup_name = "${var.resource_prefix}-wg"
base_capacity = 8
publicly_accessible = false
enhanced_vpc_routing = true
subnet_ids = var.rs_poc_subnet_ids
security_group_ids = var.rs_poc_security_group_ids
config_parameter {
parameter_key = "enable_case_sensitive_identifier"
parameter_value = true
}
tags = {
Environment = var.environment
}
}

3 Create Redshift external schema using - aws_redshiftdata_statement

resource "aws_redshiftdata_statement" "create_external_schema" {
workgroup_name = aws_redshiftserverless_workgroup.rs_poc_wg.workgroup_name
database = aws_redshiftserverless_namespace.rs_poc_ns.db_name
secret_arn = aws_redshiftserverless_namespace.rs_poc_ns.admin_password_secret_arn
sql = <<-EOT
DROP SCHEMA IF EXISTS ${var.rs_msk_external_schema_name} CASCADE;
CREATE EXTERNAL SCHEMA ${var.rs_msk_external_schema_name}
FROM MSK
IAM_ROLE default
AUTHENTICATION none
CLUSTER_ARN '${var.msk_cluster_arn}';
EOT
}

4 Create Redshift materialized view using - aws_redshiftdata_statement

resource "aws_redshiftdata_statement" "create_mv_subscriber" {
depends_on = [ aws_redshiftdata_statement.create_external_schema ]
workgroup_name = aws_redshiftserverless_workgroup.rs_poc_wg.workgroup_name
database = aws_redshiftserverless_namespace.rs_poc_ns.db_name
secret_arn = aws_redshiftserverless_namespace.rs_poc_ns.admin_password_secret_arn
sql = <<EOT
DROP MATERIALIZED VIEW IF EXISTS mv_subscriber;
CREATE MATERIALIZED VIEW mv_subscriber DISTKEY(1) SORTKEY(2) AS
SELECT
JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE(kafka_value, 'utf-8'), 'id', true)::varchar(64) AS id,
NULLIF(JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE(kafka_value, 'utf-8'), 'timestamp', true), ''):: BIGINT as event_ts,
JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE(kafka_value, 'utf-8'), 'identifiers', 'Account Number', true)::varchar(64) as account_number,
JSON_EXTRACT_PATH_TEXT(FROM_VARBYTE(kafka_value, 'utf-8'), 'attributes', 'zipCode', true)::varchar(64) as zip_code
FROM
${var.rs_msk_external_schema_name}."subscriber"
WHERE CAN_JSON_PARSE(kafka_value);
REFRESH MATERIALIZED VIEW mv_subscriber;
EOT
}

Steps to Reproduce

Run terraform plan/apply to create the resources, and after a few days run: terraform plan
This will show that the same resources need to be added again.

Debug Output

No response

Panic Output

No response

Important Factoids

No response

References

No response

Would you like to implement a fix?

None

@math77j math77j added the bug Addresses a defect in current functionality. label May 6, 2024
Copy link

github-actions bot commented May 6, 2024

Community Note

Voting for Prioritization

  • Please vote on this issue by adding a 👍 reaction to the original post to help the community and maintainers prioritize this request.
  • Please see our prioritization guide for information on how we prioritize.
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request.

Volunteering to Work on This Issue

  • If you are interested in working on this issue, please leave a comment.
  • If this would be your first contribution, please review the contribution guide.

@github-actions github-actions bot added service/redshiftdata Issues and PRs that pertain to the redshiftdata service. service/redshiftserverless Issues and PRs that pertain to the redshiftserverless service. labels May 6, 2024
@math77j math77j changed the title [Bug]: The AWS Redshift Serverless resources created using resource "aws_redshiftdata_statement" are maintaining its state after few days [Bug]: The AWS Redshift Serverless resources created using resource "aws_redshiftdata_statement" are NOT maintaining its state after few days May 6, 2024
@terraform-aws-provider terraform-aws-provider bot added the needs-triage Waiting for first response or review from a maintainer. label May 6, 2024
@justinretzolk justinretzolk removed needs-triage Waiting for first response or review from a maintainer. service/redshiftserverless Issues and PRs that pertain to the redshiftserverless service. labels May 8, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Addresses a defect in current functionality. service/redshiftdata Issues and PRs that pertain to the redshiftdata service.
Projects
None yet
Development

No branches or pull requests

2 participants