Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CORE-526 k/configs: dont override topic-level cleanup policy #18284

Draft
wants to merge 4 commits into
base: dev
Choose a base branch
from

Conversation

pgellert
Copy link
Contributor

@pgellert pgellert commented May 7, 2024

This fixes a bug in the kafka layer where redpanda would set the cleanup.policy config for every new topic even if this was not explicitly specified by the topic creation request. Instead, the expected behaviour is that this topic-level config is not set, but instead, redpanda should fall back in the case of an unset cleanup.policy to the cluster-level default without hardcoding that default at the topic-level.

Fixes https://redpandadata.atlassian.net/browse/CORE-526

Backports Required

  • none - not a bug fix
  • none - this is a backport
  • none - issue does not exist in previous branches
  • none - papercut/not impactful enough to backport
  • v24.1.x
  • v23.3.x
  • v23.2.x

Release Notes

Bug Fixes

  • This fixes a bug in kafka topic configs, where the cleanup.policy config was always set at the topic-level to the cluster default even when the topic creation request did not specify this.

@pgellert pgellert self-assigned this May 7, 2024
@pgellert
Copy link
Contributor Author

pgellert commented May 7, 2024

/ci-repeat 1

@vbotbuildovich
Copy link
Collaborator

vbotbuildovich commented May 7, 2024

new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53af-adaa-43bf-9249-2318001540b4:

"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53af-ada3-4529-887b-824945435606:

"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53af-ada5-4198-9eee-4a91dd776426:

"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53dd-c573-4afe-b060-4d7d62feb2fe:

"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.S3"

new failures in https://buildkite.com/redpanda/redpanda/builds/48786#018f53dd-c570-4f11-9233-a90a3aa6bbfe:

"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_size_based_retention.cloud_storage_type=CloudStorageType.ABS"

new failures in https://buildkite.com/redpanda/redpanda/builds/48846#018f5e08-f5fe-4944-8f66-e3380c646138:

"rptest.tests.write_caching_fi_e2e_test.WriteCachingFailureInjectionE2ETest.test_crash_all_with_consumer_group"

new failures in https://buildkite.com/redpanda/redpanda/builds/49009#018f7273-f84a-4495-8fb9-29fa17a76e04:

"rptest.tests.topic_creation_test.CreateSITopicsTest.topic_alter_config_test"
"rptest.tests.describe_topics_test.DescribeTopicsTest.test_describe_topics_with_documentation_and_types"

new failures in https://buildkite.com/redpanda/redpanda/builds/49009#018f727b-9e58-4f89-b413-172dbb379d3e:

"rptest.tests.topic_creation_test.CreateSITopicsTest.topic_alter_config_test"

new failures in https://buildkite.com/redpanda/redpanda/builds/49009#018f727b-9e5f-4753-8ff8-bf4d1baf5bd1:

"rptest.tests.describe_topics_test.DescribeTopicsTest.test_describe_topics_with_documentation_and_types"

@pgellert
Copy link
Contributor Author

pgellert commented May 8, 2024

/ci-repeat 1

@pgellert
Copy link
Contributor Author

/ci-repeat 1

1 similar comment
@pgellert
Copy link
Contributor Author

/ci-repeat 1

@pgellert pgellert changed the title [WIP] k/configs: dont override topic-level cleanup policy k/configs: dont override topic-level cleanup policy May 14, 2024
@pgellert pgellert requested a review from Lazin May 14, 2024 11:30
@pgellert pgellert changed the title k/configs: dont override topic-level cleanup policy CORE-526 k/configs: dont override topic-level cleanup policy May 14, 2024
The first attempt at fixing the way to report cleanup.policy was
incorrect, because it meant that we reported the config as
DEFAULT_CONFIG even if the config was explicitly set but to the
cluster-default value.

Instead, in this case, we want to report this as a DYNAMIC_TOPIC_CONFIG.
We can do this now since we stopped setting the cleanup.policy value
explicitly for each topic.

Furthermore, the compat tests was simplified because now there are 2
separate fix attepts deployed at multiple redpanda versions, so to keep
the test sane, we just ignore the diff until a version (24.2.1) where we
know that this version will contain the fixes and all following version
we might compare against also contains the fixes.

Lastly, the change to the DescribeTopicConfigs ducktape test was also
reverted to correctly report the cleanup.policy as DYNAMIC_TOPIC_CONFIG
when this was explicitly specified on the topic creation request.
@pgellert pgellert force-pushed the configs/fix-cleanup-policy-source branch from bac0867 to 6c7d2ca Compare May 14, 2024 18:37
@pgellert
Copy link
Contributor Author

/ci-repeat 1

Copy link
Contributor

@Lazin Lazin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You also need to update the place where we generate topic_manifest in the ntp_config.
Also, when we update the cleanup policy flag in the topic config we have to restart archiver (in the partition::update_configuration the flag is set in this case). But with this change we also need to do the same when the configuration parameter changes.

static retention
get_retention_policy(const storage::ntp_config::default_overrides& prop) {
auto flags = prop.cleanup_policy_bitflags;
static retention get_retention_policy(const storage::ntp_config& prop) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why this change is needed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants