-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test: test_topology_ops: reenable background writes in tablets mode #17589
Comments
#17025 has been fixed, so this issue can be worked on. |
@tchaikov can you please look into this issue? |
after applying following patch modified test/topology_experimental_raft/test_topology_ops.py @@ -40,9 +40,7 @@ async def test_topology_ops(request, manager: ManagerClient, tablets_enabled: bo
await wait_for_cql_and_get_hosts(manager.cql, servers, time.time() + 60)
cql = await reconnect_driver(manager) - # FIXME: we disable background writes with tablets enabled because the test fails due to #17025.
- # We need to re-enable them once this issue is fixed (by removing `if` here and above `await finish_writes()`).
- finish_writes = await start_writes(cql) if not tablets_enabled else None
+ finish_writes = await start_writes(cql)
logger.info(f"Decommissioning node {servers[0]}") await manager.decommission_node(servers[0].server_id)
@@ -72,8 +70,7 @@ async def test_topology_ops(request, manager: ManagerClient, tablets_enabled: bo
servers = servers[1:]
logger.info("Checking results of the background writes")
- if not tablets_enabled:
- await finish_writes()
+ await finish_writes()
for server in servers: await check_node_log_for_failed_mutations(manager, server) i have following test failure
when running $ ./test.py --verbose --mode release topology_experimental_raft/test_topology_ops |
i am taking a closer look to understand the root cause. |
There are 4 nodes, RF=3, and we decommission and then removenode. So node count falls below RF. This is not supported with tablets. |
We could adjust the test to add one more node in tablets mode at the beginning. |
thank you for your insights, gentlemen. will add one more node to appease tablets. |
ping, guys, please don't forget about this issue |
@kbr-scylla i have a WIP locally. will send it tomorrow. thanks for the ping. |
in e7d4e08, we reenabled the background writes in this test, but when running with tablets enabled, background writes are still disabled because of scylladb#17025, which was fixed last week. so we can enable background writes with tablets. in this change, * background writes are enabled with tablets. * increase the number of nodes by 1 so that we have enough nodes to fulfill the needs of tablets, which enforces that the number of replicas should always satisfy RF. * pass rf to `start_writes()` explicitly, so we have less magic numbers in the test, and make the data dependencies more obvious. Fixes scylladb#17589 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
in e7d4e08, we reenabled the background writes in this test, but when running with tablets enabled, background writes are still disabled because of scylladb#17025, which was fixed last week. so we can enable background writes with tablets. in this change, * background writes are enabled with tablets. * increase the number of nodes by 1 so that we have enough nodes to fulfill the needs of tablets, which enforces that the number of replicas should always satisfy RF. * pass rf to `start_writes()` explicitly, so we have less magic numbers in the test, and make the data dependencies more obvious. Fixes scylladb#17589 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
@bhalevy I believe it's a very important, basic test that we should have -- and milestone should be 6.0. If this test detects issues but we disable it, we will see the same issues later down the line in SCT runs. It's much easier to debug and fix those issues based on test.py output, and enable the test to prevent regressions, than debugging SCT longevity later. |
ok, let's restore the milestone to 6.0 then, to indicate the fix should backported there |
in e7d4e08, we reenabled the background writes in this test, but when running with tablets enabled, background writes are still disabled because of scylladb#17025, which was fixed last week. so we can enable background writes with tablets. in this change, * background writes are enabled with tablets. * increase the number of nodes by 1 so that we have enough nodes to fulfill the needs of tablets, which enforces that the number of replicas should always satisfy RF. * pass rf to `start_writes()` explicitly, so we have less magic numbers in the test, and make the data dependencies more obvious. Fixes scylladb#17589 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
in e7d4e08, we reenabled the background writes in this test, but when running with tablets enabled, background writes are still disabled because of scylladb#17025, which was fixed last week. so we can enable background writes with tablets. in this change, * background writes are enabled with tablets. * increase the number of nodes by 1 so that we have enough nodes to fulfill the needs of tablets, which enforces that the number of replicas should always satisfy RF. * pass rf to `start_writes()` explicitly, so we have less magic numbers in the test, and make the data dependencies more obvious. Fixes scylladb#17589 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
#17585 reenables background writes in
test_topology_ops
, but only when the test is run without tablets. The test fails with tablets because of #17025. Once that issue if fixed, we should reenable background writes with tablets intest_topology_ops
.Ideally, before sending and merging the fix, we should investigate if
test_topology_ops
with tablets and background writes is flaky. Before #17585, without tablets and with background writes the test was flaky in CI (1-2 failures in 1000 runs in dev mode).The text was updated successfully, but these errors were encountered: