Default timeout of 600 secs in cephadm shell command execution is not honoured #3294

Open
SrinivasaBharath opened this issue Jan 10, 2024 · 1 comment

Comments


SrinivasaBharath commented Jan 10, 2024

Describe the bug
The automation script failed because rados bench kept being invoked and ran indefinitely; the default 600-second timeout for cephadm shell command execution was not honoured.

To Reproduce
Steps to reproduce the behavior:
Execute the following test suite: suites/pacific/rados/tier-3_rados_ssd-tests.yaml

Environment
The CLI command is:
python3 run.py --cloud baremetal --instances-name rados_baremetal --rhbuild 7.0 --platform rhel-9 --global-conf conf/reef/baremetal/mero_4_node_4_client_conf --suite suites/pacific/rados/tier-3_rados_ssd-tests.yaml --inventory conf/inventory/rhel-9.2-server-x86_64-large.yaml --build latest --store

Additional Information

Error snippet:

2024-01-08 19:18:17,730 (cephci.test_osd_rebalance) [INFO] - cephci.ceph.rados.core_workflows.py:426 - check_ec: True
2024-01-08 19:18:17,731 (cephci.test_osd_rebalance) [INFO] - cephci.ceph.ceph.py:1559 - Running command cephadm -v shell -- sudo rados --no-log-to-stderr -b 256 -p rpool_1 bench 30 write --no-cleanup on 10.8.129.226 timeout 600
2024-01-08 19:18:49,904 (cephci.test_osd_rebalance) [INFO] - cephci.ceph.ceph.py:1593 - Command completed successfully
2024-01-08 19:19:04,907 (cephci.test_osd_rebalance) [INFO] - cephci.ceph.ceph.py:1559 - Running command ceph df -f json on 10.8.128.60 timeout 300
2024-01-08 19:19:05,527 (cephci.test_osd_rebalance) [INFO] - cephci.ceph.ceph.py:1593 - Command completed successfully
2024-01-08 19:19:05,528 (cephci.test_osd_rebalance) [INFO] - cephci.ceph.rados.core_workflows.py:451 - Objs in the rpool_1 before IOPS: 0 | Objs in the pool post IOPS: 94856 | Expected 94856 > 0
2024-01-08 19:19:05,529 (cephci.test_osd_rebalance) [INFO] - cephci.ceph.ceph.py:1559 - Running command cephadm -v shell -- ceph osd erasure-code-profile set ecprofile_ecpool_1 crush-failure-domain=osd k=4 m=2 plugin=jerasure on 10.8.129.226 timeout 600
2024-01-08 19:19:06,978 (cephci.test_osd_rebalance) [INFO] - cephci.ceph.ceph.py:1593 - Command completed successfully
2024-01-08 19:19:06,979 (cephci.test_osd_rebalance) [INFO] - cephci.ceph.ceph.py:1559 - Running command cephadm -v shell -- ceph osd erasure-
.......................
..............................................
2024-01-08 19:19:18,665 (cephci.test_osd_rebalance) [INFO] - cephci.ceph.ceph.py:1559 - Running command cephadm -v shell -- sudo rados --no-log-to-stderr -b 256 -p ecpool_1 bench 30 write --no-cleanup on 10.8.129.226 timeout 600
2024-01-08 21:37:56,465 (cephci.test_osd_rebalance) [ERROR] - cephci.run.py:805 -

The script was stuck at this point for about two hours, after which I killed it.
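
For illustration only, here is a minimal sketch of what enforcing a hard wall-clock timeout around a long-running command could look like. This is not the cephci ceph.py implementation (which runs commands over SSH), and the helper name and usage below are hypothetical; the 600-second default simply mirrors the value shown in the log above.

```python
import subprocess


def run_with_timeout(cmd, timeout=600):
    """Run a command locally and abort it once the wall-clock timeout expires.

    Illustrative sketch only: cephci executes commands on remote hosts over SSH,
    so the real fix would need an equivalent deadline on the remote channel.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    try:
        out, err = proc.communicate(timeout=timeout)
    except subprocess.TimeoutExpired:
        proc.kill()  # do not let the command run past the deadline
        proc.communicate()
        raise
    return proc.returncode, out, err


# Hypothetical usage mirroring the command that hung in the log:
# run_with_timeout(
#     ["cephadm", "-v", "shell", "--", "sudo", "rados", "--no-log-to-stderr",
#      "-b", "256", "-p", "ecpool_1", "bench", "30", "write", "--no-cleanup"],
#     timeout=600,
# )
```

The key point the sketch demonstrates is that the caller, not the remote command, must enforce the deadline and kill the process; in the log above the command was launched with "timeout 600" but ran for over two hours.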

Log: http://magna002.ceph.redhat.com/cephci-jenkins/cephci-run-JZ469N

SrinivasaBharath changed the title from "Bug" to "Automation script failed due to the continious calling of rados bench" on Feb 6, 2024
harshkumarRH changed the title from "Automation script failed due to the continious calling of rados bench" to "Default timeout of 600 secs in cephadm shell command execution is not honoured" on Mar 26, 2024

This issue has been marked as STALE due to 60 days of inactivity and will be CLOSED after another 30 days of further inactivity.
