Crash of the replica-recoverer when adding rules #6597

Closed
cserf opened this issue Mar 25, 2024 · 0 comments · Fixed by #6602

cserf commented Mar 25, 2024

Description

The replica-recoverer sometimes crashes when it tries to add a rule that already exists:

{"process": {"pid": 1, "name": "suspicious-replica-recoverer", "thread": {"name": "Thread-2", "id": 139697899333184}}, "log": {"level": "CRITICAL", "logger": "root"}, "message": "[1/2]: Exception\nA duplicate rule for this account, did, rse_expression, copies already exists.\nDetails: (cx_Oracle.IntegrityError) ORA-00001: unique constraint (ATLAS_RUCIO.RULES_SCOPE_ACC_NAME_CO_RSE_UQ) violated\n  File \"/usr/local/lib/python3.9/site-packages/rucio/daemons/common.py\", line 216, in _generator\n    result = run_once_fnc(heartbeat_handler=heartbeat_handler, activity=activity)\n  File \"/usr/local/lib/python3.9/site-packages/rucio/daemons/replicarecoverer/suspicious_replica_recoverer.py\", line 364, in run_once\n    add_rule(dids=dids_nattempts_1, account=InternalAccount('root', vo=vo), copies=nattempts, rse_expression='type=SCRATCHDISK', grouping=None, weight=None, lifetime=5 * 24 * 3600, locked=False, subscription_id=None)\n  File \"/usr/local/lib/python3.9/site-packages/rucio/db/sqla/session.py\", line 454, in new_funct\n    result = function(*args, session=session, **kwargs)\n  File \"/usr/local/lib/python3.9/site-packages/rucio/core/rule.py\", line 366, in add_rule\n    raise DuplicateRule(error.args[0]) from error\n", "error": {"type": "DuplicateRule", "message": "A duplicate rule for this account, did, rse_expression, copies already exists.\nDetails: (cx_Oracle.IntegrityError) ORA-00001: unique constraint (ATLAS_RUCIO.RULES_SCOPE_ACC_NAME_CO_RSE_UQ) violated", "stack_trace": "  File \"/usr/local/lib/python3.9/site-packages/rucio/daemons/common.py\", line 216, in _generator\n    result = run_once_fnc(heartbeat_handler=heartbeat_handler, activity=activity)\n  File \"/usr/local/lib/python3.9/site-packages/rucio/daemons/replicarecoverer/suspicious_replica_recoverer.py\", line 364, in run_once\n    add_rule(dids=dids_nattempts_1, account=InternalAccount('root', vo=vo), copies=nattempts, rse_expression='type=SCRATCHDISK', grouping=None, weight=None, lifetime=5 * 24 * 3600, locked=False, subscription_id=None)\n  File \"/usr/local/lib/python3.9/site-packages/rucio/db/sqla/session.py\", line 454, in new_funct\n    result = function(*args, session=session, **kwargs)\n  File \"/usr/local/lib/python3.9/site-packages/rucio/core/rule.py\", line 366, in add_rule\n    raise DuplicateRule(error.args[0]) from error\n"}, "@timestamp": "2024-03-25T08:11:08.355Z"}

The add_rule call needs to be protected against the DuplicateRule exception here: https://github.com/rucio/rucio/blob/master/lib/rucio/daemons/replicarecoverer/suspicious_replica_recoverer.py#L363
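
One possible shape for such a protection is sketched below. It is only an illustration, not the fix that was merged: `DuplicateRule` and `add_rule` here are local stand-ins for the Rucio exception and function seen in the trace (so the snippet runs without Rucio or a database), the stand-in takes only a subset of the real arguments, and `add_rules_skipping_duplicates` is a hypothetical helper name. The sketch also assumes it is acceptable to call `add_rule` once per DID, so that a single duplicate does not prevent rules from being created for the remaining DIDs.

```python
import logging


class DuplicateRule(Exception):
    """Local stand-in for the DuplicateRule exception raised in the trace."""


# Stand-in for the rules table: keys mimic the unique constraint.
_existing_rules = set()


def add_rule(dids, copies, rse_expression):
    """Local stand-in for add_rule: raises DuplicateRule on a repeated key."""
    for did in dids:
        key = (did["scope"], did["name"], copies, rse_expression)
        if key in _existing_rules:
            raise DuplicateRule("A duplicate rule already exists.")
    for did in dids:
        _existing_rules.add((did["scope"], did["name"], copies, rse_expression))


def add_rules_skipping_duplicates(dids, copies, rse_expression, logger=None):
    """Hypothetical helper: add one rule per DID, logging and skipping duplicates."""
    logger = logger or logging.getLogger(__name__)
    created, skipped = [], []
    for did in dids:
        try:
            add_rule(dids=[did], copies=copies, rse_expression=rse_expression)
            created.append(did)
        except DuplicateRule:
            # The rule is already in place, so the daemon can carry on.
            logger.warning("Rule already exists for %s:%s, skipping",
                           did["scope"], did["name"])
            skipped.append(did)
    return created, skipped
```

With this pattern a duplicate is logged as a warning instead of propagating up to the daemon loop and stopping the thread.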

Steps to reproduce

Rucio Version

No response

Additional Information

No response

@dchristidis dchristidis added this to the 34.1.0 milestone Mar 29, 2024