Fix deployment of multiple Batch jobs #2543

Merged

aaronegolden (Contributor) commented May 1, 2024

Three issues:

  1. The submit_script_output_path did not depend on the
    ID of the job to be submitted, so one job's submit script could
    overwrite a different job's submit script.

  2. Similarly, the job spec file name did not depend on the submit ID
    of the job, so multiple jobs could overwrite each other's spec files
    if the job ID or file name was not set explicitly in the blueprint.

  3. The submit_job resource did not explicitly depend on the submit_script
    resource and as a result would sometimes execute before the script
    was written (causing submission to fail with a file-not-found error).
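
The first two issues come down to deriving an output path without the job ID. The Toolkit module itself is not written in Python; the following is only a minimal sketch of the collision, with hypothetical function names, file names, and job IDs that do not come from the Toolkit.

```python
from pathlib import Path


def shared_script_path(output_dir: str) -> Path:
    """Path that ignores the job ID (the problem described in issue 1)."""
    # Hypothetical file name, for illustration only.
    return Path(output_dir) / "submit-job.sh"


def per_job_script_path(output_dir: str, job_id: str) -> Path:
    """Path that includes the job ID, so each job gets its own file."""
    # Hypothetical naming scheme, for illustration only.
    return Path(output_dir) / f"submit-{job_id}.sh"


job_ids = ["job-a", "job-b"]  # hypothetical job IDs

# Every job maps to the same file, so later writes overwrite earlier ones.
old_paths = {shared_script_path("out") for _ in job_ids}

# Each job maps to a distinct file.
new_paths = {per_job_script_path("out", job_id) for job_id in job_ids}

print(len(old_paths))  # 1
print(len(new_paths))  # 2
```

The same reasoning applies to the job spec file in issue 2. Issue 3 is an ordering problem rather than a naming problem, so it is not covered by this sketch.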

Submission Checklist

Please take the following actions before submitting this pull request.

  • Fork your PR branch from the Toolkit "develop" branch (not main)
  • Test all changes with pre-commit in a local branch
  • Confirm that "make tests" passes all tests
  • Add or modify unit tests to cover code changes
  • Ensure that unit test coverage remains above 80%
  • Update all applicable documentation
  • Follow Cloud HPC Toolkit Contribution guidelines

nick-stroud self-assigned this May 2, 2024

nick-stroud added the release-module-improvements label and removed the release-bugfix label May 7, 2024

nick-stroud (Collaborator) commented

/gcbrun

nick-stroud merged commit ca800ed into GoogleCloudPlatform:develop May 14, 2024
10 of 48 checks passed