Fix deployment of multiple Batch jobs
Three issues:

1. The submit_script_output_path did not depend on the
ID of the job to be submitted, so one job's submit script could
overwrite a different job's submit script.

2. Similarly, the job spec file name did not depend on the submit ID
of the job, and multiple jobs could overwrite each other if the job
ID or file name was not set explicitly in the blueprint.

3. The submit_job resource did not explicitly depend on the submit_script
resource and as a result would sometimes execute before the script
was written (causing submission to fail with a file-not-found error).
aaronegolden committed May 1, 2024
1 parent 07ccd70 commit e2a6cb4
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions modules/scheduler/batch-job-template/main.tf
@@ -42,7 +42,7 @@ locals {

job_id_base = coalesce(var.job_id, var.deployment_name)
submit_job_id = "${local.job_id_base}-${random_id.submit_job_suffix.hex}"
- job_filename = coalesce(var.job_filename, "cloud-batch-${local.job_id_base}.yaml")
+ job_filename = coalesce(var.job_filename, "cloud-batch-${local.submit_job_id}.yaml")
job_template_output_path = "${path.root}/${local.job_filename}"

submit_script_contents = templatefile(
@@ -54,7 +54,7 @@ locals {
submit_job_id = local.submit_job_id
}
)
- submit_script_output_path = "${path.root}/submit-job.sh"
+ submit_script_output_path = "${path.root}/submit-${local.submit_job_id}.sh"

subnetwork_name = var.subnetwork != null ? var.subnetwork.name : "default"
subnetwork_project = var.subnetwork != null ? var.subnetwork.project : var.project_id
@@ -117,7 +117,7 @@ resource "local_file" "submit_script" {
}

resource "null_resource" "submit_job" {
- depends_on = [local_file.job_template]
+ depends_on = [local_file.job_template, local_file.submit_script]
count = var.submit ? 1 : 0

# A new deployment should always submit a new job. Old finished jobs aren't persistent parts of
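
The pattern behind the three fixes can be sketched in isolation. The configuration below is a minimal illustration and not the module's actual code: the `random_id.suffix` resource name, the `job_id` default, the script contents, and the `local-exec` command are all assumptions made for the example. It shows a per-instance file name derived from a random suffix, and an explicit `depends_on` so the script is written before it is run.

```hcl
terraform {
  required_providers {
    random = { source = "hashicorp/random" }
    local  = { source = "hashicorp/local" }
    null   = { source = "hashicorp/null" }
  }
}

# Hypothetical input; the real module also falls back to the deployment name.
variable "job_id" {
  type    = string
  default = "example-job"
}

# A random suffix makes the submit ID unique per module instance.
resource "random_id" "suffix" {
  byte_length = 2
}

locals {
  submit_job_id = "${var.job_id}-${random_id.suffix.hex}"

  # Including the submit ID in the path keeps two instances from
  # writing to the same file.
  submit_script_output_path = "${path.root}/submit-${local.submit_job_id}.sh"
}

# Placeholder script body, for illustration only.
resource "local_file" "submit_script" {
  content  = "#!/bin/bash\necho \"submitting ${local.submit_job_id}\"\n"
  filename = local.submit_script_output_path
}

resource "null_resource" "submit_job" {
  # The command below refers to the path through a local value, not through
  # the local_file resource, so Terraform cannot infer the ordering.
  # The explicit dependency ensures the file exists first.
  depends_on = [local_file.submit_script]

  provisioner "local-exec" {
    command = "bash ${local.submit_script_output_path}"
  }
}
```

An alternative under the same assumptions is to reference `local_file.submit_script.filename` directly in the command, which gives Terraform an implicit dependency on the file resource; the explicit `depends_on` used in this commit achieves the same ordering without changing how the path is computed.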