Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: run once in shared dependencies #1655

Open
wants to merge 1 commit into
base: main
Choose a base branch
from
Open

fix: run once in shared dependencies #1655

wants to merge 1 commit into from

Conversation

pd93
Copy link
Member

@pd93 pd93 commented May 13, 2024

Fixes #852

This PR changes the hash we use when a task is marked as run:once. Previously we used the full name of the task. This is fine most of the time, but when a task in an included Taskfile is included multiple times, the name will contain a namespace that is different for each task. Using the example given in #852, the hashes produced are:

service-a:build
service-b:build
service-b:library:build #  <-- this...
service-a:library:build #  <-- is different to this

As a result library:build is run twice, even though it is marked as run: once.

In this PR, we are creating a hash that is a combination of the file location and the name of the task without a namespace. This means that the hash is unique for each task, but will be the same when a task is included in different namespaces.

/path/to/project/service-a/Taskfile.yml:build
/path/to/project/service-b/Taskfile.yml:build
/path/to/project/library/Taskfile.yml:library:build  <-- this...
/path/to/project/library/Taskfile.yml:library:build  <-- is the same as this

Note

As discussed below, we're no longer focussing on the second problem in this PR. We're open to further discussion in another issue for anyone coming across this in the future.

#852 mentions a secondary problem:

We also noticed that if we run a "task build" in service-a and then service-b that the .task/ checksum cache is not shared and so the library is built twice.

You can see this illustrated below:

❯ task build   
task: [service-a:library:build] echo "build library" # <-- first run - library builds
build library
task: [service-b:build] echo "build b" # <-- library is run:once so service-b doesn't build it again
task: [service-a:build] echo "build a"
build a
build b
❯ task build
task: Task "service-a:library:build" is up to date # <-- second run - library is up to date so doesn't build
task: Task "service-a:build" is up to date
task: Task "service-b:build" is up to date
❯ task build
task: [service-b:library:build] echo "build library" 
# ^ service-b goes first this time (because the order of deps is non-deterministic)
# but it doesn't share a hash with service-a so it builds the library again
build library
task: Task "service-b:build" is up to date
task: Task "service-a:build" is up to date

We need to determine the correct behaviour:

  1. Working as intended - Tasks should always have independent caches as even shared tasks can have different outputs depending on the context. The fact that the library is built again sometimes is a side effect of the non-deterministic order of the tasks.
  2. Always share a cache between shared tasks - This seems dangerous as variables that are passed into a task can change its output. This might lead to a task not running when it should.
  3. Share a cache when run: once is specified - This still has the same problem as above. However, you could argue that the user has made a mistake by adding run: once when the result of a task could be different depending on the context.

If we choose 2 or 3, then what should the task hash be?

  • <location>:service-a:build (probably not)
  • <location>:service-b:build (also probably not)
  • Some combination of the two
  • Something else entirely?

I would appreciate any thoughts/feedback on these options - or other options entirely.

@vmaerten
Copy link
Collaborator

In my option, the number 2 is really dangerous. I would prefer a task running twice even it's not needed, than a task not running at all when it should

it's a tough topic. I would go for the 1 or 3
I do not have strong opinion on this but I would go for the 1 because there is no risk that the task is not run (even it should be)

@pd93
Copy link
Member Author

pd93 commented May 20, 2024

In my option, the number 2 is really dangerous. I would prefer a task running twice even it's not needed, than a task not running at all when it should

Agreed

it's a tough topic. I would go for the 1 or 3 I do not have strong opinion on this but I would go for the 1 because there is no risk that the task is not run (even it should be)

I've given this more thought and I'm inclined to agree for now. I can see the potential upsides and downsides of option 3, but for now I think leaving this PR as-is and fixing the first issue is a safe bet. I'm still open to discussion about the best way to solve the second problem in another issue.

@pd93 pd93 marked this pull request as ready for review May 20, 2024 21:46
Copy link
Collaborator

@vmaerten vmaerten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about adding a test for this usecase ?

I manage to test it (to ensure it work as expected)

I've added it to the run testdata

Something like this :

library.yml

version: 3

tasks:
  build:
    run: once
    cmds:
      - echo 'Building library...'

service.yml

version: 3

includes:
  library:
      taskfile: library.yml

tasks:
  build:
    deps: [library:build]
    cmds:
      - echo 'Building service {{.NAME}}...'

The root taskfile:

includes:
  service-a:
    taskfile: service.yml
    vars:
      NAME: "service-a"
  service-b:
    taskfile: service.yml
    vars:
      NAME: "service-b"

tasks:
  build:
    silent: false
    cmds:
      - task: service-a:build
      - task: service-b:build

Then we need to verify that Building library... is printed only once.

I can add or you can add, or you can discard my comment. It's your call here :)

Thanks for the work :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make run:once work for dependencies shared amongst included taskfiles
2 participants