Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

flow_job does not prevent multiple execution #176

Open
martinhill opened this issue Mar 18, 2017 · 2 comments
Open

flow_job does not prevent multiple execution #176

martinhill opened this issue Mar 18, 2017 · 2 comments

Comments

@martinhill
Copy link

In version 0.10.x, flow_job could only execute if task.status == STATUS.SCHEDULED. In 0.12, it will execute activation.restart(), which is allowed if task.status == STATUS.STARTED.

As a result of this change, first execution changes task status to STARTED and execute task code. During job execution, task may be called again, but when activation.done() and .activate_next() are called, TransitionNotAllowed exception is raised due to all_leading_cancelled condition. Transaction is rolled back and leaves task.status == STATUS.STARTED

Even though viewflow schedules only one job with celery, mis-configured transport can cause jobs to be repeated. For instance, when using Redis broker with job ETA time longer than visibility_timeout duration.

@martinhill
Copy link
Author

Looking at the code changes again from #154, I think I understand the reason the restart transition was added. However, there is a problem with allowing the task to retry without a state transition from STARTED back to SCHEDULED. As currently implemented, flow_job cannot distinguish between a task being retried and another concurrent execution.

Just to suggest some ideas, changing the status to SCHEDULED could be done by handling the celery.exceptions.Retry exception in flow_job, but this wouldn't work when executed eagerly, and introduce celery-specific code which might be a problem. Another way might be to implement a custom Task subclass and implement Task.on_retry to update the task status.

A quick fix might be to simply remove STATUS.STARTED from restart() source list, but I really don't know what use case it is required to support.

@willseward
Copy link

I certainly see the problem here. I think implementing a Task.on_retry is the cleanest solution.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants