
feat(worker): Parallel execution #58

Open
checkphi opened this issue Apr 15, 2021 · 13 comments · Fixed by #235

@checkphi

Currently it seems like all scheduled jobs are executed sequentially.
Would it be possible to leverage the messenger system somehow so that each job (or a defined number of jobs) can run in parallel?

@Guikingone
Owner

Hi @checkphi 👋🏻

Yes, all tasks are executed sequentially. To be honest, Messenger cannot natively consume messages in parallel, so IMHO there's no benefit in building something on top of Messenger.

I have plans to work on something that can execute tasks in parallel thanks to Fibers in PHP 8.1, but as it cannot be implemented in 8.0, that forces me to enable it only for >=8.1 🙁

What do you have in mind to bring parallel execution?

@checkphi
Author

Maybe I misunderstood the documentation and the concept of Messenger, but isn't it possible to run multiple messenger:consume workers at the same time? Then whichever worker is "free" picks up the next job.

In other words, if we start 3 additional "scheduler workers" then we could process 3 scheduled jobs in parallel?

@Guikingone
Owner

Thanks to Supervisord, you can trigger multiple workers (the same concept applies to this bundle); each worker will consume the messages available at that moment.

> In other words, if we start 3 additional "scheduler workers" then we could process 3 scheduled jobs in parallel?

In theory, yes, but keep in mind that a lock is applied to every task before executing it, so it may depend on the transport being used (Doctrine uses a resource lock; the same applies to Redis).
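For reference, a minimal Supervisord sketch of the multi-worker setup described above (the program name, console path, and process count are placeholder assumptions, not taken from this thread):

```ini
; Hypothetical Supervisord entry: 3 scheduler workers consuming in parallel.
[program:scheduler_worker]
command=php /path/to/app/bin/console scheduler:consume --wait
process_name=%(program_name)s_%(process_num)02d
numprocs=3
autostart=true
autorestart=true
```

Each of the 3 processes behaves like a manually started scheduler:consume --wait, so the locking caveats above still apply.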

@checkphi
Author

So you're saying that I should simply start multiple scheduler:consume --wait workers, which would then almost result in the desired behaviour?

@Guikingone
Owner

Yes, for now I sadly don't have another solution 🙁

If you have one, feel free to contribute and we'll see if it can be integrated.

@checkphi
Author

I'm thinking about multiple solutions which I would be happy to discuss.
Can I contact you somehow? I think the discussions would exceed the scope of this issue.

@Guikingone
Owner

Hi @checkphi 👋🏻

Consider using discussions: https://github.com/Guikingone/SchedulerBundle/discussions

This way, we can track the discussion and open related issues if required 🙂

@checkphi
Author

#65

@Guikingone Guikingone linked a pull request May 6, 2021 that will close this issue
@Guikingone Guikingone self-assigned this May 7, 2021
@Guikingone Guikingone changed the title Parallel execution feat(worker): Parallel execution May 12, 2021
@Guikingone Guikingone removed a link to a pull request Jun 17, 2021
@Guikingone Guikingone linked a pull request Jun 17, 2021 that will close this issue
@grimgit

grimgit commented Mar 21, 2022

Hi @Guikingone ,
I'm using Doctrine for the transport and the store, so I would like to consume multiple tasks by calling multiple scheduler:consume processes at the same time, but the first process locks all the tasks, so the next scheduler:consume process can't consume anything.
How can I configure the scheduler to lock only the consumed task?

@Guikingone
Owner

Hi @grimgit 👋🏻

Actually, the call to scheduler:consume locks all the tasks that can be locked: https://github.com/Guikingone/SchedulerBundle/blob/main/src/Worker/AbstractWorker.php#L200.

> How can I configure the scheduler to lock only the consumed task?

What's the idea behind this approach? If I'm right, locking the consumed tasks isn't useful, as they're "consumed" and not retrieved until the next minute 🤔

@grimgit

grimgit commented Mar 22, 2022

Suppose we have a very long task, for example a 1-hour task: all the other tasks are locked and will be delayed by 1 hour, even if I'm running multiple consumers.
Instead, if the consumer is able to lock only the running task, other consumers can lock and consume the other available tasks.

@Guikingone
Owner

> Suppose we have a very long task, for example a 1-hour task: all the other tasks are locked and will be delayed by 1 hour, even if I'm running multiple consumers.

Actually, tasks are locked "per process", depending on the store that you're using.

Here's an explanation of what happens internally:

  • Process A calls scheduler:consume; it locks all the tasks that can be locked at that moment.
  • Process B calls scheduler:consume; if tasks are available (or "due", to use the correct wording), it consumes them and stops (you can wait for tasks using the --wait option).

If process B detects that tasks are locked, it cannot unlock them and waits until they're released by A. So, by definition, we can say that B consumes tasks one by one and that A locks them one by one; I agree that we lock the tasks one by one, but all at the same time 🙁
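To make this concrete, here is a minimal sketch (in Python rather than the bundle's PHP, with hypothetical names) of the lock-everything-up-front design: worker A grabs one lock covering its whole batch, so worker B cannot consume anything until A is completely done.

```python
import threading
import time

task_lock = threading.Lock()   # hypothetical: one lock covering the whole task list
timeline = []                  # records (worker, task) in consumption order

def worker(name, tasks, hold):
    # Current design: the worker acquires the global lock once for all its tasks.
    with task_lock:
        for t in tasks:
            timeline.append((name, t))
            time.sleep(hold)   # simulate a long-running task while holding the lock

a = threading.Thread(target=worker, args=("A", ["long_task", "other_task"], 0.05))
b = threading.Thread(target=worker, args=("B", ["mail_queue"], 0.0))
a.start(); time.sleep(0.01); b.start()   # A reliably grabs the lock first
a.join(); b.join()
print(timeline)  # [('A', 'long_task'), ('A', 'other_task'), ('B', 'mail_queue')]
```

B's mail_queue only runs after A has finished its entire batch, which is exactly the delay described above.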

> Instead, if the consumer is able to lock only the running task, other consumers can lock and consume the other available tasks.

I think I get the point that you're mentioning (stop me if I'm wrong):

  • Process A calls scheduler:consume; it locks the tasks one by one if it can acquire them.
  • Process B calls scheduler:consume; if tasks are available (or "due", to use the correct wording), it consumes them one by one by locking them, then stops (you can wait for tasks using the --wait option).

Am I right?

If I am, then yes, it could be improved. Feel free to submit a PR, and we can discuss the implementation / improvements 🙂

@grimgit

grimgit commented Mar 22, 2022

That's correct: process A should lock only the first available task before starting to consume it, so that process B can acquire the lock on another task.
Suppose we have two tasks: Task 1, a time-consuming task that can run for an hour, and Task 2, a high-frequency task, such as a mail queue processor that should run every minute. With the current lock design, Task 2 can't be consumed by another process while Task 1 is running, because it is locked by the first process, the one consuming Task 1, so all emails will be delayed.

So the consumer process should:

  • While there is an available task
    • Acquire the task lock
    • Consume the task
    • Release the task lock

@Guikingone Guikingone linked a pull request Apr 8, 2022 that will close this issue
@Guikingone Guikingone reopened this Apr 12, 2022