
Preventing duplicate entries in Queue. Adding only 1 job to a queue. #102

Open

epicwhale opened this issue Jul 2, 2014 · 18 comments

@epicwhale
Contributor

This is more of a query than an issue.

If I have a Products queue with a list of products that need to be updated using data from a remote source, I want to ensure that no two products get updated at the same time, as it causes some concurrency issues.

What are the recommended ways to approach this problem?

  1. Make sure my jobs are concurrency-proof (using database locks, etc.), but that is difficult to achieve with NoSQL.
  2. Ensure that the same job is not processed in parallel. (How can I achieve this?)
  3. Ensure that there are no duplicate jobs in the queue. (How can I achieve this?)

Any other solutions would be appreciated.

@danhunsaker

Option 2 - Set up a dedicated queue for sequential jobs, and assign only one worker to it. Assign all jobs that need to be done sequentially to that queue instead of the default(s). Mission accomplished.
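If it helps, here is a minimal sketch of that setup, assuming php-resque workers managed by Supervisor (the program name, script path, and queue name are all hypothetical):

```ini
; Hypothetical Supervisor entry: exactly one worker bound to the 'sequential' queue.
; With a single process draining it, jobs on this queue can never run in parallel.
[program:resque-sequential]
command=php resque.php
environment=QUEUE="sequential",COUNT="1"
numprocs=1
autorestart=true
```

Jobs that must run one at a time are then enqueued to `sequential` instead of `default`; everything else keeps using the regular worker pool.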

@epicwhale
Contributor Author

@danhunsaker I forgot to mention this earlier, but the complication here is that our use case is an e-commerce platform syncing products for multiple stores' product queues. New stores can be added automatically, and each new store should have its own product queue...

But this has to happen dynamically, as we can't reconfigure running workers, queue names, etc. in Supervisor on Linux each time we add a store.

@danhunsaker

Does each store need a unique sync queue, or can the application operate with a single sync queue and a separate work queue for each store for other operations?

@epicwhale
Contributor Author

@danhunsaker as of now, each store doesn't need a unique sync queue for anything except product syncing. Everything else can be pooled into a common 'default' queue.

I need to make sure that one store's product sync is not delayed because another store has too many items in the queue; that would create a quality-of-service issue. Hence, one queue per store for product sync.

Do share your thoughts!
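One way to get a per-store queue without touching worker configuration by hand is to derive the queue name from the store ID. A minimal sketch, assuming php-resque-style named queues (the helper and the naming scheme are hypothetical):

```php
<?php
// Hypothetical helper: derive a dedicated product-sync queue name per store,
// so one store's backlog can never delay another store's sync jobs.
function productSyncQueue(string $storeId): string
{
    // Normalize the ID so the queue name is predictable and Redis-safe.
    $safe = preg_replace('/[^a-z0-9_]/', '_', strtolower($storeId));
    return 'product_sync_' . $safe;
}

// A worker for a new store would then listen on that queue, e.g. (php-resque style):
//   QUEUE=product_sync_acme_store COUNT=1 php resque.php
echo productSyncQueue('Acme-Store'), "\n"; // product_sync_acme_store
```

The remaining wrinkle is starting a worker when a store is created, which still needs some hook into the process manager.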

@danhunsaker

My first response to that is virtualization. Each new store spins up a new VM, and anything it needs to do separately from other stores is done there. If I were implementing it, I'd have each store's web interface in its own VM, and possibly separate workers out as well. You'd still have a common worker pool in its own VM, and all the workers would connect to a single Redis instance.

However, that's often beyond the available technology, and it's probably too late in your dev cycle to set things up that way anyhow. So you'll want another approach. Ruby's Resque has a plugin that allows certain queues to be marked as sequential when they're created, and then ensures that jobs are not removed from those queues while other workers are processing jobs from them. I haven't looked at its code to see how portable it would be to how PHP Resque operates, but it's a starting point, I think.
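The core of that sequential-queue idea is a per-queue lock: a worker may only pop from the queue after acquiring a lock that any other worker for the same queue would fail to get. A rough sketch of the mechanism, with an in-memory array standing in for the Redis key a real implementation would set with SETNX:

```php
<?php
// Sketch of a per-queue lock: acquire before working a queue, skip it if
// another worker already holds the lock. The array below stands in for
// Redis; real code would use SET key value NX EX via phpredis or Predis.
class QueueLock
{
    private array $locks = []; // stand-in for Redis keys

    public function acquire(string $queue, string $workerId): bool
    {
        if (isset($this->locks[$queue])) {
            return false; // another worker is already processing this queue
        }
        $this->locks[$queue] = $workerId;
        return true;
    }

    public function release(string $queue, string $workerId): void
    {
        // Only the lock holder may release, mirroring a check-and-delete script.
        if (($this->locks[$queue] ?? null) === $workerId) {
            unset($this->locks[$queue]);
        }
    }
}

$lock = new QueueLock();
var_dump($lock->acquire('store_42', 'worker-a')); // true: lock taken
var_dump($lock->acquire('store_42', 'worker-b')); // false: queue busy
$lock->release('store_42', 'worker-a');
var_dump($lock->acquire('store_42', 'worker-b')); // true: lock is free again
```

In production the lock needs an expiry (the EX part) so a crashed worker doesn't wedge its queue forever.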

@epicwhale
Contributor Author

@danhunsaker that does seem like overkill, especially since I'm building a SaaS service and want this to scale to a few hundred and then thousands of customers.

I did see the Ruby approach to serializing jobs in a queue with locking, but it looks a tad too complicated to replicate and maintain: http://www.bignerdranch.com/blog/never-use-resque-for-serial-jobs/ (it's almost like maintaining another stand-alone project within my project). I don't have the benefit of time there.

Maybe for the products queue I should be exploring some other alternative? Do you know of any other background-job or MQ solution that supports this and has a good bundle/library for PHP/SF2?

@mrbase
Contributor

mrbase commented Jul 2, 2014

@epicwhale I'm currently looking at Gearman, which has a bundle and is under active development, though it needs SF 2.4.

Whether it meets your requirements I don't know, but it's simple, fast, and scales.

Otherwise, look at http://queues.io/ for a fine collection of queue systems.

@mrbase
Contributor

mrbase commented Jul 2, 2014

And there is a PECL extension for PHP as well: http://www.php.net/manual/en/book.gearman.php

@danhunsaker

In my experience, virtualization scales better, and is more secure to boot. But my experience varies wildly from that of many others, who haven't had any problem using such platforms as cPanel and WordPress for all of their needs. I just got tired of one site being able to consume the full resources of my servers, with no reliable way to restrict its activities without affecting anyone else. I also got tired of one hacked site infecting everything on the server. As with anything, your mileage will vary.

Resque wasn't really designed for sequential operation, and making it do so anyway will always be a hack. Even scheduled tasks are a hack, really. So PHP-Resque may not be your best fit. As to alternatives, there are many, and @mrbase has presented some useful starting points. I can't speak to Symfony interop, because I don't use Symfony. To me, SF2 is overkill. :-) I'm sure I'll encounter a project where Symfony makes sense eventually, though.

Best of luck!

@mdjaman

mdjaman commented Jul 24, 2014

@danhunsaker How do I do this:

"Option 2 - Set up a dedicated queue for sequential jobs, and assign only one worker to it. Assign all jobs that need to be done sequentially to that queue instead of the default(s)."

@epicwhale
Contributor Author

Why didn't anyone suggest the enqueueOnce(..) function in this bundle? I also noticed that it isn't documented for some reason...

cc: @danhunsaker

@danhunsaker

Possibly because that's not actually what was asked for. The question wasn't about preventing more than one instance of a job from being queued at a time; it was about preventing more than one instance from being run at a time. Very different approach, then.

Also, the fact it's undocumented doesn't help.

@epicwhale
Contributor Author

Point 3 in the question was "Ensure that there are no duplicate jobs in the queue. (How can I achieve this?)" I guess this solves that?

Yes, the enqueueOnce(..) function seems to be a hidden gem. Has it been tested / used in production?

@danhunsaker

Better to write idempotent jobs, but yeah, that would probably also work.

I honestly don't recall.
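For context, an idempotent job is one that can safely run twice (duplicate enqueue, retry after a crash) and leave the same end state. A sketch under the assumption that the remote source is the source of truth (all names here are illustrative):

```php
<?php
// Idempotency sketch: the job writes the product to its target state rather
// than applying a delta, so running it twice leaves the same result.
function syncProduct(array &$db, string $sku, array $remote): void
{
    // Overwrite with the remote snapshot instead of incrementing or patching.
    $db[$sku] = [
        'price' => $remote['price'],
        'stock' => $remote['stock'],
    ];
}

$db = [];
$remote = ['price' => 999, 'stock' => 5];
syncProduct($db, 'SKU-1', $remote);
syncProduct($db, 'SKU-1', $remote); // duplicate run: state is unchanged
echo $db['SKU-1']['stock'], "\n";   // 5
```

With jobs written this way, a stray duplicate in the queue is wasted work but not a correctness bug.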

@danhunsaker

I take that back. I don't know how much testing enqueueOnce() has gotten, but it's brand new, added within the last couple of weeks, which is why it's neither documented nor mentioned above: it didn't exist yet. I somehow completely forgot working with the contributor on that one.

Hopefully we'll see some documentation on that soon.

@darkromz

@epicwhale I came across this thread while looking for the same thing: preventing duplicate jobs from being added to the queue. I saw your comment about "enqueueOnce", but as you also mentioned, I can't seem to find anything about it. Can you give any code examples of how to use it?

@epicwhale
Contributor Author

@darkromz it's been a long time since I've worked with anything around this library... have a look at the function, maybe?

public function enqueueOnce(Job $job, $trackStatus = false)
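I haven't checked the bundle's actual implementation, so treat this as a conceptual sketch of what an enqueueOnce-style guard does rather than its real code: fingerprint the job (class plus arguments) and skip the enqueue if an identical job is already waiting. A plain array stands in for the Redis set a real implementation would use:

```php
<?php
// Conceptual sketch of dedup-on-enqueue: hash the job class and its args,
// and refuse to enqueue if that fingerprint is already pending.
class DedupQueue
{
    private array $queue = [];
    private array $seen = [];   // fingerprints of queued-but-unprocessed jobs

    public function enqueueOnce(string $jobClass, array $args): bool
    {
        $fingerprint = sha1($jobClass . '|' . json_encode($args));
        if (isset($this->seen[$fingerprint])) {
            return false; // an identical job is already waiting
        }
        $this->seen[$fingerprint] = true;
        $this->queue[] = [$jobClass, $args];
        return true;
    }
}

$q = new DedupQueue();
var_dump($q->enqueueOnce('SyncProduct', ['sku' => 'SKU-1'])); // true
var_dump($q->enqueueOnce('SyncProduct', ['sku' => 'SKU-1'])); // false: duplicate
var_dump($q->enqueueOnce('SyncProduct', ['sku' => 'SKU-2'])); // true
```

Note this only dedupes jobs still in the queue; once a job is popped, the fingerprint must be cleared or an identical job can never be enqueued again.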

@darkromz

Thanks for the quick reply, I will give it a look.
