[DISCUSSION] Performance in large catalogs #2

Open
riker09 opened this issue Dec 2, 2014 · 9 comments

Comments

@riker09

riker09 commented Dec 2, 2014

First things first: Very nice module. Cleanly written, no core-rewrites, very useful function. But I'm wondering how well this module will perform on large catalogs.

As a demo I installed it on a development site with 3,500 SKUs, and it looks like the prefetched JSON already exceeds the local storage limit. The generated JSON data cannot be fully inspected with Chrome Developer Tools and is cut off:

bla, bla json code [...]html","price":"610.0000","tax_class_id":"1",

When saving it to a text file it weighs in at 750 kB. I'm not sure what the size of my localStorage cache currently is. Since I never changed the setting I'm guessing it should be the default of 5 MB.
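
To put a number on this, here is a rough sketch for measuring how much space a product payload takes. `estimateStorageCost` is a hypothetical helper, not part of the module, and the UTF-16 figure assumes the browser stores localStorage strings at two bytes per code unit, which is common but not guaranteed.

```javascript
// Hypothetical helper, not part of the module: estimates how large a
// product list is once serialized to JSON.
function estimateStorageCost(products) {
  const json = JSON.stringify(products);
  return {
    // Size on the wire or in a text file (UTF-8).
    utf8Bytes: new TextEncoder().encode(json).length,
    // Assumed in-browser cost: two bytes per UTF-16 code unit.
    utf16Bytes: json.length * 2,
  };
}

const cost = estimateStorageCost([
  { name: "BLANCOAXIA II 6 S", price: "334.0000" },
]);
console.log(cost.utf8Bytes, cost.utf16Bytes);
```

If that assumption holds, a mostly ASCII 750 kB file would take roughly twice that in local storage, which is still below a 5 MB quota but not by a comfortable margin once the catalog grows.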

Now imagine a catalog with hundreds of thousands of products. There should be at least a warning for the store owner that the use of this module may have a negative impact on the performance (Magento backend, GitHub Wiki page, Readme.md file).

One way would be to reduce the amount of data stored as JSON. Consider this

{
  "status":"1",
  "entity_id":"1",
  "type_id":"simple",
  "attribute_set_id":"31",
  "name":"BLANCOAXIA II 6 S",
  "url_path":"blancoaxia-ii-6-s.html",
  "price":"334.0000",
  "tax_class_id":"1",
  "final_price":"334.0000",
  "minimal_price":"334.0000",
  "min_price":"334.0000",
  "max_price":"334.0000",
  "tier_price":null,
  "cat_index_position":"50016"
}

versus this

{
  "name":"BLANCOAXIA II 6 S",
  "url_path":"blancoaxia-ii-6-s.html",
  "price":"334.0000",
  "final_price":"334.0000"
}
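
As a rough illustration of the saving, the reduction can be sketched as a whitelist filter. This is only a sketch in JavaScript; in the module itself the filtering would presumably happen server-side before the JSON is generated. `KEPT_FIELDS` and `pickFields` are hypothetical names.

```javascript
// Fields kept in the reduced record; adjust as needed.
const KEPT_FIELDS = ["name", "url_path", "price", "final_price"];

// Hypothetical helper: returns a copy of `product` containing only `fields`.
function pickFields(product, fields = KEPT_FIELDS) {
  const reduced = {};
  for (const field of fields) {
    if (field in product) reduced[field] = product[field];
  }
  return reduced;
}

const full = {
  status: "1",
  entity_id: "1",
  type_id: "simple",
  attribute_set_id: "31",
  name: "BLANCOAXIA II 6 S",
  url_path: "blancoaxia-ii-6-s.html",
  price: "334.0000",
  tax_class_id: "1",
  final_price: "334.0000",
  minimal_price: "334.0000",
  min_price: "334.0000",
  max_price: "334.0000",
  tier_price: null,
  cat_index_position: "50016",
};
const slim = pickFields(full);

// Compare the serialized length of both variants.
console.log(JSON.stringify(full).length, JSON.stringify(slim).length);
```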
@jreinke
Owner

jreinke commented Dec 2, 2014

Thanks for your feedback.

Indeed, this part of the code could be optimized to keep only the relevant fields. You can already do this by observing the event bubble_autocomplete_product_collection_init, which lets you customize the collection.
Also, you can disable local storage from the Magento Admin if your catalog is too large.

@riker09
Author

riker09 commented Dec 3, 2014

I don't think disabling local storage results in large benefits. After all, the local storage acts as a cache and prevents the product JSON data from being pulled from the server on each request! I cannot think of a reason not to have this option enabled.

@Flyingmana

I think it would make sense to use only the searchable attributes plus some default ones. That alone would already reduce the amount of data considerably, and others could add additional attributes through this or another event.

@Chris25602

Glad this thread is here...my store has 100k+ products. Thanks for watching out.

@winkelsdorf

@Flyingmana 👍 Agreed, probably best to initially fetch only

name, thumbnail, url_path, type_id, min_price, final_price, price

At least that's what is currently used by js.phtml.

I've been testing this, but for some reason I have been unable to filter the getCollection() call.

@winkelsdorf

Better idea: Probably best to split this into two JSON queries: one for, e.g., the 100 newest or best-selling products, and one dynamic query that sends the search term via Ajax to the backend and returns a limited live-result JSON (5 hits, 10 hits, as defined in the backend).

The queries can also be cached if wanted. Drawback: either the cache might grow depending on the number of search terms used (= JSONs returned), or the live DB queries might result in heavy DB server load until the DB is cached or flat product category is used.

@Chris25602 Either way, this sounds better than delivering JSON for 100k+ products on every first visit to the page or whenever the cache TTL is reached.

What do the others think? I believe that with a limit of, e.g., 5 products, the queries and the returned JSON are very small and fast.
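
To make the idea of a limited live result concrete, here is a sketch of what the dynamic query could return. It is written in JavaScript purely for illustration; the real lookup would run in the backend against the database. `searchProducts` is a hypothetical name, and matching on a substring of the product name is an assumption.

```javascript
// Hypothetical stand-in for the backend query: returns at most `limit`
// products whose name contains the search term (case-insensitive).
function searchProducts(catalog, term, limit = 5) {
  const needle = term.trim().toLowerCase();
  if (needle === "" || limit <= 0) return [];
  const hits = [];
  for (const product of catalog) {
    if (product.name.toLowerCase().includes(needle)) {
      hits.push(product);
      if (hits.length >= limit) break; // stop early, similar to a SQL LIMIT
    }
  }
  return hits;
}
```

The response size then depends on the limit and not on the catalog size, which is the point of the split.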

Edit: What I mean can be seen here:
https://twitter.github.io/typeahead.js/examples/#remote

Edit2: Together with the remote debounced ajax request already implemented in Bloodhound this should lower the requests and db queries a lot (for difference between debounce/throttle see example at http://benalman.com/code/projects/jquery-throttle-debounce/examples/debounce/).
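
For anyone unfamiliar with debouncing, here is a minimal sketch of the technique. This is not Bloodhound's implementation, just the general idea: the wrapped function only runs once the calls have paused for `waitMs`.

```javascript
// Minimal debounce sketch: `fn` runs only after `waitMs` have passed
// without another call. Each new call cancels the pending one.
function debounce(fn, waitMs = 300) {
  let timer = null;
  return function debounced(...args) {
    if (timer !== null) clearTimeout(timer);
    timer = setTimeout(() => {
      timer = null;
      fn.apply(this, args);
    }, waitMs);
  };
}

// Usage: three rapid keystrokes lead to a single request for "bla".
const requested = [];
const requestSuggestions = debounce((term) => requested.push(term), 300);
requestSuggestions("b");
requestSuggestions("bl");
requestSuggestions("bla");
```

With a throttle, by contrast, requests would still go out at a fixed rate while the user is typing.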

@winkelsdorf

I implemented the suggestions I made in winkelsdorf@f6b4df8

Feel free to test and report your experiences. As of now, the Magento caching is no longer used at all. Twitter's Typeahead & Bloodhound already provide debounce/throttling and a rateLimitWait setting to delay Ajax requests (default 300ms).

Integrating proper caching would require a lot more work than before: cache the prefetched product list, invalidate it on product changes, probably cache the smaller Ajax requests, and invalidate all of them on demand. I don't see how this can be easily implemented with what Magento provides.
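
The moving parts can be sketched independently of Magento. The following is a hypothetical in-memory cache in JavaScript, not something the module or Magento provides: responses are keyed by search term, expire after a TTL, and are all dropped when a product changes.

```javascript
// Hypothetical in-memory cache for search responses, keyed by search term.
// Entries expire after `ttlMs`; `invalidateAll` models a product change.
class SearchResultCache {
  constructor(ttlMs, now = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now; // injectable clock, which makes expiry easy to test
    this.entries = new Map();
  }

  // Returns the cached results, or undefined if missing or expired.
  get(term) {
    const entry = this.entries.get(term);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(term);
      return undefined;
    }
    return entry.results;
  }

  set(term, results) {
    this.entries.set(term, { results, expiresAt: this.now() + this.ttlMs });
  }

  // Call this whenever a product is saved or deleted.
  invalidateAll() {
    this.entries.clear();
  }
}
```

The hard part in practice is not the cache itself but wiring the invalidation to every place where products change.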

@Chris25602 Curious how this extension would perform on your store now! Since the searches always fetch a limited number of rows (a prefetch limit, default 100, and an Ajax live-search limit, default 5), running this on a store with 10k or even 100k+ products should no longer be an issue.

Only the DB load with many concurrent users is something that probably needs observation and a reimplementation of a caching mechanism (Magento, Varnish).

Cheers,
Frederik

@winkelsdorf

@riker09 Initial proposal to include a reduced attribute set finished: winkelsdorf@ce312f3

@Chris25602

@winkelsdorf Hello, I haven't had a chance to jump back in here for a while, but testing this is next up whenever I have a slow day, and I will be sure to come back and let you know how it goes. /tldr --TODO


5 participants