A performance exploration of the stock lots route #7295

Open
jniles opened this issue Oct 9, 2023 · 0 comments

jniles commented Oct 9, 2023

This issue is similar to other stock performance issues: #6027, #6922, #6659.

I ran some tests on my machine to figure out how to improve the performance of the stock registries, and I'd like to share my findings in this issue. All of these tests were run on a machine with the following specifications:

| Processor | RAM | Disk |
| --- | --- | --- |
| Intel(R) Core(TM) i5-8500 CPU @ 3.00GHz | 7.63 GiB | 476.9 GB NVMe (nvme0n1) |

First, I accessed the stock lots registry using the latest Vanga database. This database has 533,905 records in the stock_movement table. After increasing the limit to 10,000, the page took 1.78 minutes (106,598.164 ms) to render.
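
For reference, a record count like that is just a straight count of the table (assuming, as I believe is the case, that there is no soft-delete flag to filter on):

-- Assumption: stock_movement has no soft-delete flag, so the record
-- count is a plain COUNT over the whole table.
SELECT COUNT(*) AS movement_count FROM stock_movement;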

[Screenshot (2023-10-09): the stock lots registry taking 1.78 minutes to load]

That is nearly two minutes of waiting for the page! Inspecting the server logs showed that the majority of the time was spent serving this HTTP request:

  http ::ffff:100.118.76.15 - GET /stock/lots/depots?client_timestamp=2023-10-09T21:01:14.290Z&displayValues=&includeEmptyLot=0&limit=10000 HTTP/1.1 200 764174 - 106598.164 ms +2m

That request, in turn, runs this SQL query:

SELECT BUID(l.uuid) AS uuid, l.label, l.description AS lot_description,
  SUM(m.quantity * IF(m.is_exit = 1, -1, 1)) AS quantity,
  SUM(m.quantity) AS mvt_quantity,
  d.text AS depot_text, l.unit_cost, l.expiration_date,
  l.serial_number, l.reference_number, l.package_size,
  d.min_months_security_stock, d.default_purchase_interval,
  DATEDIFF(l.expiration_date, CURRENT_DATE()) AS lifetime,
  BUID(l.inventory_uuid) AS inventory_uuid,
  i.code, i.text, BUID(m.depot_uuid) AS depot_uuid,
  i.is_asset, i.manufacturer_brand, i.manufacturer_model,
  m.date AS entry_date, i.purchase_interval, i.delay, i.is_count_per_container,
  IF(ISNULL(iu.token), iu.text, CONCAT("INVENTORY.UNITS.", iu.token, ".TEXT")) AS unit_type,
  ig.name AS group_name, ig.tracking_expiration, ig.tracking_consumption,
  dm.text AS documentReference, t.name AS tag_name, t.color, sv.wac,
  CONCAT('LT', LEFT(HEX(l.uuid), 8)) AS barcode
FROM stock_movement m
  JOIN lot l ON l.uuid = m.lot_uuid
  JOIN inventory i ON i.uuid = l.inventory_uuid
  JOIN inventory_unit iu ON iu.id = i.unit_id
  JOIN inventory_group ig ON ig.uuid = i.group_uuid
  JOIN depot d ON d.uuid = m.depot_uuid
  JOIN stock_value sv ON sv.inventory_uuid = i.uuid
  LEFT JOIN document_map dm ON dm.uuid = m.document_uuid
  LEFT JOIN lot_tag lt ON lt.lot_uuid = l.uuid
  LEFT JOIN tags t ON t.uuid = lt.tag_uuid
WHERE 1
GROUP BY l.uuid, m.depot_uuid
HAVING quantity > 0
ORDER BY i.code, l.label
LIMIT 10000;

Next, I entered the MySQL command line and executed this query on its own. It took 10.759 seconds. That is only part of the issue, though: the route also fires off a cascading series of GetAMC() calls and lot tag lookups. We could probably get an easy win by adding a "skipTags" option to the query when tags aren't in the search parameters.
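
For reference, MySQL 8.0.18+ can show where that time goes: prefixing the query with EXPLAIN ANALYZE executes it and reports per-operator row counts and timings. A sketch against a reduced form of the query above:

-- Diagnostic sketch (MySQL 8.0.18+): EXPLAIN ANALYZE executes the query
-- and annotates each plan operator with actual row counts and timings,
-- making the expensive joins and the GROUP BY temporary table visible.
EXPLAIN ANALYZE
SELECT l.uuid, m.depot_uuid,
       SUM(m.quantity * IF(m.is_exit = 1, -1, 1)) AS quantity
FROM stock_movement m
  JOIN lot l ON l.uuid = m.lot_uuid
GROUP BY l.uuid, m.depot_uuid
HAVING quantity > 0;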

I was curious to see how quickly I could get the "bare minimum" information from the stock_movement table. So, I composed the following SQL query:

SELECT 
    sm.lot_uuid,
    sm.depot_uuid,
    SUM(CASE WHEN sm.is_exit = 0 THEN sm.quantity ELSE -sm.quantity END) AS quantity_in_stock
FROM 
    stock_movement sm
GROUP BY 
    sm.depot_uuid, sm.lot_uuid
HAVING 
    quantity_in_stock > 0;

This query returned the correct information (422 rows, I spot-checked the results) in 0.405 seconds, which demonstrates that the problem isn't really the volume of stock transactions, but inefficient JOINs.
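
As an aside, the aggregation itself might go even faster with a covering index; this is an untested assumption (no such index exists in the schema today, as far as I know):

-- Untested sketch: a composite index covering the GROUP BY keys and the
-- summed columns may let MySQL compute the balances from the index alone,
-- without touching the table rows. Verify with EXPLAIN before adopting.
ALTER TABLE stock_movement
  ADD INDEX idx_movement_lot_depot (lot_uuid, depot_uuid, is_exit, quantity);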

So... I rewrote the query, attempting to replicate the exact same behavior while making it as fast as possible. To do this, I used MySQL 8's WITH statement (also known as a Common Table Expression, or CTE). This allowed me to move the heavy SUM/GROUP BY calculation into its own named subquery. Here is the final query:

WITH LotBalances AS (
    SELECT 
        sm.lot_uuid,
        sm.depot_uuid,
        SUM(CASE WHEN sm.is_exit = 0 THEN sm.quantity ELSE -sm.quantity END) AS quantity_in_stock,
        MIN(sm.date) AS entry_date
    FROM 
        stock_movement sm
    GROUP BY 
        sm.lot_uuid, sm.depot_uuid
    HAVING 
        quantity_in_stock > 0
)

SELECT BUID(l.uuid) AS uuid,
  l.label, l.description AS lot_description,
  LB.quantity_in_stock AS quantity,
  d.text AS depot_text, l.unit_cost, l.expiration_date,
  l.serial_number, l.reference_number, l.package_size,
  d.min_months_security_stock, d.default_purchase_interval,
  DATEDIFF(l.expiration_date, CURRENT_DATE()) AS lifetime,
  BUID(l.inventory_uuid) AS inventory_uuid,
  i.code, i.text, BUID(LB.depot_uuid) AS depot_uuid,
  LB.entry_date, i.is_asset, i.manufacturer_brand, i.manufacturer_model,
  i.purchase_interval, i.delay, i.is_count_per_container,
  IF(ISNULL(iu.token), iu.text, CONCAT("INVENTORY.UNITS.", iu.token, ".TEXT")) AS unit_type,
  ig.name AS group_name, ig.tracking_expiration, ig.tracking_consumption,
  t.name AS tag_name, t.color, sv.wac,
  CONCAT('LT', LEFT(HEX(l.uuid), 8)) AS barcode
FROM LotBalances LB
  JOIN lot l ON l.uuid = LB.lot_uuid
  JOIN inventory i ON i.uuid = l.inventory_uuid
  JOIN inventory_unit iu ON iu.id = i.unit_id
  JOIN inventory_group ig ON ig.uuid = i.group_uuid
  JOIN depot d ON d.uuid = LB.depot_uuid
  JOIN stock_value sv ON sv.inventory_uuid = i.uuid
  LEFT JOIN lot_tag lt ON lt.lot_uuid = l.uuid
  LEFT JOIN tags t ON t.uuid = lt.tag_uuid
ORDER BY i.code, l.label
LIMIT 10000;

This is almost identical, but not quite. There are two changes I made:

  1. I removed the document_map JOIN, because this query doesn't deal with document mappings at all! I'm not sure why it was included, except that the same query is probably reused by a totally different route for different functionality.
  2. I removed the mvt_quantity column, since it isn't meaningful anyway (we are aggregating over movements, and I'm not sure what that number is supposed to represent). I also changed the logic for entry_date to ensure that it is the earliest date a lot was used.

With these changes, I ran the query again and it produced the same 422 rows in 0.438 seconds. Going from 10.759 seconds to 0.438 seconds is roughly a 96% reduction in query time (about 24x faster). A consistency check between the two queries is sketched below.
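
To check that equivalence beyond spot-checking, the two aggregations can be anti-joined against one another; this is only a sketch, reusing the tables and columns above:

WITH OriginalBalances AS (
  -- the aggregation exactly as the current route computes it
  SELECT m.lot_uuid, m.depot_uuid,
         SUM(m.quantity * IF(m.is_exit = 1, -1, 1)) AS quantity
  FROM stock_movement m
  GROUP BY m.lot_uuid, m.depot_uuid
  HAVING quantity > 0
), NewBalances AS (
  -- the rewritten aggregation from the CTE above
  SELECT sm.lot_uuid, sm.depot_uuid,
         SUM(CASE WHEN sm.is_exit = 0 THEN sm.quantity ELSE -sm.quantity END) AS quantity_in_stock
  FROM stock_movement sm
  GROUP BY sm.lot_uuid, sm.depot_uuid
  HAVING quantity_in_stock > 0
)
-- any row returned is a (lot, depot) balance where the two queries disagree
SELECT o.lot_uuid, o.depot_uuid, o.quantity, n.quantity_in_stock
FROM OriginalBalances o
  LEFT JOIN NewBalances n
    ON n.lot_uuid = o.lot_uuid AND n.depot_uuid = o.depot_uuid
WHERE n.lot_uuid IS NULL OR n.quantity_in_stock <> o.quantity;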

My analysis ignores the WHERE clause functionality, but I imagine it would not be too difficult to implement filtering on top of this style of query; a sketch follows.
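
For example, a depot filter could be pushed down into the CTE so that the GROUP BY only scans matching movements. A sketch, with ? standing in for a bound parameter:

WITH LotBalances AS (
  SELECT sm.lot_uuid, sm.depot_uuid,
         SUM(CASE WHEN sm.is_exit = 0 THEN sm.quantity ELSE -sm.quantity END) AS quantity_in_stock,
         MIN(sm.date) AS entry_date
  FROM stock_movement sm
  WHERE sm.depot_uuid = ?  -- depot filter applied before aggregation
  GROUP BY sm.lot_uuid, sm.depot_uuid
  HAVING quantity_in_stock > 0
)
SELECT BUID(l.uuid) AS uuid, l.label, LB.quantity_in_stock AS quantity
FROM LotBalances LB
  JOIN lot l ON l.uuid = LB.lot_uuid
ORDER BY l.label;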


At this point, our stock routes have become too complex for me to reason about. I think we need to get back to keeping things simple, and then work on improving performance. To do that, we would need to document the consumers of each API route so we don't break them. I'm not sure which routes the BHIMA mobile app uses, nor which routes BAO uses for DHIS2. If we could document these, along with the routes BHIMA uses internally, we could start moving toward a better-designed stock route system.
