Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add syntax to be able to reduce over multiple windows #19

Open
ilyanoskov opened this issue Mar 17, 2024 · 3 comments
Open

Add syntax to be able to reduce over multiple windows #19

ilyanoskov opened this issue Mar 17, 2024 · 3 comments
Assignees
Labels
enhancement New feature or request

Comments

@ilyanoskov
Copy link

I wonder if it's possible to have some additional syntax to make working with many windows possible?

For example, with ClickHouse SQL it's possible to work across many windows like this:

WITH sales AS (
    -- Your dataset source here
)
SELECT
    date,
    salesperson_id,
    region,
    amount,
    product_id,
    ROW_NUMBER() OVER w_region_amount AS row_number_region,
    RANK() OVER w_salesperson_amount AS rank_salesperson,
    DENSE_RANK() OVER w_product_amount AS dense_rank_product,
    SUM(amount) OVER w_region AS sum_sales_region,
    AVG(amount) OVER w_region AS avg_sales_region,
    MAX(amount) OVER w_salesperson AS max_sales_salesperson,
    MIN(amount) OVER w_product AS min_sales_product,
    LEAD(amount, 1) OVER w_salesperson_date AS lead_amount,
    LAG(amount, 1) OVER w_salesperson_date AS lag_amount,
    NTILE(10) OVER w_global_amount AS decile_rank_by_amount,
    FIRST_VALUE(salesperson_id) OVER w_region_amount AS top_salesperson_region,
    LAST_VALUE(salesperson_id) OVER w_region_amount_rows AS last_salesperson_region,
    COUNT(*) OVER w_region AS count_sales_region,
    PERCENT_RANK() OVER w_region_amount AS percent_rank_region,
    CUME_DIST() OVER w_region_amount AS cume_dist_region
FROM sales
WINDOW
    w_region AS (PARTITION BY region),
    w_salesperson AS (PARTITION BY salesperson_id),
    w_product AS (PARTITION BY product_id),
    w_region_amount AS (PARTITION BY region ORDER BY amount DESC),
    w_salesperson_amount AS (PARTITION BY salesperson_id ORDER BY amount DESC),
    w_product_amount AS (PARTITION BY product_id ORDER BY amount DESC),
    w_salesperson_date AS (PARTITION BY salesperson_id ORDER BY date),
    w_global_amount AS (ORDER BY amount DESC),
    w_region_amount_rows AS (PARTITION BY region ORDER BY amount DESC ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
ORDER BY region, amount DESC;

With Pathway, I found myself having to define each window as a separate table, and then joining them back together. The resulting code was quite verbose.

Unless I am missing something and it's possible to do it succinctly with Pathway?

@ilyanoskov ilyanoskov added the enhancement New feature or request label Mar 17, 2024
@dxtrous
Copy link
Member

dxtrous commented Mar 17, 2024

Defining each window as a separate table (with groupby) is probably a good idea. As for joining it all back, it is convenient to use ix_ref - here is an example in the SQL-window spirit to compare the salary of an employee to the average salary in their position & department: https://pathway.com/developers/user-guide/data-transformation/indexing-grouped-tables/#multi-values-indexing.
Calling in @izulin - maybe you know of something cleaner. Either way, it would help to explain that this is the recommended way to implement SQL-windows (not to be confused with streaming windows).

@ilyanoskov
Copy link
Author

Thanks a lot @dxtrous, will give it a try!

@izulin
Copy link
Contributor

izulin commented Mar 20, 2024

With Pathway, I found myself having to define each window as a separate table, and then joining them back together. The resulting code was quite verbose.

Unless I am missing something and it's possible to do it succinctly with Pathway?

Hi Ilya,

could you post an example of a verbose/unwieldy pathway code here? This could be a great starting point on how to extend syntax and/or tutorials.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

3 participants