You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Currently, the GROUPS window frame specification is implemented in QuestDB but lacks support for aggregation functions like AVG and SUM. As discussed, this feature aligns with PostgreSQL semantics, where GROUPS provides a unique way to define window frames based on logical groups rather than individual rows.
Problem Statement
In its current state, the GROUPS clause is implemented but lacks support for aggregation functions such as AVG and SUM, which limits its usefulness. Without this support, the GROUPS keyword only complicates the code without offering any unique benefits over ROWS.
Expected Behavior
When the GROUPS window frame specification is used, it should support aggregate functions to provide meaningful results.
References
Slack conversation regarding GROUPS support: [link to conversation]
SELECT*,
SUM(score) OVER (
ORDER BY age
GROUPS BETWEEN 1 PRECEDING AND1 FOLLOWING
) AS sum_groups
FROM user_scores
ORDER BY age;
Expected Output:
| participant | age | score | sum_groups |
|-------------|-----|-------|------------|
| e | 33 | 5 | 52 |
| c | 34 | 47 | 113 |
| a | 40 | 50 | 153 |
| f | 40 | 11 | 153 |
| b | 50 | 35 | 106 |
| d | 50 | 10 | 106 |
Comparison with RANGE and ROWS
To understand how GROUPS compares to other window frame specifications, here are the equivalent queries using RANGE and ROWS.
Query Using RANGE:
SELECT*,
SUM(score) OVER (
ORDER BY age
RANGE BETWEEN 1 PRECEDING AND1 FOLLOWING
) AS sum_range
FROM user_scores
ORDER BY age;
Expected Output:
| participant | age | score | sum_range |
|-------------|-----|-------|-----------|
| e | 33 | 5 | 52 |
| c | 34 | 47 | 52 |
| a | 40 | 50 | 61 |
| f | 40 | 11 | 61 |
| b | 50 | 35 | 45 |
| d | 50 | 10 | 45 |
Query Using ROWS:
SELECT*,
SUM(score) OVER (
ORDER BY age
ROWS BETWEEN 1 PRECEDING AND1 FOLLOWING
) AS sum_rows
FROM user_scores
ORDER BY age;
Expected Output:
| participant | age | score | sum_rows |
|-------------|-----|-------|----------|
| e | 33 | 5 | 52 |
| c | 34 | 47 | 87 |
| a | 40 | 50 | 108 |
| f | 40 | 11 | 96 |
| b | 50 | 35 | 96 |
| d | 50 | 10 | 45 |
The explanation here is…
When using ROWS, it is very straight forward. I look before/after as many rows as I am told.
When using RANGE, I look at as many rows before/after as the range says, but.. interestingly… the RANGE needs to be consecutive. So im the case of rows for age 33 and 34 it is adding them together, as 33 and 34 are consecutive integers, but when aggregating for rows with ages 40 and 50 it is not passing the boundaries as the difference between 35 and 40, or between 40 and 50 is higher than 1
When using GROUPS, it is adding to the frame all the score values for all the columns in the group before and after, so for columns in the age 50, it will be adding the scores of the two values in the 40 group. For the rows in group 40, it is adding all the values for group 34 and group 50
Conclusion
GROUPS allows logical grouping based on ordered values, offering more meaningful aggregate calculations compared to ROWS and RANGE in some use cases.
The current implementation should be enhanced to support aggregation functions like AVG and SUM fully.
Is your feature request related to a problem?
Background
Currently, the
GROUPS
window frame specification is implemented in QuestDB but lacks support for aggregation functions likeAVG
andSUM
. As discussed, this feature aligns with PostgreSQL semantics, whereGROUPS
provides a unique way to define window frames based on logical groups rather than individual rows.Problem Statement
In its current state, the
GROUPS
clause is implemented but lacks support for aggregation functions such asAVG
andSUM
, which limits its usefulness. Without this support, theGROUPS
keyword only complicates the code without offering any unique benefits overROWS
.Expected Behavior
When the
GROUPS
window frame specification is used, it should support aggregate functions to provide meaningful results.References
GROUPS
support: [link to conversation]Describe the solution you'd like.
Example Implementation
Consider the following data for
user_scores
:Query Using
GROUPS
:Expected Output:
Comparison with
RANGE
andROWS
To understand how
GROUPS
compares to other window frame specifications, here are the equivalent queries usingRANGE
andROWS
.Query Using
RANGE
:Expected Output:
Query Using
ROWS
:Expected Output:
The explanation here is…
Conclusion
GROUPS
allows logical grouping based on ordered values, offering more meaningful aggregate calculations compared toROWS
andRANGE
in some use cases.AVG
andSUM
fully.Describe alternatives you've considered.
No response
Full Name:
Javier Ramirez
Affiliation:
QuestDB
Additional context
This issue was first raised on public slack https://questdb.slack.com/archives/C0394MS0FMF/p1714323610008919
The text was updated successfully, but these errors were encountered: