Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add aggregate parameters to vega-transform, and exponential moving average #3686

Merged
merged 2 commits into from Sep 19, 2023

Conversation

Xitian9
Copy link
Contributor

@Xitian9 Xitian9 commented Mar 12, 2023

This adds exponential moving averages, fixing #3341. In order to do this, it enables supplying parameters to aggregation functions. This addresses the block on vega/vega-lite#4439.

As I'm not a javascript programmer, feedback will be gratefully taken on board.

@Xitian9
Copy link
Contributor Author

Xitian9 commented Mar 24, 2023

Is there any interest in this?

@domoritz
Copy link
Member

Thanks for the pull request. I just approved the test run again.

@Xitian9
Copy link
Contributor Author

Xitian9 commented Apr 9, 2023

If/when this is merged, #3577 could be made more general by passing the requested quantiles as a parameter.

@shcheklein
Copy link

shcheklein commented May 11, 2023

Hey @Xitian9 thanks for the effort! This is exactly what we are looking for It might be what we are looking for in this case iterative/vscode-dvc#3837 . In ML tracker tools it's very common to use the exponential moving average. I'm still looking into this (and learning along the way if there are other ways to achieve this).

Can it be used to smooth a linear plot or does it calculate a single number?

@domoritz are there any blockers left to merge this? (maybe we can also join and help). Also, if this is merged will it become available automatically in vega-lite?

Copy link
Member

@domoritz domoritz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks reasonable to me. The API question I have is whether it makes sense for the parameter to be a separate property or somehow part of the ops. I guess this is consistent with the rest of the API.

@jheer needs to make a final pass and merge

@@ -11,6 +11,7 @@ import {accessorFields, accessorName, array, error, inherits} from 'vega-util';
* @param {Array<function(object): *>} [params.groupby] - An array of accessors to groupby.
* @param {Array<function(object): *>} [params.fields] - An array of accessors to aggregate.
* @param {Array<string>} [params.ops] - An array of strings indicating aggregation operations.
* @param {Array<object>} [params.aggregate_params=[null]] - An optional array of parameters for aggregation operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to repeat the name aggregate here or is it already obvious because of the context of the aggregate?

Copy link
Contributor Author

@Xitian9 Xitian9 May 14, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was going by the naming agreed upon in vega/vega-lite#4439. We cannot just call it params.params, as this would conflict with the params argument in windowing functions. Anytime you would want to apply an aggregate with parameter over a window with parameter, you would run into problems.

@domoritz domoritz requested a review from jheer May 12, 2023 19:19
@Xitian9
Copy link
Contributor Author

Xitian9 commented May 14, 2023

Can it be used to smooth a linear plot or does it calculate a single number?

The aggregate in itself produces a single number (which is why I call it exponential instead of exponentialmovingaverage or something like that). However, it can be combined with a windowing function to produce a smoothing, which is the use case I have in mind as well.

@domoritz are there any blockers left to merge this? (maybe we can also join and help). Also, if this is merged will it become available automatically in vega-lite?

I don't believe it would automatically be available in vega-lite, and that would need to be updated as well.

@shcheklein
Copy link

@jheer sorry for bothering you, anything we can do (help with) here to unblock this and move forward? Please let me know.

@Xitian9
Copy link
Contributor Author

Xitian9 commented Aug 8, 2023

Is there any appetite for this? If not I'll delete the PR.

@shcheklein
Copy link

@Xitian9 please keep it for a while :) We definitely need this! I can't make decisions here and / or review, but please keep it possible. I hope folks will get some time to merge it. Thanks for your work.

@shcheklein
Copy link

Hey folks, @jheer , any chance we can move it forward? Could we at least identify what is blocking this PR atm?

@jheer jheer merged commit 231a918 into vega:main Sep 19, 2023
3 checks passed
@jheer
Copy link
Member

jheer commented Sep 19, 2023

Thanks for the PR, sorry for the delay in reviewing!

I checked with @jonmmease and expanding aggregates this way does not have negative downstream consequences for VegaFusion (other than that weighted exponential aggregates will not be optimized by VegaFusion at the moment).

@julieg18
Copy link
Contributor

Hello, I'm working on adding aggregate_parameters into vega-lite (vega/vega-lite#9200), but when I tested using the params in a vega editor spec, all I could get was NaN:

image

I tried a couple different field variations but NaN was always the result. Have I misunderstood what aggregate_params is supposed to consist of? Here's the example vega editor spec I was using.

@@ -45,6 +46,7 @@ Aggregate.Definition = {
'params': [
{ 'name': 'groupby', 'type': 'field', 'array': true },
{ 'name': 'ops', 'type': 'enum', 'array': true, 'values': ValidAggregateOps },
{ 'name': 'aggregate_params', 'type': 'field', 'null': true, 'array': true, 'default': [null] },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't look right. In the docs, we say object and then here the type is field.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@domoritz
Copy link
Member

domoritz commented Dec 14, 2023

@julieg18 hmm, yeah, something is wrong here. The types look wrong to me.

I think the aggregate parameter is just a value, not a field judging from the code but maybe I'm missing something: Open the Chart in the Vega Editor (still fails because of some error somewhere).

@Xitian9 @shcheklein can you post an example you used to test this feature? Also, could you send a follow up pull request to add docs.

@@ -37,6 +38,7 @@ Window.Definition = {
{ 'name': 'groupby', 'type': 'field', 'array': true },
{ 'name': 'ops', 'type': 'enum', 'array': true, 'values': ValidWindowOps.concat(ValidAggregateOps) },
{ 'name': 'params', 'type': 'number', 'null': true, 'array': true },
{ 'name': 'aggregate_params', 'type': 'field', 'null': true, 'array': true, 'default': [null] },
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the default correct here? There may be more than two ops.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code will pad aggregate_params with nulls as necessary to match the number of ops.

@@ -11,6 +11,7 @@ import {accessorFields, accessorName, array, error, inherits} from 'vega-util';
* @param {Array<function(object): *>} [params.groupby] - An array of accessors to groupby.
* @param {Array<function(object): *>} [params.fields] - An array of accessors to aggregate.
* @param {Array<string>} [params.ops] - An array of strings indicating aggregation operations.
* @param {Array<object>} [params.aggregate_params=[null]] - An optional array of parameters for aggregation operations.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be [params.aggregate_params] since we cannot assume a single op?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code will pad aggregate_params with nulls as necessary to match the number of ops.

@Xitian9
Copy link
Contributor Author

Xitian9 commented Dec 14, 2023

@Xitian9 @shcheklein can you post an example you used to test this feature? Also, could you send a follow up pull request to add docs.

Test code is here: https://github.com/vega/vega/blob/main/packages/vega-transforms/test/aggregate-test.js

@Xitian9
Copy link
Contributor Author

Xitian9 commented Dec 14, 2023

Hello, I'm working on adding aggregate_parameters into vega-lite (vega/vega-lite#9200), but when I tested using the params in a vega editor spec, all I could get was NaN:
image

I tried a couple different field variations but NaN was always the result. Have I misunderstood what aggregate_params is supposed to consist of? Here's the example vega editor spec I was using.

"exponential" takes a number as an argument, and you have provided the string "param_value". Do you still get NaN if you supply a double?

@Xitian9
Copy link
Contributor Author

Xitian9 commented Dec 14, 2023

@julieg18 hmm, yeah, something is wrong here. The types look wrong to me.

I think the aggregate parameter is just a value, not a field judging from the code but maybe I'm missing something:

I'm not a javascript programmer, so I could definitely believe I got the types wrong. The type needs to be generic enough to accept any sort of value, as different ops will have different argument types. For "exponential" that's a number, but for other as-yet-unimplemented ops it could be something of arbitrary complexity.

@julieg18
Copy link
Contributor

"exponential" takes a number as an argument, and you have provided the string "param_value".

Makes sense, thanks! As previously mentioned, I can only provide strings since currently the type is set to fields.

Do you still get NaN if you supply a double?

Apologies, I'm not sure what you mean by double? Do you mean something like this:

image

I would still get NaN in this case

@domoritz
Copy link
Member

Makes sense, thanks! As previously mentioned, I can only provide strings since currently the type is set to fields.

The editor will only warn you. Numbers are still passed to Vega.

@julieg18
Copy link
Contributor

julieg18 commented Dec 15, 2023

The editor will only warn you. Numbers are still passed to Vega.

Perhaps I'm misunderstanding something, but inputting numbers causes errors to be thrown and nothing get rendered, in the view or data viewer Oh wait, I think I understand what you're saying. You're saying that that those numbers are actually getting passed to vega. These errors aren't just being thrown because of the wrong type, there is something else happening.

image

@julieg18
Copy link
Contributor

julieg18 commented Jan 5, 2024

Types are now fixed thanks to #3846, but as expected, we are still hitting the Unexpected token ']' error when trying to run a spec.

Looking at the error locally, it's possibly happening inside of vega-runtime:

VM823:3 Uncaught SyntaxError: Unexpected token ']'
    at Function (<anonymous>)
    at get (vega-runtime.module.js:142:19)
    at field (vega-util.module.js:93:38)
    at Object.getField2 [as parse] (vega-runtime.module.js:281:36)
    at parseParameter (vega-runtime.module.js:212:16)
    at vega-runtime.module.js:199:51
    at Array.map (<anonymous>)
    at Context.parseParameters (vega-runtime.module.js:199:42)
    at Context.parseOperatorParameters (vega-runtime.module.js:187:48)
    at vega-runtime.module.js:30:34

@Xitian9, are you able to get a working example of aggregate_params in the vega editor?

@julieg18
Copy link
Contributor

Types are now fixed thanks to #3846, but as expected, we are still hitting the Unexpected token ']' error when trying to run a spec.

Actually, I tested aggregate_params today with the latest version of vega in my local vega editor and everything seems to be working!

image

Not sure why I was getting the error previously, but unless it shows up again, I'll get back to vega/vega-lite#9200

@domoritz
Copy link
Member

Awesome. Thanks for the update. I'm glad it's working now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants