Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[R] Simplify arrow_eval() logic and bindings environments #41540

Closed
nealrichardson opened this issue May 5, 2024 · 0 comments · Fixed by #41537
Closed

[R] Simplify arrow_eval() logic and bindings environments #41540

nealrichardson opened this issue May 5, 2024 · 0 comments · Fixed by #41537

Comments

@nealrichardson
Copy link
Member

Describe the enhancement requested

Following #41223 and #41350, I see some way to make things simpler.

Component(s)

R

@nealrichardson nealrichardson added this to the 17.0.0 milestone May 7, 2024
nealrichardson added a commit that referenced this issue May 7, 2024
…41537)

### Rationale for this change

NSE is hard enough. I wanted to see if I could remove some layers of
complexity.

### What changes are included in this PR?

* There no longer are separate collections of `agg_funcs` and
`nse_funcs`. Now that the aggregation functions return Expressions
(#41223), there's no reason to treat
them separately. All bindings return Expressions now.
* Both are removed and functions are just stored in `.cache$functions`.
There was a note wondering why both `nse_funcs` and that needed to
exist. They don't.
* `arrow_mask()` no longer has an `aggregations` argument: agg functions
are always present.
* Because agg functions are always present, `filter` and `arrange` now
have to check for whether the expressions passed to them contain
aggregations--this is supported in regular dplyr but we have deferred
supporting it here for now (see
#41350). If we decide we want to
support it later, these checks are the entry points where we'd drop in
the `left_join()` as in `mutate()`.
* The logic of evaluating expresssions in `filter()` has been
simplified.
* Assorted other cleanups: `register_binding()` has two fewer arguments,
for example, and the duplicate functions for referencing agg_funcs are
gone.

There is one more refactor I intend to pursue, and that's to rework
abandon_ship and how arrow_eval does error handling, but I ~may~ will
defer that to a followup.

### Are these changes tested?

Yes, though I'll add some more for filter/aggregate in the followup
since I'm reworking things there.

### Are there any user-facing changes?

There are a couple of edge cases where the error message will change
subtly. For example, if you supplied a comma-separated list of filter
expressions, and more than one of them did not evaluate, previously you
would be informed of all of the failures; now, we error on the first
one. I don't think this is concerning.
* GitHub Issue: #41540
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant