Optimise View.asList() side inputs for iterating rather than for indexing. #31087
Merged
Commits on Apr 23, 2024
Optimise View.asList() side inputs for iterating rather than for indexing.

The current implementation is, essentially, a distributed hashmap from integer keys to the list contents, mediated by each upstream worker starting at a random value to minimize overlaps and emitting sufficient metadata to map this onto the contiguous range [0, N). This provides optimal *random-access* performance but very poor *iteration* performance: every advance requires a key lookup, and because the keys are hashed and distributed rather than clustered numerically, there is little to no amortization in these lookups for adjacent items.

Most uses of List side inputs merely gather a collection of values (the user has no control over the ordering when materialized), and providing random access is costly, so this is probably the wrong tradeoff for most pipelines.

This is an update-incompatible change and so has been guarded by the update compatibility version flag. The old behavior can be explicitly requested via a new AsList#withRandomAccess() method.
Commit b163a54
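The trade-off described in the first commit message above can be illustrated with a small, self-contained sketch. This is a toy model only, not Beam's actual side-input code: the class `SideInputLayoutSketch` and its methods are hypothetical names, an in-memory `HashMap` stands in for the distributed hashmap, and keys are assigned contiguously instead of starting from a random value per worker. It only shows that iterating the indexed layout costs one keyed lookup per element, while the concatenated layout can be streamed without any.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only (hypothetical names); not Beam's actual side-input implementation. */
public class SideInputLayoutSketch {

    /** Random-access layout: every element is stored under its integer position. */
    static Map<Integer, String> indexedLayout(List<List<String>> workerOutputs) {
        Map<Integer, String> byIndex = new HashMap<>();
        int next = 0; // simplification: contiguous keys instead of random per-worker offsets
        for (List<String> output : workerOutputs) {
            for (String value : output) {
                byIndex.put(next++, value);
            }
        }
        return byIndex;
    }

    /** Iteration-friendly layout: the worker outputs are simply concatenated. */
    static List<String> iterableLayout(List<List<String>> workerOutputs) {
        List<String> all = new ArrayList<>();
        for (List<String> output : workerOutputs) {
            all.addAll(output);
        }
        return all;
    }

    /** Counts the keyed lookups needed to visit every element of the indexed layout. */
    static int lookupsToIterate(Map<Integer, String> byIndex) {
        int lookups = 0;
        for (int i = 0; i < byIndex.size(); i++) {
            byIndex.get(i); // one keyed lookup per advance
            lookups++;
        }
        return lookups;
    }

    public static void main(String[] args) {
        List<List<String>> workerOutputs =
            List.of(List.of("a", "b"), List.of("c"), List.of("d", "e"));
        Map<Integer, String> indexed = indexedLayout(workerOutputs);
        List<String> iterable = iterableLayout(workerOutputs);

        System.out.println("keyed lookups to iterate indexed layout: " + lookupsToIterate(indexed));
        System.out.println("elements streamed from iterable layout: " + iterable.size());
        System.out.println("random access on indexed layout, position 3: " + indexed.get(3));
    }
}
```

In a real pipeline each of those lookups would be a request against distributed state rather than a local map access, which is why the per-element cost matters far more than this in-memory model suggests.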
Commits on Apr 24, 2024
Commit 03fc0c4
Commit f38690c
Commit bf3eae5
Commit 295b440
Commits on Apr 26, 2024
Commit 9ab2906
Commits on Apr 29, 2024
Better naming for ListViewFn3, restrict to global windows.
(I kept the name for ListViewFn2 just in case there are pipelines serializing it as data.)
Commit d64e194
Commit 7582e80
Commits on Apr 30, 2024
Commit 3673ee6