WIP: Partitioning based on ROIs #764

Open · wants to merge 7 commits into master

Conversation

@sk1p sk1p commented Apr 29, 2020

Closes #651 - this is still only a simple sketch of the API, although the basic functionality is already there.

Refs Ptychography-4-0/ptychography#6

Still to do:

  • The actual implementation needs changes in at least the UDFRunner, which depends on having a slice associated with the Partition.
  • Maybe add a Partition.shape_for_roi(roi) method that can override the current partition_shape=partition.slice.adjust_for_roi(roi).shape used when constructing the UDFMeta object; it could then also be used in the short-circuit logic that decides not to generate tasks for empty ROIs (see the sketch after this list).
  • "nested" ROI support: you should still be able to pass a roi argument to run_udf on a RoiDataset. Needs tests for the case where the partition ROIs overlap and the roi selects elements from the overlapping region, etc.
  • Implement DataSet.meta property on all DataSet impls
  • Optimizations for reading a single "macrotile" from multiple partitions
  • Documentation
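
A minimal sketch of the shape_for_roi() idea, assuming only what is described above: the default mirrors the current partition_shape computation, while the RoiPartition override is hypothetical.

```python
class Partition:
    # existing attributes, including `self.slice` ...

    def shape_for_roi(self, roi):
        """Default: the current behavior used when constructing UDFMeta."""
        if roi is None:
            return self.slice.shape
        return self.slice.adjust_for_roi(roi).shape


class RoiPartition(Partition):
    def shape_for_roi(self, roi):
        """Hypothetical override: also account for this partition's own ROI,
        e.g. so the short-circuit logic can skip empty ROIs."""
        ...
```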

Question about partition sizing: since the user may have requirements such as overlap, we don't control the size and number of partitions and can't use generic logic here. The real question is: where do the partition ROIs come from? Should the user pass in a callable that generates the ROIs? The decision could, for example, depend on the number of GPUs available on the system and on their available memory, so this is actually a per-node decision. For the edge case of more than one GPU with different memory sizes, each ROI should probably fit into the memory of the "smallest" GPU.
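
To make the question more concrete, here is a rough sketch of what such a user-supplied callable could look like; the name `make_rois` and all of its parameters are made up, and the sizing heuristic is just the "fit the smallest GPU" edge case from above.

```python
import numpy as np

def make_rois(n_frames, frame_bytes, gpu_memory_bytes, overlap=16):
    """Hypothetical ROI factory: split the flat navigation axis into chunks
    that fit the smallest GPU on the node, with some overlap between chunks."""
    frames_per_roi = max(1, min(gpu_memory_bytes) // frame_bytes)
    rois = []
    start = 0
    while start < n_frames:
        stop = min(start + frames_per_roi, n_frames)
        roi = np.zeros(n_frames, dtype=bool)
        roi[max(0, start - overlap):stop] = True  # overlap with the previous chunk
        rois.append(roi)
        start = stop
    return rois

# Example: 8192 frames of 1 MiB each, two GPUs with 8 GiB and 4 GiB of memory
rois = make_rois(8192, 1024 ** 2, [8 * 1024 ** 3, 4 * 1024 ** 3])  # -> 2 ROIs
```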

Contributor Checklist:

@sk1p sk1p added this to the 0.6 milestone Apr 29, 2020
@sk1p sk1p requested a review from uellue April 29, 2020 16:49

@uellue uellue left a comment

Wow, that's a really small change for such a feature!

Interface looks good, we'll see how it handles when implementing algorithms. 👍

sk1p commented Apr 30, 2020

So, in numpy, once you apply fancy indexing, the affected dimensions are collapsed into one flat dimension, which is "compressed" to the number of nonzero entries. This is also how I currently handle the "normal" ROI.

When dealing with overlapping ROIs, assigning coordinates to tiles becomes a bit more difficult. Just passing on the coordinates from the wrapped dataset (as in the API sketch in this PR) does not work, since the coordinates would all start at zero. My proposal is to also flatten the navigation axis of the whole RoiDataSet and let it expand, so its navigation shape would basically be sum(np.count_nonzero(roi) for roi in rois). The origin of each partition would be determined by the number of non-zero entries in the preceding ROIs in the list.
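
A small numeric sketch of that layout; the ROIs and names are illustrative only:

```python
import numpy as np

# The navigation size of the RoiDataSet is the total number of selected
# frames over all ROIs (overlapping frames count more than once), and the
# origin of each partition is the number of non-zero entries in the ROIs
# that precede it in the list.
rois = [
    np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool),
    np.array([0, 0, 1, 1, 1, 1, 0, 0], dtype=bool),  # overlaps the first ROI
]
counts = [np.count_nonzero(roi) for roi in rois]
nav_size = sum(counts)                                    # 8
origins = np.concatenate(([0], np.cumsum(counts)[:-1]))   # [0, 4]
```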

uellue commented Apr 30, 2020

Discussion with @sk1p on behavior (WIP):

Two options to implement the navigation dimension:

  1. If the ROI partitions don't overlap and cover the entire navigation space, the interface should work with all existing UDFs and yield results as if it were running with default partitions (keeping the navigation dimension of the original dataset).
  2. The ROI partitions give results in a new navigation dimension, which is a concatenation of all the ROIs.

In the second case, we need tools to translate between the coordinates in the new navigation dimension and the navigation dimension of the underlying dataset.
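
A minimal sketch of such a translation tool; the helper name and shapes are illustrative, not part of the current API:

```python
import numpy as np

def concat_to_original(idx, rois, nav_shape):
    """Map an index in the concatenated ROI navigation dimension back to
    coordinates of the underlying dataset (hypothetical helper)."""
    lookup = np.concatenate([np.flatnonzero(roi) for roi in rois])
    flat_idx = lookup[idx]  # flat navigation index in the underlying dataset
    return np.unravel_index(flat_idx, nav_shape)

# Two overlapping ROIs over a 2x4 scan: concatenated index 5 is the second
# selected frame of the second ROI, i.e. flat index 3 -> coordinates (0, 3).
roi_a = np.array([1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)
roi_b = np.array([0, 0, 1, 1, 1, 1, 0, 0], dtype=bool)
assert concat_to_original(5, [roi_a, roi_b], (2, 4)) == (0, 3)
```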

As a comment, the automatic handling of kind="nav" result buffers will probably not be too useful for ROI partitions, since ROI partitions are introduced for use cases where normal 1:1 merging isn't sufficient. For that reason, kind="single" result buffers with a suitable extra_shape will likely be used often (see the sketch after the list below).

  • Running a UDF with ROI partitions yields live-updating results as the partition results come in. That means a usable result is not only generated in a final postprocessing step, but can also be generated at every merge step. Refs #628 (Allow final postprocessing on the central node in an UDF).

  • The merge function needs information on the ROI of the current incoming partition result.

  • If a UDF wants to keep the partition results around instead of merging and discarding them, it can do so in appropriate result buffers, for example a list (dtype=object).
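
A rough sketch of what such a UDF could declare; `n_rois` is a hypothetical parameter, and the ROI information needed in merge() is the extension discussed above, not part of the current UDF API:

```python
import numpy as np
from libertem.udf import UDF

class PerRoiUDF(UDF):
    """Sketch: one result slot per partition ROI instead of 1:1 kind="nav"
    merging; instantiate e.g. as PerRoiUDF(n_rois=len(rois))."""

    def get_result_buffers(self):
        return {
            # a fixed-size slot per partition ROI: kind="single" + extra_shape
            'per_roi_sum': self.buffer(
                kind='single', extra_shape=(self.params.n_rois,), dtype=np.float64,
            ),
            # keep whole partition results around, e.g. as arbitrary objects
            'kept': self.buffer(
                kind='single', extra_shape=(self.params.n_rois,), dtype=object,
            ),
        }

    def process_partition(self, partition):
        ...

    def merge(self, dest, src):
        # As discussed above, merge() would additionally need to know which
        # partition ROI `src` belongs to, in order to place the incoming
        # result into the right slot.
        ...
```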

@sk1p sk1p modified the milestones: 0.6, 0.7 Jun 10, 2020

sk1p commented Sep 22, 2020

In the second case, we need tools to translate between the coordinates in the new navigation dimension and the navigation dimension of the underlying dataset.

Maybe the DataSet could provide a hook for coordinate transformation, which is then used by UDFMeta.coordinates - so the default implementation works like the implementation from #793 by @anandbaburajan, but the ROIDataSet can override this to provide the "original" coordinates, "punching" through to the underlying DataSet. As a next step, we can then see if we need to provide additional helper functions, or extensions to the UDF interface.
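
A hypothetical sketch of that hook; the class and method names are illustrative only:

```python
import numpy as np

class RoiDataSet:
    def __init__(self, wrapped, rois):
        self._wrapped = wrapped
        self._rois = rois
        # map from the concatenated ROI navigation axis to flat navigation
        # indices of the underlying dataset
        self._lookup = np.concatenate([np.flatnonzero(roi) for roi in rois])

    def transform_coordinates(self, coords):
        """Override of a default identity hook on DataSet: "punch through"
        to the coordinates of the wrapped dataset, to be used by
        UDFMeta.coordinates instead of its default implementation."""
        return self._lookup[coords]
```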

uellue commented Sep 22, 2020

Maybe the DataSet could provide a hook for coordinate transformation, which is then used by UDFMeta.coordinates - so the default implementation works like the implementation from #793 by @anandbaburajan, but the ROIDataSet can override this to provide the "original" coordinates, "punching" through to the underlying DataSet.

Sounds good! 👍

uellue commented Oct 23, 2020

FYI, in the Ptycho 4.0 call we just discussed this item. Since Simeon is picking up work on the CUDA/Alpaka side of this, we should put it on the short list after 0.6 is out. We can already test with SSB and @Sniper2k's merge function, taking into account that SSB results change if you constrain it differently. :-)

codecov bot commented Dec 15, 2020

Codecov Report

Merging #764 (bf3b510) into master (f49e67b) will decrease coverage by 32.87%.
The diff coverage is 52.15%.

@@             Coverage Diff             @@
##           master     #764       +/-   ##
===========================================
- Coverage   68.57%   35.70%   -32.88%     
===========================================
  Files         260      261        +1     
  Lines       11751    11882      +131     
  Branches     1606     1618       +12     
===========================================
- Hits         8058     4242     -3816     
- Misses       3385     7434     +4049     
+ Partials      308      206      -102     
| Impacted Files | Coverage Δ |
| --- | --- |
| src/libertem/common/slice.py | 59.18% <25.00%> (-25.93%) ⬇️ |
| src/libertem/io/dataset/cluster.py | 39.00% <33.33%> (-41.62%) ⬇️ |
| src/libertem/io/dataset/cached.py | 31.83% <36.36%> (-48.34%) ⬇️ |
| src/libertem/io/dataset/roi.py | 39.32% <39.32%> (ø) |
| src/libertem/io/dataset/blo.py | 31.53% <50.00%> (-51.36%) ⬇️ |
| src/libertem/io/dataset/dm.py | 25.40% <50.00%> (-63.94%) ⬇️ |
| src/libertem/io/dataset/empad.py | 27.61% <50.00%> (-56.72%) ⬇️ |
| src/libertem/io/dataset/frms6.py | 25.07% <50.00%> (-47.30%) ⬇️ |
| src/libertem/io/dataset/seq.py | 31.09% <50.00%> (-43.91%) ⬇️ |
| src/libertem/io/dataset/ser.py | 31.90% <50.00%> (-49.08%) ⬇️ |

... and 113 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

`ranges` is a pre-defined list of `[start, stop)` tuples. This can, for example, be used by the caching layer to get consistent partitioning, or by ROI partitions, which need overlapping ranges (see the sketch after this commit list). Also sprinkled some type annotations to improve VS Code type inference.
* Add helper method: Slice.translate
* `RoiPartition` now has a slice associated, in the flat "outer"
  coordinate system
* Implement some new required methods:
    * `need_decode`, `adjust_tileshape`, `set_corrections`,
      `get_base_shape`, `get_io_backend`, `get_locations`, ...
* Update `seq` dataset: add `ranges` parameter for `get_partitions`
* UDFRunner: delegate `shape_for_roi` to `Partition`
* Fix `test_get_single_macrotile` assertion: add missing reshape
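
For illustration, the `[start, stop)` ranges could look like the following; the values are made up, and only the `ranges` parameter name comes from the commit messages above:

```python
# Overlapping [start, stop) ranges over the flat navigation axis, as
# ROI-based partitions would need them; non-overlapping ranges would give
# the caching layer a consistent partitioning.
ranges = [
    (0, 512),
    (384, 896),    # overlaps the previous range by 128 frames
    (768, 1024),   # overlaps the previous range by 128 frames
]
# roughly: dataset.get_partitions(ranges=ranges)
```
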
@uellue uellue modified the milestones: 0.7, 0.8 May 27, 2021
@sk1p sk1p modified the milestones: 0.8, backlog Aug 24, 2021
@uellue uellue mentioned this pull request Feb 14, 2022
Development

Successfully merging this pull request may close these issues:

UDF computations on overlapping ROIs with a common merge step