Prototype for a shared memory mask container #1005
base: master
Conversation
This prototype demonstrates the core functionality of a shared memory MaskContainer. At this stage it only implements a method to share computed_masks. It is not integrated into MaskContainer yet, but stands alone as testing code with many tracing `print()`s. Sharing tile slices should work analogously to computed_masks.

* The name used to address objects is a function of mask_factories
* Likely availability of pre-computed masks is determined through a canary object
* Falls back to computing masks locally after a timeout -- to be adjusted
* Supports both dense and sparse computed_masks
* Supports additional metadata as JSON

Lessons learned:

Lifetime management is a bit tricky. As soon as the reference count for the buffer drops to zero, the shared memory can apparently be purged immediately. Balancing several interdependent objects requires making sure that objects will most likely stay alive, and having a fallback in case a race condition occurs. In this prototype, a separate process is started whose only job is to keep references around so that objects are not purged between processing "partitions" (`run_for_partition()` is emulated with multiprocessing). A race condition between the "keep-alive" process and object creation can occur. `join()`ing the queue after data has been stored makes sure that the keys are safely held before variables fall out of scope or worker processes terminate.

The only IPC/orchestration in this example is a queue to keep track of the buffer names so that references can be kept. Other than that, the scheme is self-organizing and seems to handle a "thundering herd" quite well: the first process creates the canary, the others wait for it to finish the remaining items and then pick them up. For a real MaskContainer this could be improved by making different workers request different tiles from the MaskContainer first, so that the tile slice calculation is parallelized.
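The address-by-name scheme with a local fallback described above can be sketched with Python's `multiprocessing.shared_memory` from the standard library. This is a minimal illustration, not the PR's code: `_shm_name` and `get_or_create` are hypothetical names, and deriving the segment name from a hash of the key is an assumption.

```python
import hashlib
from multiprocessing import shared_memory

import numpy as np


def _shm_name(key: str) -> str:
    # Hypothetical naming scheme: a stable name derived from the key
    # (in the prototype, a function of the mask factories).
    return "masks-" + hashlib.sha256(key.encode()).hexdigest()[:16]


def get_or_create(key, shape, dtype, compute):
    """Attach to a shared array if another process already published it,
    otherwise compute it locally and publish it."""
    name = _shm_name(key)
    try:
        # Fast path: another process has already created the segment.
        shm = shared_memory.SharedMemory(name=name)
        created = False
    except FileNotFoundError:
        # Local fallback: we are first, so create and fill the segment.
        nbytes = int(np.prod(shape)) * np.dtype(dtype).itemsize
        shm = shared_memory.SharedMemory(name=name, create=True, size=nbytes)
        created = True
    arr = np.ndarray(shape, dtype=dtype, buffer=shm.buf)
    if created:
        arr[:] = compute()
    # The caller must keep `shm` referenced, otherwise the buffer backing
    # `arr` can be invalidated -- the lifetime issue discussed above.
    return arr, shm
```

A real implementation would additionally need the canary object and timeout described above, since a second process can attach before the first writer has finished filling the segment.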
Refs #335
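The keep-alive process and the `join()` handshake from the lessons above can be sketched as follows. This is an illustrative reconstruction under assumptions: `keeper`, `publish` and `demo` are invented names, the `fork` start method is used to keep the sketch self-contained (Unix only), and the payload is a dummy byte string instead of mask data.

```python
import multiprocessing as mp
from multiprocessing import shared_memory


def keeper(q):
    # Keep-alive process: its only job is to hold references so that
    # segments are not purged between processing "partitions".
    refs = []
    while True:
        name = q.get()
        if name is None:
            q.task_done()
            break
        refs.append(shared_memory.SharedMemory(name=name))  # attach = keep alive
        q.task_done()
    for shm in refs:
        shm.close()
        shm.unlink()


def publish(q, name, payload):
    shm = shared_memory.SharedMemory(name=name, create=True, size=len(payload))
    shm.buf[:len(payload)] = payload
    q.put(name)
    # join()ing the queue after the data has been stored makes sure the
    # keeper holds a reference before our own reference falls out of scope.
    q.join()
    shm.close()


def demo():
    ctx = mp.get_context("fork")  # fork start method: Unix only
    q = ctx.JoinableQueue()
    p = ctx.Process(target=keeper, args=(q,))
    p.start()
    publish(q, "demo-keepalive", b"data")
    # The publisher dropped its handle, but the segment is still reachable
    # because the keeper holds a reference.
    shm = shared_memory.SharedMemory(name="demo-keepalive")
    data = bytes(shm.buf[:4])
    shm.close()
    q.put(None)  # shut down the keeper, which then unlinks the segments
    q.join()
    p.join()
    return data


if __name__ == "__main__":
    print(demo())
```

The queue carrying buffer names is the only orchestration channel, matching the description above; everything else is self-organizing.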
Codecov Report
@@            Coverage Diff             @@
##           master    #1005      +/-   ##
==========================================
- Coverage   69.02%   61.79%    -7.23%
==========================================
  Files         262      262
  Lines       12063    12063
  Branches     1655     1655
==========================================
- Hits         8326     7454     -872
- Misses       3417     4329     +912
+ Partials      320      280      -40
Continue to review full report at Codecov.
@sk1p Do you have your "plasma" code somewhere as a prototype? I've approached the issue bottom-up here, starting with the requirements of
See #1006
Thx! :-) Looks like these will complement each other nicely. Probably after Easter. :-)
/azp run libertem.libertem-data
passed