WIP/RFC: Prepared parameters #1484

sk1p · 2023-08-08T12:51:58Z

Motivation: with LiberTEM-live, the workers should be ready to receive data as soon as the UDF run starts. If we have large-ish parameters, this is not the case - we first need to scatter parameters, before the actual run can start. This causes substantial jitter at the beginning, and putting data into the MP queues is also not properly parallelizable (I tried...). That means you have O(number_of_workers) work at the beginning, and in the worst case, your receiver code falls over, for example because the shared memory area fills up.

One possible solution is to pre-scatter constant parameters, and only passing a handle to the UDF once the data is scattered to the workers.

In my testing on offline data, with 24 workers and a ~200MiB parameter set, the latency from calling run_udf to running the first task on a worker reduced from ~1.9s to ~270ms. The UDF is a dummy no-op one, but the parameter set is a realistic set of pre-computed masks for a radial fourier analysis, and is similar in size to a mask stack for SSB.

Example code:

with ctx.prepare_parameter(radial_masks) as radial_masks_prepared:
    res = ctx.run_udf(ds, udf=MyNoopUDF(payload=radial_masks_prepared), plots=False)

Before:

After:

This is a request for discussion and exploration of the solution space, and won't be merged like this.

Questions:

How should the API look like? What else needs to be prepared? Plots?
For a general preparation API, what are the needed inputs?
How do we fix interaction on the main node? Right now, prepared parameters aren't resolved for main-node work, for example in the constructor.
Do we need additions in LiberTEM-live for proper preparation? For example, a placeholder "acquisition plan" object, as discussed before?

~~(PR includes changes from #1483, will rebase once that is merged.)~~

Contributor Checklist:

I have added or updated my entry in the creators.json file
I have added a changelog entry for my contribution
I have added/updated documentation for all user-facing changes
I have added/updated test cases
I have included the rebuilt production build of the client (only if changes were made to the GUI)

Reviewer Checklist:

/azp run libertem.libertem-data passed
No import of GPL code from MIT code

sk1p · 2023-08-08T13:00:06Z

Btw. in the after screenshot, there is still this delay of ~260ms, some of which is not accounted for by the current tracing. So there is still some more potential for reducing the initial latency, but prepared parameters shave off most of it.

matbryan52 · 2023-08-09T12:59:44Z

... What else needs to be prepared? Plots?

Just a quick comment on the plots part of this - I've noticed recently that we always call dry_run as part of _prepare_plots, even in the case the user is supplying pre-initialised plots:

LiberTEM/src/libertem/api.py

Lines 1421 to 1423 in 3255055

    
           def _prepare_plots(self, udfs, dataset, roi, plots): 
        
               runner_cls = self.executor.get_udf_runner() 
        
               dry_results = runner_cls.dry_run(udfs, dataset, roi)

even though if we have pre-instantiated plots we just return them directly :

LiberTEM/src/libertem/api.py

Lines 1446 to 1448 in 3255055

    
           # 3) plots is probably List[LivePlot]: use customized plots as they are 
        
           else: 
        
               return plots

In this case the dry_run results are not used. Optimising could shave off a bit of time ?

That said, we could use the opportunity to potentially update existing plots with the current nav/sig shape and any parameter changes. Either if the acqusition changed or a UDF / Plot which depends on some kind of extra_shape determined with params.

sk1p · 2023-08-09T15:39:54Z

... What else needs to be prepared? Plots?

Just a quick comment on the plots part of this - I've noticed recently that we always call dry_run as part of _prepare_plots, even in the case the user is supplying pre-initialised plots:

[...]

Great point! dry_results is only used in two of the branches, and could be skipped otherwise.

That said, we could use the opportunity to potentially update existing plots with the current nav/sig shape and any parameter changes. Either if the acqusition changed or a UDF / Plot which depends on some kind of extra_shape determined with params.

Ah, yeah, we need to figure out how/if that's possible with things like the bare matplotlib backend, but at least for those backends where its easily supported, we could add a way to re-use plot instances across these changes.

This is by no means finished, but it is already very useful for keeping constant parameters on the workers across UDF runs, and for making sure they are scattered well before the data from the detector arrives (in the live processing use case). Things to be figured out: - Do we want to make this a general "preparation" API, including preparing plots etc.? - How does this interact with the dynamic parameter updating API? - Tests

sk1p added enhancement New feature or request discussion labels Aug 8, 2023

sk1p force-pushed the prepared-parameters branch from dcaf603 to 8aebb11 Compare August 8, 2023 15:41

sk1p mentioned this pull request Aug 15, 2023

Focus and planning for the v0.13 release #1501

Closed

1 task

sk1p force-pushed the prepared-parameters branch from 8aebb11 to 215a847 Compare September 26, 2023 13:45

sk1p mentioned this pull request May 16, 2024

UDFs as LiberTEM GUI plugins #992

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

WIP/RFC: Prepared parameters #1484

WIP/RFC: Prepared parameters #1484

sk1p commented Aug 8, 2023 •

edited

sk1p commented Aug 8, 2023

matbryan52 commented Aug 9, 2023 •

edited

sk1p commented Aug 9, 2023

WIP/RFC: Prepared parameters #1484

Are you sure you want to change the base?

WIP/RFC: Prepared parameters #1484

Conversation

sk1p commented Aug 8, 2023 • edited

Contributor Checklist:

Reviewer Checklist:

sk1p commented Aug 8, 2023

matbryan52 commented Aug 9, 2023 • edited

sk1p commented Aug 9, 2023

sk1p commented Aug 8, 2023 •

edited

matbryan52 commented Aug 9, 2023 •

edited