PickShiftedUDF #1635

sivborg · 2024-05-03T12:47:00Z

Contributor Checklist:

I have added or updated my entry in the creators.json file
I have added a changelog entry for my contribution
I have added/updated documentation for all user-facing changes
I have added/updated test cases
I have included the rebuilt production build of the client (only if changes were made to the GUI)

Reviewer Checklist:

/azp run libertem.libertem-data passed
No import of GPL code from MIT code

This is an addition of a new PickUDF that is able to shift the data that is retrieved. This is useful because it makes it possible to correct raw data for unintended shifts, for example descan which can obstruct electromagnetic signal in STEM-DPC. This functions as the previous PickUDF, but you can give corrective regression parameters indicating how much to shift the data. This can be found for example by using the CoMUDF:

from libertem import api
from libertem.udf.com import CoMUDF
from libertem.udf.raw import PickShiftedUDF

ctx = api.Context()

dataset = ctx.load('dask', dask_array=s.data, sig_dims=2)

comudf = CoMUDF.with_params(cy=128, cx=128, r=1000, regression=1)
comresult = ctx.run_udf(dataset=dataset, udf=comudf)

pickudf = PickShiftedUDF(regression_coefficients=comresult["regression"].data)
result = ctx.run_udf(udf=pickudf, dataset=dataset, roi=(10,10))

The way this is implemented is with scipy.ndimage.shift, done as a postprocessing step in the UDF. The shifts are found with the self.meta.coordinates attribute, which is passed around as a result buffer. So perhaps there is a cleaner way of doing this. The UDF seems to work well for my practical purposes and I have set up some accompanying tests.

codecov · 2024-05-03T12:55:50Z

Codecov Report

Attention: Patch coverage is 84.78261% with 7 lines in your changes are missing coverage. Please review.

Project coverage is 69.89%. Comparing base (59529c1) to head (791ce78).

❗ Current head 791ce78 differs from pull request most recent head d3dd4a6. Consider uploading reports for the commit d3dd4a6 to get more accurate results

Files	Patch %	Lines
src/libertem/udf/raw.py	84.78%	3 Missing and 4 partials ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #1635      +/-   ##
==========================================
+ Coverage   69.85%   69.89%   +0.03%     
==========================================
  Files         325      325              
  Lines       27255    27299      +44     
  Branches     3139     3147       +8     
==========================================
+ Hits        19039    19080      +41     
  Misses       7647     7647              
- Partials      569      572       +3

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

uellue

Looks like a good idea and a good start!

For changelog you can add a file in docs/source/changelog/features/. There are already two, you can just follow the pattern. :-)

uellue · 2024-05-06T08:22:50Z

src/libertem/udf/raw.py

+    Parameters
+    ----------
+    regression_coefficients : (3, 2) matrix
+        A vector of regression coefficients for x- and y-directions.
+        The regression coefficients make up two planes that describe
+        the descan and are used to remove it.
+        The coefficients are arranged in the following manner:
+        ((y, x), (dy/dy, dx/dy), (dy/dx, dx/dx)) where y and x are
+        the shifts at coordinates (0, 0). The displayed image will be
+        shifted in x- and y-directions by these parameters. By default
+        `None`, which corresponds to no shifts.
+    '''


What about supporting the same options as CoMUDF, or even use the CoMParams class:

LiberTEM/src/libertem/udf/com.py

Lines 30 to 44 in 59529c1

class CoMParams(NamedTuple):

"""

Used to instantiate :class:`~libertem.udf.com.CoMUDF`.

See :meth:`~libertem.udf.com.CoMUDF.with_params`

for descriptions of each parameter. In most cases use this

method rather than instantiating this class directly.

"""

cy: Optional[float] = None

cx: Optional[float] = None

r: float = float('inf')

ri: Union[float, None] = 0.

scan_rotation: float = 0.

flip_y: bool = False

regression: RegressionOptionsT = RegressionOptions.NO_REGRESSION

scan_rotation and flip_y would apply perfectly for this use case and should be easy to implement with the scipy.ndimage functions.

In particular when dealing with DPC, it can help if in the picked frames everything points in the right directions relative to the scan! By using the same parameters, one can use them for both. As a side effect one could use this to create a corrected dataset! Could be merged with corresponding code in pyxem as a perspective?

uellue · 2024-05-06T08:28:49Z

src/libertem/udf/raw.py

+    def get_results(self):
+
+        intensity = self.results.get_buffer('intensity').data
+
+        regression_coefficients = np.array(self.params.regression_coefficients)
+        # Only do shifts if nonzero regression coefficients
+        if regression_coefficients.any():
+
+            coordinates = self.results.get_buffer('coordinates').raw_data
+
+            prefix_ones = np.ones((*coordinates.shape[:-1], 1))
+            coordinates = np.concatenate((prefix_ones, coordinates), axis=-1)
+            shifts = np.dot(coordinates, self.params.regression_coefficients)
+
+            shifted = []
+            for intens, shift in zip(intensity, shifts):
+                shifted.append(scipy.ndimage.shift(intens, -shift, mode="constant"))
+
+            if shifted:
+                intensity = np.stack(shifted)
+
+        return {
+            "intensity": intensity,
+            "coordinates": self.results.get_buffer('coordinates').raw_data
+        }


What about using process_frame() and doing all this there? It would be easier to read, and the processing would be performed on the workers, not on the main process.

Later one could look into using process_partition() or postprocess() to process larger chunks, which would be beneficial for using GPUs.

uellue · 2024-05-06T08:36:45Z

tests/udf/test_pick.py

+
+
+def test_pick_shifted(lt_ctx, delayed_ctx):
+    data = _mk_random(size=(16, 16, 16, 16), dtype="float32")
+    # data = np.ones((16, 16, 16, 16), dtype="float32")
+    # data = np.arange(0, 16*16*16*16, dtype="float32").reshape((16, 16, 16, 16))
+    dataset = MemoryDataSet(data=data, tileshape=(3, 7, 16),
+                            num_partitions=7, sig_dims=2)
+    roi = np.random.choice([True, False], size=dataset.shape.nav)
+    roi[0] = True
+
+    regression_coefficients = np.array([(3, 3), (0, 0), (0, 0)])
+
+    udf = PickShiftedUDF(regression_coefficients=regression_coefficients)
+    res = lt_ctx.run_udf(dataset=dataset, udf=udf, roi=roi)
+    res_delayed = delayed_ctx.run_udf(dataset=dataset, udf=udf, roi=roi)
+
+    assert np.allclose(data[roi][..., 3:, 3:], res['intensity'].data[..., :-3, :-3], 1e-15)
+    assert np.allclose(data[roi][..., 3:, 3:], res_delayed['intensity'].data[..., :-3, :-3], 1e-15)
+
+    assert data.dtype == res['intensity'].data.dtype
+    assert data.dtype == res_delayed['intensity'].data.dtype
+
+
+def test_pick_shifted_empty_roi(lt_ctx, delayed_ctx):
+    data = _mk_random(size=(16, 16, 16, 16), dtype="float32")
+    dataset = MemoryDataSet(data=data, tileshape=(3, 7, 7),
+                            num_partitions=7, sig_dims=2)
+    roi = np.zeros(dataset.shape.nav, dtype=bool)
+
+    regression_coefficients = np.array([(3, 3), (0, 0), (0, 0)])
+
+    udf = PickShiftedUDF(regression_coefficients=regression_coefficients)
+    res = lt_ctx.run_udf(dataset=dataset, udf=udf, roi=roi)
+    res_delayed = delayed_ctx.run_udf(dataset=dataset, udf=udf, roi=roi)
+
+    assert np.allclose(data[roi], res['intensity'].data)
+    assert np.allclose(data[roi], res_delayed['intensity'].data)
+
+    assert data[roi].shape == res['intensity'].data.shape
+    assert data[roi].shape == res_delayed['intensity'].data.shape
+
+    assert data.dtype == res['intensity'].data.dtype
+    assert data.dtype == res_delayed['intensity'].data.dtype


The regression coefficients should all be different to make sure we are not mixing them anywhere. In particular, the dy/dx ... coefficients require tests.

We should also confirm that the coefficients act the same as in CoMUDF, in particular if we support more parameters, as suggested above. A simple and comprehensive test would be comparing CoMUDF on an uncorrected dataset vs. CoMUDF on a corrected dataset.

uellue · 2024-05-06T08:38:37Z

src/libertem/udf/raw.py

+
+
+class PickShiftedUDF(UDF):
+    '''


Suggested change

'''

'''

.. versionadded:: 0.15.0

uellue · 2024-05-06T08:44:29Z

src/libertem/udf/raw.py

-    def __init__(self):
-        super().__init__()
+    def __init__(self, **kwargs):
+        super().__init__(**kwargs)


Hm, why this change? The PickShiftedUDF is subclassed from UDF, so PickUDF doesn't need to pass parameters through. If one wanted a common subclass to share code, it would be cleaner to have a BasePickUDF with the common parts, and derive both PickUDF and PickShiftedUDF from that, each with a final constructor. That way the parameters are clear for autocompletion etc.

uellue · 2024-05-06T13:37:42Z

/azp run libertem.libertem-data

azure-pipelines · 2024-05-06T13:37:53Z

Azure Pipelines successfully started running 1 pipeline(s).

sjdagenb added 5 commits May 2, 2024 14:24

Created corrected picker for descan correction

eaf62e3

Renamed PickCorrectedUDF to PickShifterUDF

1dd739f

Skip shifts if all coefficients are zero

32a732d

Added PickShiftedUDF tests

32c4fc2

Small code fixes

791ce78

uellue requested changes May 6, 2024

View reviewed changes

Light refactoring of common functionality

d3dd4a6

sk1p added this to the 0.15 milestone May 16, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

PickShiftedUDF #1635

PickShiftedUDF #1635

sivborg commented May 3, 2024 •

edited

codecov bot commented May 3, 2024 •

edited

uellue left a comment

uellue May 6, 2024

uellue May 6, 2024

uellue May 6, 2024

uellue May 6, 2024

uellue May 6, 2024

uellue commented May 6, 2024

azure-pipelines bot commented May 6, 2024

	class CoMParams(NamedTuple):
	"""
	Used to instantiate :class:`~libertem.udf.com.CoMUDF`.

	See :meth:`~libertem.udf.com.CoMUDF.with_params`
	for descriptions of each parameter. In most cases use this
	method rather than instantiating this class directly.
	"""
	cy: Optional[float] = None
	cx: Optional[float] = None
	r: float = float('inf')
	ri: Union[float, None] = 0.
	scan_rotation: float = 0.
	flip_y: bool = False
	regression: RegressionOptionsT = RegressionOptions.NO_REGRESSION

-    '''
+    '''
+    .. versionadded:: 0.15.0

PickShiftedUDF #1635

Are you sure you want to change the base?

PickShiftedUDF #1635

Conversation

sivborg commented May 3, 2024 • edited

Contributor Checklist:

Reviewer Checklist:

codecov bot commented May 3, 2024 • edited

Codecov Report

uellue left a comment

Choose a reason for hiding this comment

uellue May 6, 2024

Choose a reason for hiding this comment

uellue May 6, 2024

Choose a reason for hiding this comment

uellue May 6, 2024

Choose a reason for hiding this comment

uellue May 6, 2024

Choose a reason for hiding this comment

uellue May 6, 2024

Choose a reason for hiding this comment

uellue commented May 6, 2024

azure-pipelines bot commented May 6, 2024

sivborg commented May 3, 2024 •

edited

codecov bot commented May 3, 2024 •

edited