Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exponential memory growth in UnionArray broadcasting #3118

Closed
Teddy-Curtis opened this issue May 14, 2024 · 10 comments · Fixed by #3119
Closed

Exponential memory growth in UnionArray broadcasting #3118

Teddy-Curtis opened this issue May 14, 2024 · 10 comments · Fixed by #3119
Labels
bug The problem described is something that must be fixed

Comments

@Teddy-Curtis
Copy link

Teddy-Curtis commented May 14, 2024

Version of Awkward Array

2.6.4

Description and code to reproduce

Hey!

To jump right in: I have recently switched from using awkward v1.10.3, to the newest version v2.6.4. I have, however, been running into some very weird issues when adding fields to union arrays.

So a bit of backstory:
I am working with electrons and muons, I then try and find dilepton pairs (e-e, mu-mu) and finally I am adding extra information about these dileptons (e.g. angle of dileptons to the missing energy of the event). For some reason, however, after switching to the newer version of awkward I am suddenly running into issues. Specifically, I am running into issues when I add the new fields to the dilepton array. For example, if I try and add new fields like this:
dileptons[("Dilepton", "mass")] = <Some array>
it will eventually crash from using too much memory, but this issue does not happen when I use awkward v1.10.3. Furthermore, even adding a single field without crashing seems to take longer in v2.6.4 than v1.10.3.

Here is a short snippet that hopefully highlights the issue that I am having. However, instead of having much larger arrays, to show what I mean I am using much smaller ones, and so I have to repeatedly apply the buggy line before the memory gets so big it crashes. In my actual code I am not just repeatedly trying to add the same field!

import awkward as ak
import vector
vector.register_awkward()

# Array of electrons
electrons = ak.Array(
        [[],
        [{'pt': 80.5, 'eta': -1.11, 'phi': 1.46, 'mass': 0.0236, "WP90" : True}, {'pt': 25.9, 'eta': -0.954, 'phi': 1.71, 'mass': 0.00993, "WP90" : True}],
        [{'pt': 40.8, 'eta': 0.449, 'phi': 1.88, 'mass': 0.00444, "WP90" : True}, {'pt': 14.7, 'eta': 0.787, 'phi': 3.04, 'mass': 0.00324, "WP90" : True}],
        [{'pt': 55.3, 'eta': 0.449, 'phi': -1.16, 'mass': -0.0147, "WP90" : True}, {'pt': 40.2, 'eta': 0.664, 'phi': -1.95, 'mass': 0.0129, "WP90" : True}]]
)
muons = ak.Array(
        [[{'pt': 41.2, 'eta': -2.24, 'phi': -0.45, 'mass': 0.106}, {'pt': 39.3, 'eta': -1.65, 'phi': -1.01, 'mass': 0.106, "tightID" : True}],
        [],
        [],
        []]
)

# Convert to Momentum4D vectors so that we can do vector operations e.g. add as 4-vectors
muons_4V = ak.Array(muons, with_name = "Momentum4D")
elecs_4V = ak.Array(electrons, with_name = "Momentum4D")

# Find e-e and mu-mu pairs
mm_pairs = ak.combinations(muons_4V, 2, fields = ["LeadLepton", "SubleadLepton"])
ee_pairs = ak.combinations(elecs_4V, 2, fields = ["LeadLepton", "SubleadLepton"])

# Concatenate these together
dileptons = ak.concatenate([ee_pairs, mm_pairs], axis = 1)

# Add the 4-vectors to get the 4-vector of the dilepton
dileptons["Dilepton"] = dileptons.LeadLepton + dileptons.SubleadLepton 


# 

mass = dileptons.Dilepton.mass

# This eventually crashes! It also goes very slow before it crashes!
for i in range(50):
    dileptons[("Dilepton", "mass")] = mass

I should note that this doesn't crash if I just ran once:

dileptons[("Dilepton", "mass")] = mass

It seems to be the creating of the field that retains something which causes memory to build up after repeated calls and then it crashes.

If I run the above code with the older awkward version then it doesn't crash.
If I do this in a jupyter notebook as well, I can't even run

%%timeit
dileptons[("Dilepton", "mass")] = mass

because this ends up crashing before it finishes.

Finally, I should add why I specifically mentioned union arrays.
As you can see in the electron and muon fields, the electron array contains the field "WP90" which is not in muons, and the muon array contains the field "tightID" which is not in the electron array.
Therefore when you concatenate the ee_pairs and mm_pairs you get a union array:

[[{LeadLepton: {pt: 41.2, eta: -2.24, ...}, SubleadLepton: {...}, ...}],
 [{LeadLepton: {pt: 80.5, eta: -1.11, ...}, SubleadLepton: {...}, ...}],
 [{LeadLepton: {pt: 40.8, eta: 0.449, ...}, SubleadLepton: {...}, ...}],
 [{LeadLepton: {pt: 55.3, eta: 0.449, ...}, SubleadLepton: {...}, ...}]]
------------------------------------------------------------------------
type: 4 * var * union[
    {
        LeadLepton: Momentum4D[
            pt: float64,
            eta: float64,
            phi: float64,
            mass: float64,
            WP90: bool
        ],
        SubleadLepton: Momentum4D[
            pt: float64,
            eta: float64,
            phi: float64,
            mass: float64,
            WP90: bool
        ],
        Dilepton: Momentum4D[
            rho: float64,
            phi: float64,
            eta: float64,
            tau: float64
        ]
    },
    {
        LeadLepton: Momentum4D[
            pt: float64,
            eta: float64,
            phi: float64,
            mass: float64,
            tightID: ?bool
        ],
        SubleadLepton: Momentum4D[
            pt: float64,
            eta: float64,
            phi: float64,
            mass: float64,
            tightID: ?bool
        ],
        Dilepton: Momentum4D[
            rho: float64,
            phi: float64,
            eta: float64,
            tau: float64
        ]
    }
]

If I now take out both the fields "WP90", "tightID" from the electron and muons arrays, respectively, I no longer get a union array (I guess because all the fields match):

[[{LeadLepton: {pt: 41.2, eta: -2.24, ...}, SubleadLepton: {...}, ...}],
 [{LeadLepton: {pt: 80.5, eta: -1.11, ...}, SubleadLepton: {...}, ...}],
 [{LeadLepton: {pt: 40.8, eta: 0.449, ...}, SubleadLepton: {...}, ...}],
 [{LeadLepton: {pt: 55.3, eta: 0.449, ...}, SubleadLepton: {...}, ...}]]
------------------------------------------------------------------------
type: 4 * var * {
    LeadLepton: Momentum4D[
        pt: float64,
        eta: float64,
        phi: float64,
        mass: float64
    ],
    SubleadLepton: Momentum4D[
        pt: float64,
        eta: float64,
        phi: float64,
        mass: float64
    ],
    Dilepton: Momentum4D[
        rho: float64,
        phi: float64,
        eta: float64,
        tau: float64
    ]
}

Now if I run the code above, but with the "WP90", "tightID" removed, it no longer crashes! So I am guessing that it is something to do with the union bit? Also not sure exactly why it becomes a union because as far as I can tell the fields "WP90", "tightID" aren't even accessible through the dilepton array?

Not exactly sure what is going on here, but would be interesting to hear your thoughts!

Thanks!

@Teddy-Curtis Teddy-Curtis added the bug (unverified) The problem described would be a bug, but needs to be triaged label May 14, 2024
@agoose77 agoose77 added bug The problem described is something that must be fixed and removed bug (unverified) The problem described would be a bug, but needs to be triaged labels May 14, 2024
@agoose77
Copy link
Collaborator

agoose77 commented May 14, 2024

@jpivarski @tcawlfield this is interesting! I think this is a leak / bug of some kind; even manually stepping through the loop eventually blows up my RSS and leads to paging. i.e. it feels non time-dependent (i.e. not gc; I checked manually!) and rather it feels stateful.

I would suspect this might be our growable buffer (due to the stateful-memory aspect), but I haven't done anything beyond checking this reproduces for me on main.

@jpivarski
Copy link
Member

Here's a simpler reproducer. I found that the behaviors are unnecessary and the first level of list-nesting is unnecessary (axis=10).

import awkward as ak

one_a = ak.Array([{"x": 1, "y": 2}], with_name="T")
one_b = ak.Array([{"x": 1, "y": 2}], with_name="T")
two_a = ak.Array([{"x": 1, "z": 3}], with_name="T")
two_b = ak.Array([{"x": 1, "z": 3}], with_name="T")
three = ak.Array([{"x": 1}, {"x": 1}], with_name="T")

first = ak.zip({"a": one_a, "b": one_b})
second = ak.zip({"a": two_a, "b": two_b})

cat = ak.concatenate([first, second], axis=0)

cat["another"] = three

for i in range(50):
    cat["another", "w"] = three.x

If you do fewer than 50 steps, you can examine this object. It takes a noticeably long time to show you its layout. The layout itself doesn't look bad, but it's odd that it's unresponsive. The issue must involve arrays that are somehow invisible, and they must get double in each step or something, so that the memory use can actually get large after only 50 steps (some $\mathcal{O}(\exp n)$ scaling).

I don't see how GrowableBuffers could be involved. ArrayBuilder and GrowableBuffers aren't used in the broadcast-and-apply step in the range(50) loop. (We could build one_a, one_b, two_a, two_b, three from layouts to avoid all ArrayBuilders, but I don't think that would help much.)

@jpivarski
Copy link
Member

I inserted

import psutil

p = psutil.Process()

at the top and

print(i, p.memory_full_info().uss / 1024**2)

in the loop and got

0 89.3359375
1 89.44140625
2 89.46484375
3 89.46875
4 89.47265625
5 89.484375
6 89.5
7 89.51953125
8 89.53125
9 89.54296875
10 89.546875
11 89.57421875
12 89.6328125
13 89.76171875
14 90.015625
15 90.5234375
16 91.5234375
17 93.5234375
18 97.5234375
19 105.52734375
20 121.52734375
21 153.5078125
22 185.53515625
23 249.53515625
24 377.54296875
25 633.546875
26 1145.546875
27 2169.55078125
28 4217.55078125
29 8313.55078125
30 16505.5546875

It doubles in every step.

@jpivarski
Copy link
Member

Just before the for loop, cat has value

[{a: {x: 1, y: 2}, b: {x: 1, y: 2}, another: {x: 1}},
 {a: {x: 1, z: 3}, b: {x: 1, z: 3}, another: {x: 1}}]

type

2 * union[
    {
        a: T[
            x: int64,
            y: int64
        ],
        b: T[
            x: int64,
            y: int64
        ],
        another: T[
            x: int64
        ]
    },
    {
        a: T[
            x: int64,
            z: int64
        ],
        b: T[
            x: int64,
            z: int64
        ],
        another: T[
            x: int64
        ]
    }
]

and layout

<UnionArray len='2'>
    <tags><Index dtype='int8' len='2'>
        [0 1]
    </Index></tags>
    <index><Index dtype='int64' len='2'>
        [0 0]
    </Index></index>
    <content index='0'>
        <RecordArray is_tuple='false' len='1'>
            <content index='0' field='a'>
                <IndexedArray len='1'>
                    <index><Index dtype='int64' len='1'>[0]</Index></index>
                    <content><RecordArray is_tuple='false' len='1'>
                        <parameter name='__record__'>'T'</parameter>
                        <content index='0' field='x'>
                            <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                        </content>
                        <content index='1' field='y'>
                            <NumpyArray dtype='int64' len='1'>[2]</NumpyArray>
                        </content>
                    </RecordArray></content>
                </IndexedArray>
            </content>
            <content index='1' field='b'>
                <IndexedArray len='1'>
                    <index><Index dtype='int64' len='1'>[0]</Index></index>
                    <content><RecordArray is_tuple='false' len='1'>
                        <parameter name='__record__'>'T'</parameter>
                        <content index='0' field='x'>
                            <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                        </content>
                        <content index='1' field='y'>
                            <NumpyArray dtype='int64' len='1'>[2]</NumpyArray>
                        </content>
                    </RecordArray></content>
                </IndexedArray>
            </content>
            <content index='2' field='another'>
                <RecordArray is_tuple='false' len='1'>
                    <parameter name='__record__'>'T'</parameter>
                    <content index='0' field='x'>
                        <IndexedArray len='1'>
                            <index><Index dtype='int64' len='1'>
                                [0]
                            </Index></index>
                            <content><NumpyArray dtype='int64' len='2'>[1 1]</NumpyArray></content>
                        </IndexedArray>
                    </content>
                </RecordArray>
            </content>
        </RecordArray>
    </content>
    <content index='1'>
        <RecordArray is_tuple='false' len='1'>
            <content index='0' field='a'>
                <IndexedArray len='1'>
                    <index><Index dtype='int64' len='1'>[0]</Index></index>
                    <content><RecordArray is_tuple='false' len='1'>
                        <parameter name='__record__'>'T'</parameter>
                        <content index='0' field='x'>
                            <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                        </content>
                        <content index='1' field='z'>
                            <NumpyArray dtype='int64' len='1'>[3]</NumpyArray>
                        </content>
                    </RecordArray></content>
                </IndexedArray>
            </content>
            <content index='1' field='b'>
                <IndexedArray len='1'>
                    <index><Index dtype='int64' len='1'>[0]</Index></index>
                    <content><RecordArray is_tuple='false' len='1'>
                        <parameter name='__record__'>'T'</parameter>
                        <content index='0' field='x'>
                            <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                        </content>
                        <content index='1' field='z'>
                            <NumpyArray dtype='int64' len='1'>[3]</NumpyArray>
                        </content>
                    </RecordArray></content>
                </IndexedArray>
            </content>
            <content index='2' field='another'>
                <RecordArray is_tuple='false' len='1'>
                    <parameter name='__record__'>'T'</parameter>
                    <content index='0' field='x'>
                        <IndexedArray len='1'>
                            <index><Index dtype='int64' len='1'>
                                [1]
                            </Index></index>
                            <content><NumpyArray dtype='int64' len='2'>[1 1]</NumpyArray></content>
                        </IndexedArray>
                    </content>
                </RecordArray>
            </content>
        </RecordArray>
    </content>
</UnionArray>

The NumpyArray for cat["another", "x"] in the first union branch is referentially identical to the one in the second union branch:

>>> (
...     cat.layout.contents[0].contents[2].contents[0].content
...     is cat.layout.contents[1].contents[2].contents[0].content
... )
True

None of that sounds wrong.

@jpivarski
Copy link
Member

Surrounding the call with Pympler summary tables unsurprisingly says that it's all in NumPy arrays:

                                       types |   # objects |   total size
============================================ | =========== | ============
                               numpy.ndarray |           5 |     16.00 GB
                                        code |           0 |     14.13 KB
                                        dict |           7 |   1024     B
                       weakref.ReferenceType |           5 |    360     B
                  builtin_function_or_method |           5 |    360     B
                           function (finder) |           2 |    288     B
                       awkward.index.Index64 |           4 |    192     B
  awkward.contents.indexedarray.IndexedArray |           2 |     96     B
                                        list |           1 |     72     B
                    functools._lru_list_elem |           1 |     56     B
                                       tuple |           1 |     48     B
      awkward.contents.numpyarray.NumpyArray |           1 |     48     B

@jpivarski
Copy link
Member

Since all NumPy array creations must go through an nplike, here:

def asarray(
self,
obj,
*,
dtype: DTypeLike | None = None,
copy: bool | None = None,
) -> ArrayLikeT | PlaceholderArray:
if isinstance(obj, PlaceholderArray):
assert obj.dtype == dtype or dtype is None
return obj
elif copy:
return self._module.array(obj, dtype=dtype, copy=True)
elif copy is None:
return self._module.asarray(obj, dtype=dtype)
else:
if getattr(obj, "dtype", dtype) != dtype:
raise ValueError(
"asarray was called with copy=False for an array of a different dtype"
)
else:
return self._module.asarray(obj, dtype=dtype)

I added a line to raise an exception if the new array's length is greater than 1000 (because they grow exponentially). Here's the stack trace:

>>> for i in range(30):
...     cat["another", "w"] = three.x
... 
Traceback (most recent call last):
  File "/home/jpivarski/irishep/awkward/src/awkward/highlevel.py", line 1145, in __setitem__
    self._layout = ak.operations.with_field(
  File "/home/jpivarski/irishep/awkward/src/awkward/_dispatch.py", line 64, in dispatch
    next(gen_or_result)
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_with_field.py", line 48, in with_field
    return _impl(array, what, where, highlevel, behavior, attrs)
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_with_field.py", line 68, in _impl
    _impl(base[where[0]], what, where[1:], highlevel, behavior, attrs),
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/content.py", line 512, in __getitem__
    return self._getitem(where)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/content.py", line 529, in _getitem
    return self._getitem_field(where)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/unionarray.py", line 564, in _getitem_field
    return UnionArray.simplified(
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/unionarray.py", line 371, in simplified
    contents[k] = contents[k]._mergemany([self_cont])
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/recordarray.py", line 745, in _mergemany
    merged = forfield[0]._mergemany(forfield[1:])
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/indexedarray.py", line 678, in _mergemany
    nextcontent = contents[0]._mergemany(tail_contents)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/numpyarray.py", line 526, in _mergemany
    next = NumpyArray(
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/numpyarray.py", line 119, in __init__
    self._data = backend.nplike.asarray(data)
  File "/home/jpivarski/irishep/awkward/src/awkward/_nplikes/array_module.py", line 66, in asarray
    raise Exception("HERE")
Exception: HERE

Going from bottom to top, contiguous_arrays is growing out of control here:

contiguous_arrays = []
parameters = self._parameters
for array in head:
if isinstance(array, ak.contents.EmptyArray):
continue
parameters = parameters_intersect(parameters, array._parameters)
if isinstance(array, ak.contents.NumpyArray):
contiguous_arrays.append(array.data)
else:
raise AssertionError(
"cannot merge "
+ type(self).__name__
+ " with "
+ type(array).__name__
)
contiguous_arrays = self._backend.nplike.concat(contiguous_arrays)

which comes from the contents here:

contents = []
contentlength_so_far = 0
length_so_far = 0
nextindex = ak.index.Index64.empty(
total_length, nplike=self._backend.index_nplike
)
parameters = self._parameters
for array in head:
if isinstance(array, ak.contents.EmptyArray):
continue
if isinstance(
array,
(
ak.contents.ByteMaskedArray,
ak.contents.BitMaskedArray,
ak.contents.UnmaskedArray,
),
):
array = array.to_IndexedOptionArray64()
if isinstance(
array, (ak.contents.IndexedOptionArray, ak.contents.IndexedArray)
):
parameters = parameters_intersect(parameters, array._parameters)
contents.append(array.content)
array_index = array.index
assert (
nextindex.nplike is self._backend.index_nplike
and array_index.nplike is self._backend.index_nplike
)
self._backend.maybe_kernel_error(
self._backend[
"awkward_IndexedArray_fill",
nextindex.dtype.type,
array_index.dtype.type,
](
nextindex.data,
length_so_far,
array_index.data,
array.length,
contentlength_so_far,
)
)
contentlength_so_far += array.content.length
length_so_far += array.length
else:
contents.append(array)
assert nextindex.nplike is self._backend.index_nplike
self._backend.maybe_kernel_error(
self._backend[
"awkward_IndexedArray_fill_count",
nextindex.dtype.type,
](
nextindex.data,
length_so_far,
array.length,
contentlength_so_far,
)
)
contentlength_so_far += array.length
length_so_far += array.length
# Categoricals may only survive if all contents are categorical
if (
parameters is not None
and parameters.get("__array__") == "categorical"
):
parameters = {**parameters}
del parameters["__array__"]
tail_contents = contents[1:]
nextcontent = contents[0]._mergemany(tail_contents)

which are arrays in one of the slots of for_each_field here:

for_each_field = []
for field in self.contents:
trimmed = field[0 : self.length]
for_each_field.append([trimmed])
if self.is_tuple:
for array in headless:
if isinstance(array, ak.contents.EmptyArray):
continue
parameters = parameters_intersect(parameters, array._parameters)
if isinstance(array, ak.contents.RecordArray):
if self.is_tuple:
if len(self.contents) == len(array.contents):
for i in range(len(self.contents)):
field = array[self.index_to_field(i)]
for_each_field[i].append(field[0 : array.length])
else:
raise ValueError(
"cannot merge tuples with different numbers of fields"
)
else:
raise ValueError("cannot merge tuple with non-tuple record")
else:
raise AssertionError(
"cannot merge "
+ type(self).__name__
+ " with "
+ type(array).__name__
)
else:
these_fields = self._fields.copy()
these_fields.sort()
for array in headless:
parameters = parameters_intersect(parameters, array._parameters)
if isinstance(array, ak.contents.RecordArray):
if not array.is_tuple:
those_fields = array._fields.copy()
those_fields.sort()
if these_fields == those_fields:
for i in range(len(self.contents)):
field = array[self.index_to_field(i)]
trimmed = field[0 : array.length]
for_each_field[i].append(trimmed)
else:
raise AssertionError(
"cannot merge records with different sets of field names"
)
else:
raise AssertionError("cannot merge non-tuple record with tuple")
elif isinstance(array, ak.contents.EmptyArray):
pass
else:
raise AssertionError(
"cannot merge "
+ type(self).__name__
+ " with "
+ type(array).__name__
)
nextcontents = []
minlength = ak._util.UNSET
for forfield in for_each_field:
merged = forfield[0]._mergemany(forfield[1:])

which is one of the contents here:

contents = []
# For each outer union content
for i, self_cont in enumerate(self_contents):
# Is one of our new contents also a union?
if isinstance(self_cont, UnionArray):
innertags = self_cont._tags
innerindex = self_cont._index
innercontents = self_cont._contents
# Update outermost parameters with this union's parameters
parameters = parameters_union(self_cont._parameters, parameters)
# For each inner union content
for j, inner_cont in enumerate(innercontents):
unmerged = True
# For each "final" outer union content
for k in range(len(contents)):
# Try and merge inner union content with running outer-union contentca
if contents[k]._mergeable_next(inner_cont, mergebool):
backend.maybe_kernel_error(
backend[
"awkward_UnionArray_simplify",
tags.dtype.type,
index.dtype.type,
self_tags.dtype.type,
self_index.dtype.type,
innertags.dtype.type,
innerindex.dtype.type,
](
tags.data,
index.data,
self_tags.data,
self_index.data,
innertags.data,
innerindex.data,
k,
j,
i,
length,
contents[k].length,
)
)
contents[k] = contents[k]._mergemany([inner_cont])
unmerged = False
break
# Did we fail to merge any of the final outer contents with this inner union content?
if unmerged:
backend.maybe_kernel_error(
backend[
"awkward_UnionArray_simplify",
tags.dtype.type,
index.dtype.type,
self_tags.dtype.type,
self_index.dtype.type,
innertags.dtype.type,
innerindex.dtype.type,
](
tags.data,
index.data,
self_tags.data,
self_index.data,
innertags.data,
innerindex.data,
len(contents),
j,
i,
length,
0,
)
)
contents.append(inner_cont)
else:
unmerged = True
for k in range(len(contents)):
if contents[k] is self_cont:
backend.maybe_kernel_error(
backend[
"awkward_UnionArray_simplify_one",
tags.dtype.type,
index.dtype.type,
self_tags.dtype.type,
self_index.dtype.type,
](
tags.data,
index.data,
self_tags.data,
self_index.data,
k,
i,
length,
0,
)
)
unmerged = False
break
elif contents[k]._mergeable_next(self_cont, mergebool):
backend.maybe_kernel_error(
backend[
"awkward_UnionArray_simplify_one",
tags.dtype.type,
index.dtype.type,
self_tags.dtype.type,
self_index.dtype.type,
](
tags.data,
index.data,
self_tags.data,
self_index.data,
k,
i,
length,
contents[k].length,
)
)
contents[k] = contents[k]._mergemany([self_cont])

I don't think the issue goes any higher.

@jpivarski
Copy link
Member

jpivarski commented May 14, 2024

The exponential memory growth (not a memory leak) is not related to the internal cross-references (the fact that cat.layout.contents[0].contents[2].contents[0].content is cat.layout.contents[1].contents[2].contents[0].content) because if I break that connection by projecting the two IndexedArrays into simple NumpyArrays:

cat.layout.contents[0].contents[2]._contents[0] = cat.layout.contents[0].contents[2].contents[0].project()
cat.layout.contents[1].contents[2]._contents[0] = cat.layout.contents[1].contents[2].contents[0].project()

the issue persists. Implicitly, this also rules out the IndexedArray part of the stack trace.

I doubt it has anything to do with the NumpyArray or RecordArray parts, either. It's very likely an issue in UnionArray.simplified, particularly since the issue only ever happens if the arrays are UnionArrays.

@jpivarski jpivarski changed the title Memory Leak when adding fields to Union Arrays Exponential memory growth in UnionArray broadcasting May 14, 2024
@jpivarski
Copy link
Member

Even if we use the above to project out the original IndexedArray, (cat.layout just before the loop):

<UnionArray len='2'>
    <tags><Index dtype='int8' len='2'>
        [0 1]
    </Index></tags>
    <index><Index dtype='int64' len='2'>
        [0 0]
    </Index></index>
    <content index='0'>
        <RecordArray is_tuple='false' len='1'>
            <content index='0' field='a'>
                <IndexedArray len='1'>
                    <index><Index dtype='int64' len='1'>[0]</Index></index>
                    <content><RecordArray is_tuple='false' len='1'>
                        <parameter name='__record__'>'T'</parameter>
                        <content index='0' field='x'>
                            <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                        </content>
                        <content index='1' field='y'>
                            <NumpyArray dtype='int64' len='1'>[2]</NumpyArray>
                        </content>
                    </RecordArray></content>
                </IndexedArray>
            </content>
            <content index='1' field='b'>
                <IndexedArray len='1'>
                    <index><Index dtype='int64' len='1'>[0]</Index></index>
                    <content><RecordArray is_tuple='false' len='1'>
                        <parameter name='__record__'>'T'</parameter>
                        <content index='0' field='x'>
                            <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                        </content>
                        <content index='1' field='y'>
                            <NumpyArray dtype='int64' len='1'>[2]</NumpyArray>
                        </content>
                    </RecordArray></content>
                </IndexedArray>
            </content>
            <content index='2' field='another'>
                <RecordArray is_tuple='false' len='1'>
                    <parameter name='__record__'>'T'</parameter>
                    <content index='0' field='x'>
                        <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                    </content>
                </RecordArray>
            </content>
        </RecordArray>
    </content>
    <content index='1'>
        <RecordArray is_tuple='false' len='1'>
            <content index='0' field='a'>
                <IndexedArray len='1'>
                    <index><Index dtype='int64' len='1'>[0]</Index></index>
                    <content><RecordArray is_tuple='false' len='1'>
                        <parameter name='__record__'>'T'</parameter>
                        <content index='0' field='x'>
                            <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                        </content>
                        <content index='1' field='z'>
                            <NumpyArray dtype='int64' len='1'>[3]</NumpyArray>
                        </content>
                    </RecordArray></content>
                </IndexedArray>
            </content>
            <content index='1' field='b'>
                <IndexedArray len='1'>
                    <index><Index dtype='int64' len='1'>[0]</Index></index>
                    <content><RecordArray is_tuple='false' len='1'>
                        <parameter name='__record__'>'T'</parameter>
                        <content index='0' field='x'>
                            <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                        </content>
                        <content index='1' field='z'>
                            <NumpyArray dtype='int64' len='1'>[3]</NumpyArray>
                        </content>
                    </RecordArray></content>
                </IndexedArray>
            </content>
            <content index='2' field='another'>
                <RecordArray is_tuple='false' len='1'>
                    <parameter name='__record__'>'T'</parameter>
                    <content index='0' field='x'>
                        <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                    </content>
                </RecordArray>
            </content>
        </RecordArray>
    </content>
</UnionArray>

(See? The field='another' part of the outer RecordArray has a one-field RecordArray of just NumpyArray—no IndexedArray.)

Then, running the for loop for 30 steps turns it into

<UnionArray len='2'>
    <tags><Index dtype='int8' len='2'>
        [0 1]
    </Index></tags>
    <index><Index dtype='int64' len='2'>
        [0 0]
    </Index></index>
    <content index='0'>
        <RecordArray is_tuple='false' len='1'>
            <content index='0' field='a'>
                <IndexedArray len='1'>
                    <index><Index dtype='int64' len='1'>[0]</Index></index>
                    <content><RecordArray is_tuple='false' len='1'>
                        <parameter name='__record__'>'T'</parameter>
                        <content index='0' field='x'>
                            <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                        </content>
                        <content index='1' field='y'>
                            <NumpyArray dtype='int64' len='1'>[2]</NumpyArray>
                        </content>
                    </RecordArray></content>
                </IndexedArray>
            </content>
            <content index='1' field='b'>
                <IndexedArray len='1'>
                    <index><Index dtype='int64' len='1'>[0]</Index></index>
                    <content><RecordArray is_tuple='false' len='1'>
                        <parameter name='__record__'>'T'</parameter>
                        <content index='0' field='x'>
                            <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                        </content>
                        <content index='1' field='y'>
                            <NumpyArray dtype='int64' len='1'>[2]</NumpyArray>
                        </content>
                    </RecordArray></content>
                </IndexedArray>
            </content>
            <content index='2' field='another'>
                <RecordArray is_tuple='false' len='1'>
                    <parameter name='__record__'>'T'</parameter>
                    <content index='0' field='x'>
                        <IndexedArray len='1'>
                            <index><Index dtype='int64' len='1'>[0]</Index></index>
                            <content><NumpyArray dtype='int64' len='1073741824'>[1 1 1 ... 1 1 1]</NumpyArray></content>
                        </IndexedArray>
                    </content>
                    <content index='1' field='w'>
                        <IndexedArray len='1'>
                            <index><Index dtype='int64' len='1'>
                                [0]
                            </Index></index>
                            <content><NumpyArray dtype='int64' len='2'>[1 1]</NumpyArray></content>
                        </IndexedArray>
                    </content>
                </RecordArray>
            </content>
        </RecordArray>
    </content>
    <content index='1'>
        <RecordArray is_tuple='false' len='1'>
            <content index='0' field='a'>
                <IndexedArray len='1'>
                    <index><Index dtype='int64' len='1'>[0]</Index></index>
                    <content><RecordArray is_tuple='false' len='1'>
                        <parameter name='__record__'>'T'</parameter>
                        <content index='0' field='x'>
                            <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                        </content>
                        <content index='1' field='z'>
                            <NumpyArray dtype='int64' len='1'>[3]</NumpyArray>
                        </content>
                    </RecordArray></content>
                </IndexedArray>
            </content>
            <content index='1' field='b'>
                <IndexedArray len='1'>
                    <index><Index dtype='int64' len='1'>[0]</Index></index>
                    <content><RecordArray is_tuple='false' len='1'>
                        <parameter name='__record__'>'T'</parameter>
                        <content index='0' field='x'>
                            <NumpyArray dtype='int64' len='1'>[1]</NumpyArray>
                        </content>
                        <content index='1' field='z'>
                            <NumpyArray dtype='int64' len='1'>[3]</NumpyArray>
                        </content>
                    </RecordArray></content>
                </IndexedArray>
            </content>
            <content index='2' field='another'>
                <RecordArray is_tuple='false' len='1'>
                    <parameter name='__record__'>'T'</parameter>
                    <content index='0' field='x'>
                        <IndexedArray len='1'>
                            <index><Index dtype='int64' len='1'>[1073741823]</Index></index>
                            <content><NumpyArray dtype='int64' len='1073741824'>[1 1 1 ... 1 1 1]</NumpyArray></content>
                        </IndexedArray>
                    </content>
                    <content index='1' field='w'>
                        <IndexedArray len='1'>
                            <index><Index dtype='int64' len='1'>
                                [1]
                            </Index></index>
                            <content><NumpyArray dtype='int64' len='2'>[1 1]</NumpyArray></content>
                        </IndexedArray>
                    </content>
                </RecordArray>
            </content>
        </RecordArray>
    </content>
</UnionArray>

The x field has been turned into an IndexedArray whose content has length 2**30, and the IndexedArray just picks out the last element.

@jpivarski
Copy link
Member

Projecting away all of the IndexedArrays,

cat.layout.contents[0]._contents[0] = cat.layout.contents[0].contents[0].project()
cat.layout.contents[1]._contents[0] = cat.layout.contents[1].contents[0].project()
cat.layout.contents[0]._contents[1] = cat.layout.contents[0].contents[1].project()
cat.layout.contents[1]._contents[1] = cat.layout.contents[1].contents[1].project()
cat.layout.contents[0].contents[2]._contents[0] = cat.layout.contents[0].contents[2].contents[0].project()
cat.layout.contents[1].contents[2]._contents[0] = cat.layout.contents[1].contents[2].contents[0].project()

and raising an exception the next time an IndexedArray is constructed provides this stack trace:

Traceback (most recent call last):
  File "/home/jpivarski/irishep/awkward/src/awkward/highlevel.py", line 1145, in __setitem__
    self._layout = ak.operations.with_field(
  File "/home/jpivarski/irishep/awkward/src/awkward/_dispatch.py", line 64, in dispatch
    next(gen_or_result)
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_with_field.py", line 48, in with_field
    return _impl(array, what, where, highlevel, behavior, attrs)
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_with_field.py", line 68, in _impl
    _impl(base[where[0]], what, where[1:], highlevel, behavior, attrs),
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/content.py", line 512, in __getitem__
    return self._getitem(where)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/content.py", line 529, in _getitem
    return self._getitem_field(where)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/unionarray.py", line 564, in _getitem_field
    return UnionArray.simplified(
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/unionarray.py", line 434, in simplified
    next = contents[0]._carry(index, True)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/recordarray.py", line 510, in _carry
    return ak.contents.IndexedArray(nextindex, self, parameters=None)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/indexedarray.py", line 142, in __init__
    raise Exception("HERE")
Exception: HERE

But that's not it—even with allow_lazy=False in

next = contents[0]._carry(index, True)

we get another IndexedArray creation with this stack trace:

Traceback (most recent call last):
  File "/home/jpivarski/irishep/awkward/src/awkward/highlevel.py", line 1145, in __setitem__
    self._layout = ak.operations.with_field(
  File "/home/jpivarski/irishep/awkward/src/awkward/_dispatch.py", line 64, in dispatch
    next(gen_or_result)
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_with_field.py", line 48, in with_field
    return _impl(array, what, where, highlevel, behavior, attrs)
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_with_field.py", line 66, in _impl
    return _impl(
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_with_field.py", line 158, in _impl
    out = ak._broadcasting.broadcast_and_apply(
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 968, in broadcast_and_apply
    out = apply_step(
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 946, in apply_step
    return continuation()
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 915, in continuation
    return broadcast_any_list()
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 622, in broadcast_any_list
    outcontent = apply_step(
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 946, in apply_step
    return continuation()
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 907, in continuation
    return broadcast_any_union()
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 809, in broadcast_any_union
    nextinputs.append(x[mask].project(next(it_j_contents)))
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/unionarray.py", line 705, in project
    return self._contents[index]._carry(nextcarry, True)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/recordarray.py", line 510, in _carry
    return ak.contents.IndexedArray(nextindex, self, parameters=None)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/indexedarray.py", line 142, in __init__
    raise Exception("HERE")
Exception: HERE

And with allow_lazy=False in

return self._contents[index]._carry(nextcarry, True)

we get

Traceback (most recent call last):
  File "/home/jpivarski/irishep/awkward/src/awkward/highlevel.py", line 1145, in __setitem__
    self._layout = ak.operations.with_field(
  File "/home/jpivarski/irishep/awkward/src/awkward/_dispatch.py", line 64, in dispatch
    next(gen_or_result)
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_with_field.py", line 48, in with_field
    return _impl(array, what, where, highlevel, behavior, attrs)
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_with_field.py", line 66, in _impl
    return _impl(
  File "/home/jpivarski/irishep/awkward/src/awkward/operations/ak_with_field.py", line 158, in _impl
    out = ak._broadcasting.broadcast_and_apply(
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 968, in broadcast_and_apply
    out = apply_step(
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 946, in apply_step
    return continuation()
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 915, in continuation
    return broadcast_any_list()
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 622, in broadcast_any_list
    outcontent = apply_step(
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 946, in apply_step
    return continuation()
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 907, in continuation
    return broadcast_any_union()
  File "/home/jpivarski/irishep/awkward/src/awkward/_broadcasting.py", line 811, in broadcast_any_union
    nextinputs.append(x[mask])
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/content.py", line 512, in __getitem__
    return self._getitem(where)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/content.py", line 657, in _getitem
    return self._getitem(layout)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/content.py", line 614, in _getitem
    self._carry(carry, allow_lazy), where.shape
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/recordarray.py", line 510, in _carry
    return ak.contents.IndexedArray(nextindex, self, parameters=None)
  File "/home/jpivarski/irishep/awkward/src/awkward/contents/indexedarray.py", line 142, in __init__
    raise Exception("HERE")
Exception: HERE

Turning off these two allow_lazy options:

--- a/src/awkward/contents/unionarray.py
+++ b/src/awkward/contents/unionarray.py
@@ -431,7 +431,7 @@ class UnionArray(UnionMeta[Content], Content):
             ]
 
         if len(contents) == 1:
-            next = contents[0]._carry(index, True)
+            next = contents[0]._carry(index, False)
             return next.copy(parameters=parameters_union(next._parameters, parameters))
 
         else:
@@ -702,7 +702,7 @@ class UnionArray(UnionMeta[Content], Content):
         nextcarry = ak.index.Index64(
             tmpcarry.data[: lenout[0]], nplike=self._backend.index_nplike
         )
-        return self._contents[index]._carry(nextcarry, True)
+        return self._contents[index]._carry(nextcarry, False)
 
     @staticmethod
     def regular_index(

is sufficient to stop the memory explosion. Turning them both off is also necessary: just one isn't enough to do it.

So now I need to think. Should these be turned off? Well,

  • allow_lazy=False is always correct, but can be a performance issue—it forces RecordArrays to be sliced deeply, and they may have many fields.
  • allow_lazy=True can sometimes cause this kind of issue: a lazy operation of a lazy operation builds up the (implicit) graph of slices. Fortunately, we squash IndexedArrays that contain IndexedArrays, so we don't end up with a deep tree of them. What we got instead was runaway content growth.
  • This happens in UnionArray, and UnionArray is our least-supported node type. If we have worse performance for wide RecordArrays in UnionArray, that's a (not too uncommon) corner-case of a (quite uncommon) corner case. Wrong results would be unacceptable, but worse performance in a corner of a corner?
  • On the other hand, UnionArray.simplify happens in every concatenate, and this is a common operation.

Maybe the solution should be highly targeted to this situation: identify and catch content growth in RecordArray._carry with allow_lazy=True?

@jpivarski jpivarski linked a pull request May 14, 2024 that will close this issue
@jpivarski
Copy link
Member

I decided to just set allow_lazy=False. The results won't be incorrect, and I don't think we'll see any bad performance because of this. (I think this is a case of trying it and finding out, since the definition of "good performance" depends on specific test-cases. In particular, real-world test-cases.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug The problem described is something that must be fixed
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants