Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Copy On Write for the object state #47

Open
CastixGitHub opened this issue May 22, 2022 · 1 comment
Open

Copy On Write for the object state #47

CastixGitHub opened this issue May 22, 2022 · 1 comment

Comments

@CastixGitHub
Copy link
Contributor

answer to #45 (comment)

I implemented it using a high level approach, should be good enough to be used with pymongo dictionaries, but I didn't think about supporting other kind of objects (that would lead again to deepcopy, unless we are able to drop the abstractions and go low level, but I didn't find anything about copy on write on the python ecosystem (this is an interesting read but unrelated) so i don't know how to proceed in that direction, i'm not a kernel developer)

Proof Of Concept

from collections.abc import Mapping, ValuesView


class CopyOnWrite(Mapping):
    def __init__(self, data):
        self.__dict__['_diff'] = {}
        self.__dict__['_data'] = {}
        for k, v in data.items():
            self._data[k] = v if not isinstance(v, dict) else self.__class__(v)

    def __getitem__(self, key):
        if key in self._diff:
            return self._diff[key]
        return self._data[key]

    def __setitem__(self, key, value):
        if not isinstance(value, dict):
            self._diff[key] = value
        else:
            self._diff[key] = self.__class__(value)

    __setattr__ = __setitem__
    __getattr__ = __getitem__

    @property
    def original(self):
        res = {}
        for k, v in self._data.items():
            if isinstance(v, self.__class__):
                res[k] = v.original
            else:
                res[k] = v
        return res

    def __len__(self):
        return len(self._data | self._diff)

    def __iter__(self):
        return iter(self._data | self._diff)

    def values(self):
        # TODO: unsure about this, docs says a view should reflect the changes
        # TODO: does this work at deeper levels?
        values = self._data | self._diff
        for k, v in values.items():
            if isinstance(v, self.__class__):
                values[k] = v._data | v._diff
        return ValuesView(values)
        


if __name__ == '__main__':
    original = {
        'a': 'A',
        'd': {'z': 'Z'},
        'l': [1, 2, 3],
    }
    cow = CopyOnWrite(original)

    assert cow.original == original, cow.original
    
    cow.a = 'AA'
    assert cow.original == original, cow.original
    assert cow.a == 'AA'
    assert cow['a'] == 'AA'

    cow.d.x = 'X'
    cow.d.z = 'ZZZ'
    assert cow.original == original, cow.original
    assert cow.d.x == 'X'
    assert cow.d.z == 'ZZZ'

    cow.l.append(4)
    assert cow.original == original, cow.original
    assert cow.l[-1] == 4

    cow.n = 'new'
    assert cow.original == original, cow.original
    
    assert len(cow) == 4, len(cow)

    for i, n in enumerate(cow):
        assert n == ('a', 'd', 'l', 'n')[i]

    assert list(cow.keys()) == list(original.keys()) + ['n']

    assert list(cow.values()) == ['AA', {'x': 'X', 'z': 'ZZZ'}, [1, 2, 3, 4], 'new']

    assert dict(cow.items()) == dict(zip(list(original.keys()) + ['n'],
                                         ['AA', {'x': 'X', 'z': 'ZZZ'}, [1, 2, 3, 4], 'new']))

How to apply to the current Object/ObjectState/Tracker

I need to check.

@brondsem
Copy link
Collaborator

brondsem commented Jun 2, 2022

So at a high-level I know what copy-on-write means, but to be honest I'm not real sure how to evaluate this. You have a lot of test cases, which is good, but still its pretty complex and detailed. I wouldn't know if there's some other cases that need to be considered. Or if this has a performance impact.

To be honest, I'd rather not add this much more complexity into Ming. It seems like a lot of logic and nuance and possible performance impact (just a guess, no idea). I'll post back on the other merge request as well, about how we might proceed or not.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants