Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add commit_many() function to perform builk updates #384

Open
RoundelRider opened this issue May 12, 2023 · 1 comment
Open

Add commit_many() function to perform builk updates #384

RoundelRider opened this issue May 12, 2023 · 1 comment

Comments

@RoundelRider
Copy link

RoundelRider commented May 12, 2023

The Document object tends to get used in an inefficient manner when needing to change a value in multiple documents. For example, a developer may tend to use something like this to make multiple changes:

staff_cursor = Staff.find({'status': 'active', 'location': 'Springfield'})
for staff in staff_cursor:
    staff.notifications.append("Good job")
    staff.commit()

This can be accomplished by accessing the collections methods directly, but disregards the field validation and abstraction of the document class. For example, the same operation could be written as:

Staff.collection.update_many({'status': 'active', 'location': 'Springfield'}, {'notifications': {'$push': 'Good Job'}})

A better approach would be to enhance the DocumentImplementation for each database to include a commit_many(...) function, along with a new to_mongo_update_many(...) function in the BaseDataProxy object. Then the programmer could use a statement such as:

Staff(notifications=['Good Job']).commit_many({'status': 'active', 'location': 'Springfield'})

This allows the document objects to be used with field validation and allows the pre/post update functions to be called (for example to update a modification date/time field), and generates the correct $set and $push combination depending on field type.

The BaseDocument commit_many function could accept the following parameters:

def commit_many(self, conditions=None, array_update_method=ArrayUpdateMethod.ADD)
"""
Commit changes to multiple documents per conditions.
:param conditions: Only perform commit if matching record in db satisfies condition(s) (e.g. version number).
Raises :class:umongo.exceptions.UpdateError if the conditions are not satisfied.
:param array_update_method: ADD (default) to add array elements to document or REPLACE to replace fields containing arrays with the supplied value(s)
:return: A :class: pymongo.results.UpdateResult
"""

And the BaseDataProxy object to_mongo_update_many function has the following declaration:

def to_mongo_update_many(self, array_update_method=ArrayUpdateMethod.ADD)

The to_mongo_update_many() function checks the changed fields and builds the set, unset, and push lists as appropriate. If the array update method is REPLACE, then $set is used instead of $push for fields of type list. For subfields, the fields are referenced with dot notation to specify the sub-field in the set or push operation. For example, if we have a document that contains embedded dictionaries such as:

{'_id': 'abc123', 'type': 'comms', 'settings': {'lightspeed_enabled': True, 'channels': {'normal_traffic': 100, 'emergency': 911}}}

to change the 'normal_traffic' channel on all 'comms' records,
MyDoc(settings={'channels': {'normal_traffic': 101}}).commit_many({'type': 'comms'})
would produce the update set of:
{'$set': {'settings.channels.normal_traffic': 101}}

I've implemented this in my own application (by extending the Instance, Builder, and BullkDocuments specific to PyMongo and MotorAsyncIO) but this would be a lot simpler to implement directly in the uMongo code base and could be of use to others. I would be happy to implement this feature if approved.

@lafrech
Copy link
Collaborator

lafrech commented Jun 4, 2023

Hi.

Thanks for sharing your work on this.

You may embed code blocks here by surrounding them with back quotes. Adding "py" or "python" helps for syntax highlighting. I edited the top of your comment.

    ```py

    test = "test"
    ```

I don't use umongo anymore so I can't dedicate too much time to this. I should update AUTHORS file. Maybe @whophil will be interested in this contribution.

I find the API a bit strange, as you seem to be instantiating an object with the desired new attribute, then using it to apply a change to many documents.

# This is normal use
Teacher(name='John Buck', has_apple=False)

# This is not really a `Teacher` but a single attribute wrapped in a `Teacher` instance
Teacher(has_apple=True)

I'd be worried this could hit you in the back in some corner cases, although I don't have any example of such case.

Maybe I've been playing with SQLAlchemy too much and I forgot a bit about umongo but couldn't validation issues arise?

I agree about the current limitation (inefficient doc-by-doc or direct access to driver). Just unsure about this implementation.

Considering the development pace of umongo, it shouldn't be an issue to maintain your patch in user code, but sharing it here and working on a nice and comprehensive PR is greatly appreciated.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants