Async transactions #1099

Open
astigsen opened this issue Jun 27, 2017 · 17 comments · May be fixed by #6552

Comments

@astigsen
Contributor

astigsen commented Jun 27, 2017

Right now all write transactions are done synchronously. That is not optimal: a transaction involves I/O, and while it is in progress it blocks other events from being handled.

The reason that transactions cannot simply be done asynchronously on background threads is that they involve JavaScript code (the actual code changing the data), and that code has to run on the main JavaScript thread. But a transaction is actually composed of multiple steps, some of which could be done asynchronously:

  1. Advance to latest version (can be done async)
  2. Do changes (has to be done in javascript on the main thread)
  3. Commit to disk (can be done async)

Advance to latest version
Before starting a transaction you always have to update the state of the entire realm to the latest version. Otherwise the code run in the transaction would risk working on stale data.

This is a potentially blocking operation: another transaction may be in progress, and you have to wait for it to complete before you can advance to the latest version. But this waiting can be done asynchronously on a background thread, so that the JavaScript event loop can continue doing other work while waiting for the advance to complete.

Do changes
This is javascript code, so it has to run on the main javascript thread.

Commit to disk
This can again be done on a background thread so that the event loop can progress doing other stuff while it completes.

Moving these parts of the transaction process to background threads would not affect throughput if the main constraint is transactions on the same realm, since those transactions would still have to wait for each other to complete. It would, however, free up the event loop to do other work while the transactions are in progress, and if you are working with multiple realms, their transactions could also run in parallel.
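
The three steps above can be sketched as follows. This is only an illustration under stated assumptions: `FakeStore`, `advanceToLatest`, `commit` and `writeAsync` are hypothetical names, not Realm APIs, and the "background" work is simulated with promises and a timer rather than real threads. It does not roll back changes if the mutation throws.

```typescript
// Hypothetical in-memory stand-in for a database; not a Realm API.
class FakeStore {
  version = 0;
  data: string[] = [];
  // Tail of a promise chain: each writer waits for the previous one.
  private tail: Promise<void> = Promise.resolve();

  // Step 1: wait without blocking the event loop until earlier writers
  // are done. Resolves with a function that releases the write lock.
  advanceToLatest(): Promise<() => void> {
    let release!: () => void;
    const previous = this.tail;
    this.tail = new Promise<void>((resolve) => {
      release = resolve;
    });
    return previous.then(() => release);
  }

  // Step 3: simulate slow disk I/O with a timer instead of blocking.
  commit(): Promise<void> {
    return new Promise<void>((resolve) =>
      setTimeout(() => {
        this.version += 1;
        resolve();
      }, 5)
    );
  }
}

async function writeAsync(
  store: FakeStore,
  mutate: (s: FakeStore) => void
): Promise<number> {
  const release = await store.advanceToLatest(); // 1. asynchronous wait
  try {
    mutate(store); // 2. synchronous, on the JavaScript thread
    await store.commit(); // 3. asynchronous commit
    return store.version;
  } finally {
    release(); // let the next queued writer proceed
  }
}
```

Two calls to `writeAsync` on the same store still run one after the other, which matches the point about throughput, but the event loop is free to run other callbacks during steps 1 and 3.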

@astigsen
Contributor Author

astigsen commented Jun 27, 2017

Note that this is different from the concept of async transactions in realm core. Those are transactions that commit on a background thread but return immediately, allowing other transactions to start even if the previous one has not been fully committed to disk yet. This means that while it is corruption safe, it is possible to lose transactions not yet on disk in case of a crash.

This proposed way of doing async transactions is only async from the javascript perspective, so it is still guaranteed to be crash safe.

@bigfish24
Contributor

@blagoev do you have any opinion on this?

@blagoev
Contributor

blagoev commented Aug 16, 2017

In general, everything proposed here is valid. We should strive to free the Node event loop. I can't say how much time the implementation would take. We should also measure what impact, if any, this has on the current implementation, and how much we gain from making the disk commit and the version update async. Having an exact scenario to measure against would help.

There is also related work in progress in #1216, which may help measure transaction begin and commit times.

@ivanquirino

ivanquirino commented Sep 30, 2017

I would like to be able to write:
realm.write().then(() => {
  realm.create(data).then((newObject) => {});
});

That way I can use async/await, with no synchronous create().
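
A promise-returning wrapper around the existing synchronous call gets part of the way there today. This is a minimal sketch, assuming a `write()` that runs its callback synchronously and returns the callback's result. `SyncWriter` and `writeAsPromise` are hypothetical names introduced for illustration. It makes the write awaitable but does not move any work off the JavaScript thread.

```typescript
// Minimal structural type for the sketch. A Realm instance would be passed
// in its place, on the assumption that write() runs the callback
// synchronously and returns its result.
interface SyncWriter {
  write<T>(callback: () => T): T;
}

function writeAsPromise<T>(db: SyncWriter, callback: () => T): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    // Defer to a later turn of the event loop so the caller is not blocked
    // at the call site. The write itself still runs synchronously.
    setTimeout(() => {
      try {
        resolve(db.write(callback));
      } catch (error) {
        reject(error);
      }
    }, 0);
  });
}
```

With a Realm instance this might be used as `const created = await writeAsPromise(realm, () => realm.create("Task", data))`, where `"Task"` is a placeholder schema name.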

@mfbx9da4

mfbx9da4 commented Mar 2, 2022

@astigsen

> The reason that transactions can not just be done asynchronously on background threads is that they involve javascript code (the actual code changing the data), so that code has to run on the main javascript thread

Couldn't you extract the write transaction callback code at build time and execute even the JS code on a separate worker thread? Similar to how Reanimated worklets and react-native-multithreading work:

// JS thread
spawnThread(() => {
  'worklet'
  // custom thread
  // expensive calculation
})
// JS thread

@samikarak

So is this impossible to solve or is anyone sitting on this issue? (Since the issue is 5 years old)

@kraenhansen
Member

@mfbx9da4, @whalemare & @samikarak: I think what we need is a compelling reason to investigate this further.

I'd love to understand if you're experiencing the current (blocking) implementation of Realm#write as slow for your use-cases. If yes: I'd love to learn more about those - how many objects are you creating and how complex (# properties, # relationships, etc.) are they? Are you seeing performance issues with / without sync, etc.

@mdtechcs

I have a use case where I want to confirm that a database update happened and then perform some action. Is there any event or callback that fires when realm.write has executed?
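
Because the write is synchronous, one way to read this is that no separate event is needed: code placed after the call only runs once the call has returned, and a thrown error signals failure. A minimal sketch, assuming `write()` throws when the transaction fails. `SyncWriter` and `writeThen` are hypothetical names, not part of any library.

```typescript
// Structural stand-in for an object with a synchronous write().
interface SyncWriter {
  write<T>(callback: () => T): T;
}

function writeThen(
  db: SyncWriter,
  change: () => void,
  onCommitted: () => void,
  onFailed: (error: unknown) => void
): void {
  try {
    db.write(change); // returns only after the synchronous write finished
  } catch (error) {
    onFailed(error); // the write did not go through
    return;
  }
  onCommitted(); // safe to run the follow-up action here
}
```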

@whalemare

whalemare commented Apr 23, 2022

@kraenhansen we are developing an offline-first application that allows parcels to be delivered without a stable internet connection.

Each day, our backend service generates around 100 delivery tasks and around 16k static tariffs that we should download and display to the user asynchronously, as they become available, without blocking the UI.

With one-by-one synchronous realm.write calls our JS thread is fully blocked, and we can't write all 16k tariffs as a single transaction.

With something like realm.writeAsync('Tariff', tariffArray), we would like to move this write transaction to another thread to free the JS thread and avoid blocking user interaction.

Our Tariff schema has a small set of properties and no relationships:

export interface Tariff {
  code: string
  name: string
  urgency: boolean
}
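
Until something like that exists, one possible workaround is to split the array into batches and yield to the event loop between them. This is a sketch under stated assumptions: `writeInBatches` and `yieldToEventLoop` are hypothetical helpers, the batch size is arbitrary, and the trade-off is that the tariffs are no longer written as one atomic transaction.

```typescript
interface Tariff {
  code: string;
  name: string;
  urgency: boolean;
}

// Yield to the event loop so pending UI work and timers can run.
const yieldToEventLoop = (): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, 0));

async function writeInBatches<T>(
  items: readonly T[],
  batchSize: number,
  writeBatch: (batch: readonly T[]) => void
): Promise<number> {
  if (batchSize < 1) throw new RangeError("batchSize must be at least 1");
  let batches = 0;
  for (let start = 0; start < items.length; start += batchSize) {
    // One synchronous write per batch, e.g. (assuming a Realm instance):
    // realm.write(() => batch.forEach((t) => realm.create("Tariff", t)))
    writeBatch(items.slice(start, start + batchSize));
    batches += 1;
    await yieldToEventLoop();
  }
  return batches;
}
```

Each batch still blocks the JS thread while it is written, so the batch size decides how long the UI is unresponsive at a time.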

@samikarak

> @mfbx9da4, @whalemare & @samikarak: I think what we need is a compelling reason to investigate this further.
>
> I'd love to understand if you're experiencing the current (blocking) implementation of Realm#write as slow for your use-cases. If yes: I'd love to learn more about those - how many objects are you creating and how complex (# properties, # relationships, etc.) are they? Are you seeing performance issues with / without sync, etc.

Well, in our use case we have a lot of functions with successive updates/creations of different realm objects, where in between we need to call some API endpoints and wait for the response. If the API response returns an error, we want to be able to roll back the realm updates we did. For us it is more about the consistency of the realm objects when an error occurs during async API calls.

It is not always possible to decouple realm logic from other async API logic. They sometimes depend on each other.
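
The shape we are after looks roughly like the sketch below. It is written against a small structural type whose method names match Realm's manual transaction methods (`beginTransaction`, `commitTransaction`, `cancelTransaction`); `writeWithRemoteCheck` is a hypothetical helper. Whether it is advisable to keep a write transaction open across an awaited network call is an open question here, since other writers would have to wait for it.

```typescript
// Structural type for the sketch; any object with these three methods works.
interface ManualTransaction {
  beginTransaction(): void;
  commitTransaction(): void;
  cancelTransaction(): void;
}

async function writeWithRemoteCheck<T>(
  db: ManualTransaction,
  applyChanges: () => T,
  callApi: (draft: T) => Promise<void>
): Promise<T> {
  db.beginTransaction();
  try {
    const draft = applyChanges(); // local updates/creations
    await callApi(draft); // the transaction stays open while waiting
    db.commitTransaction(); // API accepted: keep the changes
    return draft;
  } catch (error) {
    db.cancelTransaction(); // discard every change made since begin
    throw error;
  }
}
```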

@stefoid

stefoid commented Mar 7, 2023

@kraenhansen Great to hear you are interested in real-world use. We are currently experiencing terrible performance issues with our React Native project.

I'd like to give some reports on performance issues with our RN app, using rn 71.3 and realm 11.5.1 (Hermes enabled). These issues are much worse on Android than on iOS.

Our app listens to a lot (8 or 10) of 'live' filtered result sets to keep lists and other UX elements reactive to changes in the database.

When there are only a few hundred records in the tables corresponding to these lists, everything runs acceptably.
But when there are 5K+ records in these tables, the app slows to a crawl. The lists draw slowly as the user scrolls, and if small writes are performed, the app pauses for seconds at a time!

I've tried closing most of these live result sets (effectively turning off certain large portions of the UI for test purposes) and the performance of the app improves dramatically.

So the summary is :
not much data in the DB and lots of live results sets = OK
lots of data in the DB and few (2 or 3) live results sets = OK
lots of data in the DB and lots of live results sets = BAD

I really need to do something about this, and I need some more insight into the above so that I can come up with a decent solution for right now, and in the future.

Without knowing what is going on under the hood, it appears to me that there is a heavy load associated with servicing live results, and that it is proportional to 1) the number of result sets, 2) the size of the tables and 3) the complexity of the result filters. All of that makes sense; the issue is the size of the load and how it impacts the app.

The end result for us is a delay of seconds where the app is unresponsive when making even small writes to the DB. To me, this indicates that the synchronous API is forcing the app to wait not only for the write transaction, but (my guess) also for live result sets to be updated. If so, this needs to be changed: the app needs writes to be processed promptly (synchronous API) so as not to freeze it, and updating result sets to reflect the new state of the DB can be done later. It would be great if that didn't take seconds, but if it must, at least the app would be usable while it happened.

I realize that the user of the API would expect that, after having made a synchronous write, the results of that write would be available in the result set. So the answer seems to be to make the write transaction itself asynchronous. That is: 1) ask for the write, 2) return instantly so the app is responsive, 3) get a callback later when the result set has been updated successfully with the result of the transaction.

As for the observation that simply having live result sets open also makes reads sluggish (and hence scrolling through our lists annoyingly slow): I can't account for that, but that's how it feels. Is there any reason why this should be the case?

This performance issue has crept up on us over time as we have embraced and leaned into Realm's promise of updates in the UX through the use of live results, and our real-world users have built up larger and larger datasets. It has come to a head now with the update to the latest Hermes version of realm, where general performance seems 2x worse (previously we were on 10.19.0). Angry users are contacting us with reports that the app is unusable.

My task right now is somehow to refactor the app to get performance back to an acceptable level. I need a better understanding of how realm works so I can make the right short-term changes to the app. Any insights and suggestions gladly accepted!

thanks realm team!

@tgoyne
Member

tgoyne commented Mar 8, 2023

A profile of your app while it's hitting the multi-second pauses could be interesting. There's a pretty good chance that either you're hitting a scenario where keypath filtering could be very beneficial, or you're hitting something which really needs some optimization attention (or both). Our general expectation is that data sets large enough to cause that big of delays would be too large to fit on a phone in the first place.

Are you using the change information passed to result listeners? The delay here is most likely that while we normally update query results and call listeners and such asynchronously, beginning a write transaction blocks until that's all complete. This is required for the change information passed to the listener to be meaningful rather than doing things like telling you that index 5 was updated even though the collection now only has two elements in it. However, if you're just triggering a refresh of that part of the UI and discarding the change information, then doing that is unnecessary.

@stefoid

stefoid commented Mar 8, 2023

Hi. We are definitely in the vicinity of seconds; it's very much dependent on the user's device. On my very crappy Android phone, it's 5-10 seconds, whereas on an average Android device the same trivial write operation might take 1-2.

By key path filtering I guess you mean this: https://www.mongodb.com/developer/products/realm/realm-keypath-filtering

I am not using the change information itself, just the fact that something has changed, to tell the UI to refresh.

If a flag could be passed to the write transaction to tell it to return without waiting for asynchronous updating result sets, I would use that for sure.

@stevenmathers

Question: do write transactions take longer to return in proportion to how many 'live' result sets there are, even when those result sets have no relationship with the records being written?

To confirm, I tested this in our app at a convenient point where we write a couple of records to a table, with result sets enabled and disabled. There was a significant difference in how long the write transaction took to return, even though what was being written had no relationships with other tables that featured in result sets.

So it seems like realm processes all result sets when a write transaction is made, regardless of schema. Is that the case?

@nirinchev
Member

> I am not using the change information itself, just the fact that something has changed to tell the UI to refresh

Is it possible you're blocking on the UI refresh call? Have you tried triggering the UI refresh asynchronously (e.g. after 1 ms delay)? As @tgoyne said, if you can provide us with a profile of your app, that'll be helpful, otherwise a minimal project that reproduces the issue would also work.
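
The deferred refresh suggested above could look something like this. It is only a sketch: `makeDeferredRefresher` is a hypothetical helper, and `refresh` stands for whatever call triggers your UI update. It also coalesces several change notifications that arrive before the timer fires into a single refresh.

```typescript
function makeDeferredRefresher(
  refresh: () => void,
  delayMs = 1
): () => void {
  let pending = false;
  return () => {
    if (pending) return; // a refresh is already scheduled
    pending = true;
    setTimeout(() => {
      pending = false;
      refresh(); // runs outside the change-listener call stack
    }, delayMs);
  };
}
```

The returned function ignores its arguments, so it could be passed directly as a collection listener, for example `results.addListener(makeDeferredRefresher(rerenderList))`, where `rerenderList` is a placeholder for your own refresh function.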

@stefoid

stefoid commented Mar 9, 2023

It's definitely the blocking writes.

I'll see about a profile.

@bimusiek
Contributor

We faced the same issue. As the app UX got more complicated with live views and we added more objects to the Realm, realm.write took more and more time to finish and return.

While monitoring the performance of realm.write, basically by using performance.now(), this was the result after a few seconds:

Perf[matchItem:realm.write]: 121230.945708 // that's in ms
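
For anyone who wants to take the same kind of measurement, a wrapper along these lines would produce that log format. This is a sketch, not the exact code from our app: `timed` is a hypothetical helper and the label is just a string of your choosing.

```typescript
function timed<T>(
  label: string,
  fn: () => T,
  log: (line: string) => void = console.log
): T {
  const start = performance.now();
  try {
    return fn(); // e.g. () => realm.write(() => { /* changes */ })
  } finally {
    // Elapsed wall-clock time in milliseconds, logged even if fn throws.
    log(`Perf[${label}]: ${performance.now() - start}`);
  }
}
```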

Then, by patching one thing (but not yet fixing the root problem):
[Screenshot: CleanShot 2024-03-14 at 20.03.53]
And that was after minutes.

Basically, every time a listener updates in useQuery and useObject, it creates a shitload of objects and new listeners that stay in memory. So my theory is that new listeners are being created constantly, which slows down realm.write.

I will be updating this issue with my further findings.

@bimusiek bimusiek linked a pull request Mar 14, 2024 that will close this issue