Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document Django's async ORM API #2090

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

bigfootjon
Copy link
Contributor

The async ORM interface was added in 4.1! Since channels supports >=4.2 we can document that new API.

Consider this a first pass, I just updated the spots I noticed that looked like they needed tweaking, happy to take feedback on wordsmithing or different ideas on how to document this.

Copy link
Member

@carltongibson carltongibson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @bigfootjon, yes, so good idea.

It's not quite as simple as folks just switching to use the async interface I think, because database_sync_to_async also clears up old database connections (before and after the database interaction) mimicking Django's request life-cycle signal handlers (for request started and request finished).

We can adapt the advice here for Async Consumers.

For short-lived consumers we could say something like, "do that too!".
I need to have a think about longer-lived consumers, where is not 100% clear you want to hold an open DB connection the entire time... 🤔
(Need more coffee to think about that ☕️)

@bigfootjon
Copy link
Contributor Author

Maybe I should slightly tweak this PR then: introduce an async “clean_up_db_connections” decorator and then document that

@carltongibson
Copy link
Member

@bigfootjon I'm not 100% sure what we should say here.

In theory you have one connection per thread, and in the pure async case you have just one thread.

So, why not hold onto the connection? (Likely you'd need a way to recover from connection errors, and such but as a first pass...)

Trouble is, we're still relying on the sync core underneath the async ORM API, so we're still going to be wanting to clean up connections. Without building some test projects I can't say exactly when and what is going to be needed there.

I think that's what makes it interesting 😜

@bigfootjon
Copy link
Contributor Author

Makes sense. I’ll noodle on this for a bit and see what I can figure out (I only really have time to work on Django in my spare time, and I’ll be a bit limited on spare time for the next few weeks)

@carltongibson
Copy link
Member

@bigfootjon Slowly, slowly is the Django way. No stress! 😉

@bigfootjon
Copy link
Contributor Author

I've done a bit of tinkering and a bit of thinking and I think I've come up with a solution I'm happy with (though I'm eager to hear competing opinions).

Theory

Conceptually, the purpose of cleaning up connections is a necessary piece of the machinery of Django due to how connection health works within Django:

At the beginning of each request, Django closes the connection if it has reached its maximum age. If your database terminates idle connections after some time, you should set CONN_MAX_AGE to a lower value, so that Django doesn’t attempt to use a connection that has been terminated by the database server. (This problem may only affect very low traffic sites.)

ref

Based on my understanding, there are no "timers" within Django to periodically check connection health. Connection health is ensured at the start (and end) of a connection.

So for channels, I think we want to clear connections at the beginning of a request (connect), whenever a receive event happens before we dispatch to the consumer, and when the consumer finishes (disconnect). This provides the experience that users expect: if they use the ORM (and expect a working or fresh database connection) in a consumer it will either be at connect time or in response to new input from the client (receive) or the connection will be closing and they won't care.

The other time they would need a database connection would be at send time. Some event will be triggered on the server and will wake up some logic (channel layer or otherwise) that will result in a send. In this case, we should not be trying to auto-close connections because the user might be making a series of queries and send in sequence in response to some server-side event.

As a convoluted example, there might be changes to a server-side "file system" stored in Django Models where each Folder stores a ForeignKey ref to its parent folder. In response to some server-side event, the client would need updates on all folders in a path. As an implementation detail, the server might implement this as (pseudo-python):

folder = ...
while folder is not None:
  folder = await Folder.objects.aget(pk=folder.parent_id)
  await self.asend(...)

In such a configuration closing the connection after every aget/asend pair would kill performance and not be what the user expected. They look at this block of code and think "this is all happening within the same 'request'".

To summarize, connect and receive occupy the "request started" side of the conceptual request, but send does NOT align with the "request finished" event. Only disconnect aligns with "request finished". Therefore, I think we should close connections before connect and receive and after disconnect in a new Middleware (DjangoDBMiddleware) or implicitly within the framework (and remove the current database_sync_to_async stuff).

CONN_MAX_AGE

I think there's a problem with the solution outlined above: connections usually have a bounded lifetime.

If a connection runs for forever the database connection might hit its CONN_MAX_AGE. In these events we should close the connection and make a new one. This would be handled automatically in the above life-cycle events except send.

Imagine a scenario where a websocket is set up for client-side notifications of mutations to objects stored in the ORM. The developer has configured a consumer that does nothing on connect or receive, but every once in a while it wakes up (via an MQTT notification or something) and sends down the objects that it was notified about. Let's say there's some database query that must be done before sending down the object (like "is this content visible to the current user" checks) but these MQTT notifications are rare (1 hour or more). In this case, if the CONN_MAX_AGE on the database connection is less than an hour then we've violated the expectations of users by keeping their connections open longer than they specified.

The implicit user contract in the above docs is that a connection will not be used after CONN_MAX_AGE. (ok ok I suppose that the max age could be reached during a traditional django request, but since traditional http requests are so fast I'm ignoring that).

I don't really know how to solve this. I can see two solutions:

A. Close Connections After First Send

What if we close connections after the first send in a "while" (maybe by comparing the time the last send and the current send are happening)? This could be close enough to what users expect but could have some problems.

The biggest problem is that if the connection is invalid that wouldn't help users as the first query they make would fail and an exception would be raised, which goes against user expectations ("I thought Django managed this for me"). The next biggest problem is that it has bad performance characteristics. If an user has a consumer that sends 2 send messages immediately after one another and then pauses for a "while", this solution would slow down the second send but not the first as django would tear down the connection after the first send and then build a new one before the second send. That would again go against user expectations (and lead to a lot of head scratching).

I don't love this solution tbh.

B. Document It

I think in the send case we can't provide a good general-purpose solution implicitly for all users. Instead, we should provide the next best solution: documentation.

We can define a function (in channels or django core) that closes connections from async code and then document the contract that users need to uphold for themselves. The thesis of this documentation can be: call close_connection as early as possible in the code paths that eventually call send, particularly in long-lived connections.

Summary

I'm vastly in favor of (B) and the implicit cleanup in connect/receive/disconnect, and would be interested in putting in the work here to make it so.

@carltongibson
Copy link
Member

Hey @bigfootjon -- thanks for putting your brain to it. Makes sense.

I too lean toward B. If I've got a very long lived connection, I'm happy that it's my job to run a periodic task to clean up the connection object. Makes sense to me that way.

@bigfootjon
Copy link
Contributor Author

Alright I've made a pass at integrating my ideas into Channels and documenting them!

@carltongibson
Copy link
Member

Hey @bigfootjon. I managed to get a look at this. Seems sound. I want to play with it a bit more but generally positive.

Thinking currently about the migration path here. Say I'm currently using the existing wrapper, I guess we'll want to say "migrate, but the extra calls to close old connections won't hurt"? 🤔

@bigfootjon
Copy link
Contributor Author

bigfootjon commented May 3, 2024

I guess we'll want to say "migrate, but the extra calls to close old connections won't hurt"? 🤔

Yeah I think that’s reasonable. My reading of the close_old_connections code suggests it’s pretty cheap to call if it does nothing.

In any case, using the async versions of the ORM methods will be slightly faster in certain cases due to the ORM being able to sometimes hit caches before calling in to the sync DB internals, so there’s multiple incentives to use the newer method.

@carltongibson
Copy link
Member

Hey @bigfootjon -- I'm thinking this is a point release, so just working my way round to that. Not forgotten 😉

@carltongibson
Copy link
Member

Hey @bigfootjon — I'm looking at #1091 in relation to this... — problem there is that code wrapped in database_sync_to_async closes DB connections in test runs.

(Adam has a mocking work around, making it a no-op)

Fancy reading over that? Is it likely something exacerbated by what we're proposing here? It seems so... 🤔 I've had it in my notifications for a couple of weeks, but realise I should mention it to you, rather than sitting there not actually getting to look at it. 🤹

@bigfootjon
Copy link
Contributor Author

Yeah I can take a look! (Sometime over the next few days)

@carltongibson
Copy link
Member

No stress! 😅

I think we may need to tighten up the story there as part of this progress. Have a look, let me know what you think.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants