Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support https://github.com/maxcountryman/tower-sessions #27

Open
LeoniePhiline opened this issue Sep 27, 2023 · 15 comments
Open

feat: Support https://github.com/maxcountryman/tower-sessions #27

LeoniePhiline opened this issue Sep 27, 2023 · 15 comments

Comments

@LeoniePhiline
Copy link
Owner

LeoniePhiline commented Sep 27, 2023

Add support for tower-sessions. (Announcement: maxcountryman/axum-sessions#20 (comment))

Consider keeping axum-sessions as default (yet deprecated) feature for now, and change defaults in a later release.

@maxcountryman
Copy link

I haven't looked carefully enough to say with great confidence, but I wonder if the CSRF pattern could be implemented entirely as an extractor a la the counter extractor example. Or for more complex state management, this can be built upon a la the strongly typed example.

I think that might obviate the need for a separate middleware.

@LeoniePhiline
Copy link
Owner Author

LeoniePhiline commented Oct 2, 2023

A middleware is commonly used for these kinds of things, and for good reason:

It's very inconvenient to use an extractor, as

  • you cannot apply an extractor to a group of routes at once, and
  • an extractor requires you to clutter your handler parameters with an additional argument you will never use.

@maxcountryman
Copy link

I agree that an extractor alone is not a good interface for this.

Fortunately extractors can be converted to middlewares quite easily. For that reason, I've come to start to appreciate extractors as a more fundamental building block.

Perhaps it's still not a good fit here, but thought I'd mention it nonetheless. 🙂

@LeoniePhiline
Copy link
Owner Author

LeoniePhiline commented Oct 8, 2023

@maxcountryman

I'd really need access to the session data lock. In the CSRF middleware implementation you can see, why:

The "load early, overwrite later" pattern proposed by tower-sessions does not deal well with situations where the same user has concurrent requests to the server (a very common case). The middleware might find a user having no session, regenerate the key, and save it for the response. At the same time in a task running on another thread, a different token is generated. You can't control which one wins, and the next request might try use an immediately outdated token.

Do you see a possibility to add "user land" access to the Inner session mutex to tower-sessions?

(Edit: This could also be solved by providing an atomic read-and-update method accepting and invoking a closure passed in by the library user, while locking and unlocking is kept in the library's hands.)

(Without it, I can't migrate without severely diminishing the middleware's reliability in multithreaded concurrency contexts or such singlethreaded concurrency contexts where critical session data handling code yields to the runtime.)

@maxcountryman
Copy link

Do you see a possibility to add "user land" access to the Inner session mutex to tower-sessions?

What do you mean by this?

The middleware might find a user having no session, regenerate the key, and save it for the response. At the same time in a task running on another thread, a different token is generated

This doesn't sound right. Do you have a test case that can demonstrate this?

@LeoniePhiline
Copy link
Owner Author

You can inspect the very specifically chosen lifetime of the SessionHandle's acquired write lock held starting here: https://docs.rs/axum-csrf-sync-pattern/latest/src/axum_csrf_sync_pattern/lib.rs.html#477

It must remain locked as long as the presence and validity of the CSRF token and the possible need for its regeneration are sorted out.

This makes the CSRF manipulation atomic in regard to concurrent requests of the same user.

This would not be possible to implement using the current tower-sessions.

@maxcountryman
Copy link

Unfortunately that specific pattern also leads to deadlocks: consider cases where a middleware like this is used alongside other session use cases (auth, etc)--if the CSRF implementation holds a lock for the duration of the request, the session can't be used elsewhere. This is actually an outstanding problem with axum-sessions today.

Also I would point out that once a session is not empty (meaning some value has been put in it and Set-Cookie has been sent to the client), it's stable for the duration of its lifetime. In other words, it's not possible to send competing Set-Cookie headers. So a different approach here might be to manage a value within the session as opposed to creating and recreating sessions across multiple requests.

Can you provide a concrete test case? It would help us ensure we can provide the functionality you need.

@LeoniePhiline
Copy link
Owner Author

LeoniePhiline commented Oct 8, 2023

Unfortunately that specific pattern also leads to deadlocks: consider cases where a middleware like this is used alongside other session use cases (auth, etc)--if the CSRF implementation holds a lock for the duration of the request, the session can't be used elsewhere.

But it doesn't. It drops the lock after the critical part and before calling the inner service.

Please be sure to have inspected the section of code we are discussing.

In other words, it's not possible to send competing Set-Cookie headers. So a different approach here might be to manage a value within the session as opposed to creating and recreating sessions across multiple requests.

This is not about competing Set-Cookie headers, but about competing CSRF tokens, whose server side half is stored in the session.

Please have a look at the source code link: If you reason about how you'd translate it to tower-sessions then you should start seeing which critical part must be atomic per user (= per session). The non-atomic get & set operations of tower-sessions make it impossible to prevent other threads from writing session data in between the local thread's read op and write op.

The session data can have been changed by another thread or task between the local read and the local update, thus overwriting data based on outdated information. This cannot be solved without deliberate locking.

You'll have similar issues in your "count the user's page visits" example. In a high concurrency situation, your count will become incorrect, as your "read the old value" and "save the incremented value" steps are not atomic.

Two threads can load the same value and write that value + 1 into the session. The correct new count would have been + 2.

@maxcountryman
Copy link

maxcountryman commented Oct 8, 2023

I'm happy to help and the best way to do that will be with a concrete test case. Can you provide one? Ideally one we can add to the tower-sessions test suite.

The non-atomic get & set operations of tower-sessions make it impossible to prevent other threads

This is why replace_if_equal exists.

Again, I really need your help with a test case. 🙂

@LeoniePhiline
Copy link
Owner Author

LeoniePhiline commented Oct 8, 2023

I can't right now - I'm on my mobile. Hopefully I can help you understand the concurrency issue by helping your reason about it.

Have a look at your strongly typed example:

The Guest session data struct is extracted. At this point the session is shortly locked and read, then unlocked.

The task may or may not yield to the runtime before continuing. In the mean time, another concurrent task may be scheduled to run, which may write to the session, incrementing the page view count.

However, the guest struct loaded by the first task will not be notified or updated. It now contains stale data.

Later your first task is polled again and updates the session based on its stale page view count. It would have needed to refresh its Guest data struct, but it does not.

Now you might think you can solve this by refreshing the locally deserialized session data right before the update operation.

However, this only shortens the window during which other tasks may interfere. Since the session is not locked, the writing by other tasks in between the local task's read and write ops cannot be prevented. To fix this, you need to expose locking to the library user.

The problem with the old extension was that users were meant to extract write or read guards, which resulted in write locks being held for the duration of the request handling.

The correct solution have been not to prevent users from locking, but to instead extract a session handle (as I do in the middleware) which allows acquiring and releasing the session lock as needed.

@LeoniePhiline
Copy link
Owner Author

This is why replace_if_equal exists.

The strongly typed example should then use replace_if_equal and properly handle the case in which false is returned.

@maxcountryman
Copy link

The strongly typed example should then use replace_if_equal and properly handle the case in which false is returned.

I agree that the examples could be more comprehensive. Happy to make updates to those.

Also we may want to expose additional interfaces to make this easier, so your use case could help inform what that should look like. CAS is a necessary primitive to address data races but I'm happy to expand on that.

More broadly, the point of a library like tower-sessions or tower-cookies is to provide the management of the inner locking mechanisms. It's very possible that we haven't done that to the degree needed and therefore we need to fix that by expanding the interface. However, what I want to avoid doing is exposing the lock directly, because then tower-sessions cannot manage the potential for deadlocks.

@LeoniePhiline
Copy link
Owner Author

LeoniePhiline commented Oct 8, 2023

I agree that the examples could be more comprehensive. Happy to make updates to those.

I believe what you will see is that the replace_if_equal pattern forces you to perform your calculations (e.g. page view count + 1) in a loop if the underlying session data has changed in the mean time. This could be problematic if the calculation is nontrivial.

It would be cleaner if you provide a closure-based atomic read-process-update operation, which does the locking and unlocking for the user (thus not exposing a lock), but holds the lock during necessary calculations in the closure body (thus preventing races).

Such an approach - other than CAS - would ensure the calculation of a new session-stored value based on the previous value would never need to be repeated, as the underlying session-stored value cannot be externally changes while locally calculating the new value.

There should be sync and async versions. (Calculations might require database access, other I/O or acquiring other locks, etc.)

Note that an API with support for closures returning a Future still allows users to program deadlocks, but that can only remain the developer's responsibility.

@maxcountryman
Copy link

maxcountryman commented Oct 10, 2023

So we can update the service implementation such that it tracks a hash set of loading sessions. When a session ID is in the set, we don't load from the store and wait until it's no longer in the store. When it's not in the set, it's safe to load, there can be no data race. At the end of each request we remove the session ID from the set, allowing waiting tasks to make progress. This approach means we don't need replace_if_equal. The downside is for a given session, only one task can make progress on it at a time. It's certainly a more straightforward API, but the performance drawbacks seem pretty significant.

As I mentioned, I'm open to expanding on our API. It would be nice to build on primitives like CAS. For instance I can imagine strongly-typed.rs could make use of it as such:

diff --git a/examples/strongly-typed.rs b/examples/strongly-typed.rs
index 4d3c32d..4d06f82 100644
--- a/examples/strongly-typed.rs
+++ b/examples/strongly-typed.rs
@@ -57,13 +57,31 @@ impl Guest {
 
     fn mark_pageview(&mut self) {
         self.guest_data.pageviews += 1;
-        Self::update_session(&self.session, &self.guest_data)
+        Self::update_session(&self.session, |mut guest_data| {
+            guest_data.pageviews += 1;
+            guest_data
+        })
     }
 
-    fn update_session(session: &Session, guest_data: &GuestData) {
-        session
-            .insert(Self::GUEST_DATA_KEY, guest_data.clone())
+    fn update_session<F>(session: &Session, data_updater: F)
+    where
+        F: Fn(GuestData) -> GuestData,
+    {
+        let mut current_value: GuestData = session
+            .get(Self::GUEST_DATA_KEY)
             .expect("infallible")
+            .unwrap_or_default();
+
+        while let Ok(false) = session.replace_if_equal(
+            Self::GUEST_DATA_KEY,
+            current_value.clone(),
+            data_updater(current_value.clone()),
+        ) {
+            current_value = session
+                .get(Self::GUEST_DATA_KEY)
+                .expect("infallible")
+                .unwrap_or_default();
+        }
     }
 }
 
@@ -91,14 +109,22 @@ where
     async fn from_request_parts(req: &mut Parts, state: &S) -> Result<Self, Self::Rejection> {
         let session = Session::from_request_parts(req, state).await?;
 
-        let mut guest_data: GuestData = session
+        let mut guest_data = session
             .get(Self::GUEST_DATA_KEY)
             .expect("infallible")
-            .unwrap_or_default();
+            .unwrap_or_else(|| {
+                let guest_data = GuestData::default();
+                session
+                    .insert(Self::GUEST_DATA_KEY, &guest_data)
+                    .expect("infallible");
+                guest_data
+            });
 
         guest_data.last_seen = OffsetDateTime::now_utc();
 
-        Self::update_session(&session, &guest_data);
+        session
+            .insert(Self::GUEST_DATA_KEY, &guest_data)
+            .expect("infallible");
 
         Ok(Self {
             session,

@LeoniePhiline
Copy link
Owner Author

LeoniePhiline commented Nov 2, 2023

So we can update the service implementation such that it tracks a hash set of loading sessions. When a session ID is in the set, we don't load from the store and wait until it's no longer in the store. When it's not in the set, it's safe to load, there can be no data race. At the end of each request we remove the session ID from the set, allowing waiting tasks to make progress.

Such an implementation would equal wrapping the session data in a mutex and holding the mutex for the entirety of the request.

The API that is missing from tower-sessions is not that. Instead, the missing API is something that allows library users to lock and unlock session data for a subset of the request runtime.

You cannot atomically set non-trivial data structures (imagine a String as a simple example). That's why you want to use compare-exchange APIs. However those require re-doing a possibly expensive operation if the compare failed and the exchange must be prepared a second or Nth time.

The way around this is to lock a single user's session data for the duration of the expensive calculation, then set the value and unlock.

Your example uses a simple counter whose calculation can cheaply be repeated N times until the compare succeeds and the exchange can be applied.

(Although for a counter you would rather want to use an AtomicUsize, wouldn't you?)

The second example is still unsound, and any even more complex session data / even more complex to calculate values are not considered. This is also true for the diff you posted above above.

Imagine a user needs something set every now and then which takes hundreds of ms or even a second to calculate or generate. (In your above diff, that would be the data_updater arg Fn.)

Without exposing locks to users, these hundreds of ms or even seconds might be spent concurrently by multiple requests, only to then be repeated, possibly multiple times until the compare-exchange succeeds.

Rwlocks and Mutexes exist for a reason. Please expose them.

The problem of axum-sessions was exposing locked guards in extractors, leading to deadlocks.

The correct fix is to instead expose unlocked handles which can be locked and unlocked by the library user as they see fit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants