Skip to content

WebSocket stats not updated correctly when sessions cleared #29375

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
daniel-naegele opened this issue Oct 25, 2022 · 6 comments
Closed

WebSocket stats not updated correctly when sessions cleared #29375

daniel-naegele opened this issue Oct 25, 2022 · 6 comments
Assignees
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) status: backported An issue that has been backported to maintenance branches type: bug A general bug
Milestone

Comments

@daniel-naegele
Copy link

daniel-naegele commented Oct 25, 2022

Affects:

  • Spring Boot v2.5.5 with spring-boot-starter-websocket, which uses
    • spring-messaging, 5.3.10
    • spring-websocket, 5.3.10
    • spring-boot-starter-web, 2.5.5

We are using spring boot with STOMP via SockJS and Websockets. We have app and website clients. However the Websocket Stats do not match the actual user count after a certain amount of time, here is an excerpt from our logs. You can already see at the third message, that there are more websocket sessions reporterd than there are sessions in general. If you look at the last message, this will go to extremes (573 actual sessions, 4791 reported websocket sessions).

I don't know if this issue also extends to Http Streaming or Polling, because we have very few clients using this method.

2022-10-18 21:08:04.381  INFO 1 --- [MessageBroker-1] o.s.w.s.c.WebSocketMessageBrokerStats    : WebSocketSession[177 current WS(175)-HttpStream(2)-HttpPoll(0), 182 total, 0 closed abnormally (0 connect failure, 0 send limit, 5 transport error)], stompSubProtocol[processed CONNECT(182)-CONNECTED(182)-DISCONNECT(0)], stompBrokerRelay[null], inboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 1107], outboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 369], sockJsScheduler[pool size = 4, active threads = 1, queued tasks = 174, completed tasks = 361]
2022-10-18 21:38:04.382  INFO 1 --- [MessageBroker-2] o.s.w.s.c.WebSocketMessageBrokerStats    : WebSocketSession[167 current WS(166)-HttpStream(1)-HttpPoll(0), 263 total, 0 closed abnormally (0 connect failure, 0 send limit, 44 transport error)], stompSubProtocol[processed CONNECT(263)-CONNECTED(263)-DISCONNECT(8)], stompBrokerRelay[null], inboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 1890], outboundChannel[pool size = 3, active threads = 0, queued tasks = 0, completed tasks = 1974], sockJsScheduler[pool size = 4, active threads = 1, queued tasks = 164, completed tasks = 12065]
...
2022-10-18 22:38:04.383  INFO 1 --- [MessageBroker-4] o.s.w.s.c.WebSocketMessageBrokerStats    : WebSocketSession[138 current WS(140)-HttpStream(1)-HttpPoll(0), 333 total, 3 closed abnormally (3 connect failure, 0 send limit, 97 transport error)], stompSubProtocol[processed CONNECT(328)-CONNECTED(328)-DISCONNECT(10)], stompBrokerRelay[null], inboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 2571], outboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 4922], sockJsScheduler[pool size = 4, active threads = 1, queued tasks = 135, completed tasks = 32608]
...
2022-10-24 07:38:04.475  INFO 1 --- [MessageBroker-4] o.s.w.s.c.WebSocketMessageBrokerStats    : WebSocketSession[573 current WS(4791)-HttpStream(1)-HttpPoll(2), 56490 total, 153 closed abnormally (153 connect failure, 0 send limit, 26430 transport error)], stompSubProtocol[processed CONNECT(52917)-CONNECTED(52913)-DISCONNECT(5201)], stompBrokerRelay[null], inboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 484905], outboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 937815], sockJsScheduler[pool size = 4, active threads = 1, queued tasks = 569, completed tasks = 6611570]
@spring-projects-issues spring-projects-issues added the status: waiting-for-triage An issue we've not yet triaged or decided on label Oct 25, 2022
@rstoyanchev
Copy link
Contributor

rstoyanchev commented Nov 1, 2022

573 actual sessions, 4791 reported websocket sessions

This is strange. The first is the size of sessions map in SubProtocolWebSocketHandler, and the second is a counter that gets incremented/decremented as sessions are added/removed. They are always updated together, in afterConnectionEstablished and in clearSession. I don't understand how the two can get out of sync.

I don't know if this issue also extends to Http Streaming or Polling, because we have very few clients using this method.

It does seem like there is a fair number of transport errors (connection dropped). Also those "closed abnormally" means that a WebSocket connection can be established but no messages came which could be a proxy issue. In cases a WebSocket connection can't even be established, it should fall back on HTTP transports, and I would have thought those polling/streaming session numbers should be higher.

@daniel-naegele
Copy link
Author

We use traefik as our proxies and have no problems connecting via websocket. The http status code stats do not indicate any abnormal behaviour regarding the proxying of requests. The "connect failure" stats are probably from users behind coporate proxies, which remove the Upgrade Header from all requests, which we know exist (a few).
I guess the huge number of transport errors are rooted in the long uptime and that we have app clients, which can switch networks, be in a car, ...

@rstoyanchev rstoyanchev added the in: web Issues in web modules (web, webmvc, webflux, websocket) label Jan 24, 2023
@rstoyanchev rstoyanchev self-assigned this Feb 8, 2023
@rstoyanchev
Copy link
Contributor

Okay, I finally see where the issue is. We keep stats by WebSocket session type (websocket, stream, poll), and therefore need to locate the right counter by session type. However, the session is decorated after it is established, and when it is closed it must be unwrapped. This works as expected when the session is closed normally since ConcurrentWebSocketSessionDecorator calls the delegate. However, in case of SessionLimitExceededException, when the session is not websocket, we fail to find the right counter.

@rstoyanchev rstoyanchev added this to the 5.3.26 milestone Feb 8, 2023
@rstoyanchev rstoyanchev added type: bug A general bug and removed status: waiting-for-triage An issue we've not yet triaged or decided on labels Feb 8, 2023
@rstoyanchev rstoyanchev changed the title Websocket Stats are not matching the actual user count WebSocket stats not updated correctly when sessions cleared Feb 8, 2023
@rstoyanchev
Copy link
Contributor

On closer look at the stats again , I don't think there are limit exceeded exceptions, or they would show as "send limit". In any case, I did find a problem, and will fix it. I'm not sure what else to do here without a way to reproduce it.

@rstoyanchev rstoyanchev modified the milestones: 5.3.26, 6.0.5 Feb 8, 2023
@github-actions github-actions bot added status: backported An issue that has been backported to maintenance branches and removed for: backport-to-5.3.x labels Feb 8, 2023
rstoyanchev added a commit that referenced this issue Feb 8, 2023

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
See gh-29375
@rstoyanchev
Copy link
Contributor

Closing this for now. You can try with 5.3.26 once available.

@daniel-naegele
Copy link
Author

Hey, we are still experiencing this issue with Spring Boot 3.0.6, which uses spring-messaging and spring-websocket with version 6.0.8. Here is an excerpt from our logs:

023-05-11T14:50:29.682Z  INFO 1 --- [MessageBroker-4] o.s.w.s.c.WebSocketMessageBrokerStats    : WebSocketSession[928 current WS(2983)-HttpStream(3)-HttpPoll(6), 125916 total, 483 closed abnormally (483 connect failure, 0 send limit, 56579 transport error)], stompSubProtocol[processed CONNECT(120381)-CONNECTED(118726)-DISCONNECT(2413)], stompBrokerRelay[null], inboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 1089279], outboundChannel[pool size = 8, active threads = 0, queued tasks = 0, completed tasks = 1467036], sockJsScheduler[pool size = 4, active threads = 1, queued tasks = 907, completed tasks = 3746831]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) status: backported An issue that has been backported to maintenance branches type: bug A general bug
Projects
None yet
Development

No branches or pull requests

3 participants