Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Wrong order of messages in process specific channel if receive call was cancelled. #151

Open
sergey-komissarov opened this issue Apr 4, 2019 · 2 comments

Comments

@sergey-komissarov
Copy link

Hello, we are using group to send notifications from channels worker process to the websocket consumer in the server process. There is exactly one worker in the group and exactly one consumer on the server side. Sometimes messages order is changed on the server side.

Detailed investigation shows the following events on the server side:

  1. For the channel specific.innxCCgU!NcMqIZztkRQJ receive_lock acquired and receive_single started.
  2. Before this call core.py#L483 the redis channel messages list had 3 messages and backup list was empty:
asgi:specific.innxCCgU! = [msg3, msg2, msg1]
asgi:specific.innxCCgU!$inflight = []
  1. Receive was cancelled and receive_lock was released:
Traceback (most recent call last):
  File "/pSeven/venv/lib/python3.7/site-packages/channels_redis/core.py", line 429, in receive
    real_channel
  File "/pSeven/venv/lib/python3.7/site-packages/channels_redis/core.py", line 484, in receive_single
    index, channel_key, timeout=self.brpop_timeout
  File "/pSeven/venv/lib/python3.7/site-packages/channels_redis/core.py", line 329, in _brpop_with_clean
    return await connection.brpoplpush(channel, backup_queue, timeout=timeout)
concurrent.futures._base.CancelledError
  1. The next call to receive for the channel specific.innxCCgU!EKHzfqiTYmHm acquired receive_lock and found redis lists in the following state:
asgi:specific.innxCCgU! = [msg3, msg2]
asgi:specific.innxCCgU!$inflight = [msg1]

It seems that the redis server already moved message to the backup list during brpoplpush operation but the message body was not fully received when the task was cancelled.
5. After core.py#L327 msg1 is placed to the beginning of the list asgi:specific.innxCCgU! and msg2 is returned by receive.
6. Next call to the receive will return msg3 and only the next will get msg1 from the redis.

The simplest way to fix it is to modify script at core.py#L316 and place messages from the backup list at the end of the channel list. But I also think that redis backup list cleanup must be done under the receive_lock because another call to receive may acquire lock and place message from the backup list back to the channel list before cleanup coroutine will be called.

Our environment setup (docker image based on ubuntu and daphne as a server):
python version: 3.7.1
installed packages:

aioredis                   1.2.0      
asgiref                    2.3.2      
channels                   2.1.7      
channels-redis             2.3.3      
daphne                     2.2.5      
Django                     2.1.7      
django-channels-jsonrpc    1.2.0      
@HMaker
Copy link

HMaker commented Apr 8, 2023

Does this bug still exists? Maybe channels needs tests which simulate cancelation of every network operation.

@sergey-komissarov
Copy link
Author

@HMaker I believe this bug is fixed. Last time I checked channels used priority queue based on message creation time instead of list. This must keep messages in order.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants