Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sub/Pub socket high latency under low load #73

Open
Zhou-Ao opened this issue Oct 1, 2020 · 2 comments
Open

Sub/Pub socket high latency under low load #73

Zhou-Ao opened this issue Oct 1, 2020 · 2 comments

Comments

@Zhou-Ao
Copy link

Zhou-Ao commented Oct 1, 2020

I am testing Nanomsg NNG PUB/SUB socket performance with pynng .

In the test program below, SUB socket is used as a server to receive messages and PUB socket is used as a client to receive messages. The content of a message is the simply time when it is sent by the PUB client. The SUB server measures the latency by comparing between the time when the message is received time and the time (which is the message sent time) inside the message.

For 1,000 messages sent over ~1 second, I find two issues from the performance test:

  1. The latency is very high between PUB and SUB sockets in couple of milliseconds. However, under such low load, I would expect the latency to be in scale of microseconds.

  2. The latency performance improves if I make the PUB client sleep in a very short period of time. I understand that sleeping would reduce the load. But I would expect it makes not much difference given the low load. I am surprised that the latency can shoot up to 100ms if there is no sleep.

I am not sure if this issue relates to python GIL. Will my main PUB client thread compete against NNG internal I/O thread for GIL without a sleep?

Any ideas will be welcome and appreciated.

The testing programs are as below:

def run_nng_sub():
    import numpy as np
    import time, pickle, datetime, threading
    from pynng import Sub0, Timeout        

    address = 'tcp://127.0.0.1:31313'
    latencies = []
    with Sub0(listen=address, recv_timeout=100) as sub:
        sub.subscribe(b'')
        time.sleep(0.05)

        while  True:
            try:
                b_datetime = sub.recv()
                received_time = datetime.datetime.utcnow()
                
                sent_time = pickle.loads(b_datetime)
                latency = received_time - sent_time
                latencies.append(latency)
                print(f'latency: {latency}')
            except Timeout:
                if len(latencies) > 1000:
                    break

    print(f'latency:\nmax {np.max(latencies)}\naverage {np.mean(latencies)}\nmedian {np.median(latencies)}')


def run_nng_pub():
    import time, pickle, datetime
    from pynng import Pub0

    address = 'tcp://127.0.0.1:31313'
    with Pub0(dial=address) as pub:
        time.sleep(0.05)

        for i in range(1000):
            pub.send(pickle.dumps(datetime.datetime.utcnow()))
            # time.sleep(1e-5)

And the performance results on my PC are below:

No sleep
latency:
max 0:00:00.165002
average 0:00:00.025910
median 0:00:00.007000

time.sleep(1e-3)
latency:
max 0:00:00.045002
average 0:00:00.002442
median 0:00:00

time.sleep(1e-4)
latency:
max 0:00:00.051001
average 0:00:00.006631
median 0:00:00.001998

time.sleep(1e-5)
latency:
max 0:00:00.046003
average 0:00:00.008652
median 0:00:00.004000
@codypiersall
Copy link
Owner

(I just edited your comment to get syntax highlighting, didn't change content at all)

@codypiersall
Copy link
Owner

Thanks for opening the issue! It's very nice to have the sample program, too.

I'm not sure exactly what's going on, but I think it's some unfortunate interaction between pub/sub semantics and the GIL. With no sleep, the sending thread never gives up the GIL, and that means the receiving thread never has a chance to wake up and receive it. The receiving socket will then have a delayed message by the time the sending thread yields the GIL, after dropping a bunch of messages. On my computer, without a sleep, about 93% of the messages are dropped.

Whenever there is no sleep at all, the latency on my computer is even worse than on yours: a mean of 0.5 seconds, and max of 1.17 seconds.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants