Why does the official benchmark show almost no difference between io_uring and NIO in our tests? #106

Open · lwglgy opened this issue Jul 1, 2021 · 3 comments


lwglgy commented Jul 1, 2021

We ran the official benchmark program, but the results show almost no difference between io_uring and NIO. Our test host has 32 cores and 200 GB of RAM, and the kernel version is 5.13.0.

The following is our test code: EchoNioServer, EchoIOUringServer, and the shared EchoServerHandler.

```java
import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioServerSocketChannel;

public class EchoNioServer {
    private static final int PORT = Integer.parseInt(System.getProperty("port", "8088"));

    public static void main(String[] args) {
        System.out.println("start Nio server");
        EventLoopGroup group = new NioEventLoopGroup();
        final EchoServerHandler serverHandler = new EchoServerHandler();
        // The boss group accepts incoming connections.
        EventLoopGroup bossGroup = new NioEventLoopGroup();
        // The worker group handles the accepted connections.
        EventLoopGroup workerGroup = new NioEventLoopGroup();
        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(bossGroup, workerGroup)
                    .option(ChannelOption.SO_REUSEADDR, true)
                    .channel(NioServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        public void initChannel(SocketChannel ch) throws Exception {
                            ChannelPipeline p = ch.pipeline();
                            //p.addLast(new LoggingHandler(LogLevel.INFO));
                            p.addLast(serverHandler);
                        }
                    });

            // Start the server.
            ChannelFuture f = b.bind(PORT).sync();

            // Wait until the server socket is closed.
            f.channel().closeFuture().sync();
        } catch (InterruptedException e) {
            e.printStackTrace();
        } finally {
            // Shut down all event loops to terminate all threads.
            group.shutdownGracefully();
            workerGroup.shutdownGracefully();
            bossGroup.shutdownGracefully();
        }
    }
}

import io.netty.bootstrap.ServerBootstrap;
import io.netty.channel.*;
import io.netty.channel.socket.SocketChannel;
import io.netty.incubator.channel.uring.IOUringEventLoopGroup;
import io.netty.incubator.channel.uring.IOUringServerSocketChannel;

// This is using io_uring
public class EchoIOUringServer {
    private static final int PORT = Integer.parseInt(System.getProperty("port", "8081"));

    public static void main(String[] args) {
        System.out.println("start iouring server");
        EventLoopGroup group = new IOUringEventLoopGroup();
        final EchoServerHandler serverHandler = new EchoServerHandler();
        // The boss group accepts incoming connections.
        EventLoopGroup bossGroup = new IOUringEventLoopGroup();
        // The worker group handles the accepted connections.
        EventLoopGroup workerGroup = new IOUringEventLoopGroup();

        try {
            ServerBootstrap b = new ServerBootstrap();
            b.group(bossGroup, workerGroup)
                    .option(ChannelOption.SO_REUSEADDR, true)
                    .channel(IOUringServerSocketChannel.class)
                    .childHandler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        public void initChannel(SocketChannel ch) throws Exception {
                            ChannelPipeline p = ch.pipeline();
                            //p.addLast(new LoggingHandler(LogLevel.INFO));
                            p.addLast(serverHandler);
                        }
                    });

            // Start the server.
            ChannelFuture f = b.bind(PORT).sync();

            // Wait until the server socket is closed.
            f.channel().closeFuture().sync();
        } catch (InterruptedException e) {
            e.printStackTrace();
        } finally {
            // Shut down all event loops to terminate all threads.
            group.shutdownGracefully();
            workerGroup.shutdownGracefully();
            bossGroup.shutdownGracefully();
        }
    }
}

import io.netty.channel.ChannelHandler;
import io.netty.channel.ChannelHandlerContext;
import io.netty.channel.ChannelInboundHandlerAdapter;

@ChannelHandler.Sharable
public class EchoServerHandler extends ChannelInboundHandlerAdapter {

    @Override
    public void channelRead(ChannelHandlerContext ctx, Object msg) {
        ctx.write(msg);
    }

    @Override
    public void channelReadComplete(ChannelHandlerContext ctx) {
        ctx.flush();
    }

    @Override
    public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) {
        // Close the connection when an exception is raised.
        ctx.close();
    }

    @Override
    public void channelWritabilityChanged(ChannelHandlerContext ctx) throws Exception {
        // Ensure we are not writing too fast by stopping reads if we cannot flush out data fast enough.
        if (ctx.channel().isWritable()) {
            ctx.channel().config().setAutoRead(true);
        } else {
            ctx.flush();
            if (!ctx.channel().isWritable()) {
                ctx.channel().config().setAutoRead(false);
            }
        }
    }
}

```

The code above is our test procedure, taken from the benchmark on the official website, but we measured almost no difference in the results.

(screenshot of benchmark results)
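
For context, the thread only shows the server side; some client has to drive the echo traffic to produce these numbers. A minimal Netty ping-pong echo client along the lines below could serve that purpose. This is an illustrative sketch only, not the official benchmark client; the class name EchoClient and the host/port/size properties are assumptions.

```java
import io.netty.bootstrap.Bootstrap;
import io.netty.buffer.ByteBuf;
import io.netty.channel.*;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.channel.socket.SocketChannel;
import io.netty.channel.socket.nio.NioSocketChannel;

// Illustrative sketch: opens one connection, sends an initial payload, and
// re-sends whatever the server echoes back, keeping the connection saturated.
public class EchoClient {
    private static final String HOST = System.getProperty("host", "127.0.0.1");
    private static final int PORT = Integer.parseInt(System.getProperty("port", "8088"));
    private static final int SIZE = Integer.parseInt(System.getProperty("size", "256"));

    public static void main(String[] args) throws Exception {
        EventLoopGroup group = new NioEventLoopGroup(1);
        try {
            Bootstrap b = new Bootstrap();
            b.group(group)
                    .channel(NioSocketChannel.class)
                    .option(ChannelOption.TCP_NODELAY, true)
                    .handler(new ChannelInitializer<SocketChannel>() {
                        @Override
                        public void initChannel(SocketChannel ch) {
                            ch.pipeline().addLast(new ChannelInboundHandlerAdapter() {
                                @Override
                                public void channelActive(ChannelHandlerContext ctx) {
                                    // Kick off the ping-pong with an initial payload of zero bytes.
                                    ByteBuf msg = ctx.alloc().buffer(SIZE);
                                    msg.writeZero(SIZE);
                                    ctx.writeAndFlush(msg);
                                }

                                @Override
                                public void channelRead(ChannelHandlerContext ctx, Object msg) {
                                    // Send the server's response straight back.
                                    ctx.write(msg);
                                }

                                @Override
                                public void channelReadComplete(ChannelHandlerContext ctx) {
                                    ctx.flush();
                                }

                                @Override
                                public void exceptionCaught(ChannelHandlerContext ctx, Throwable cause) {
                                    ctx.close();
                                }
                            });
                        }
                    });

            ChannelFuture f = b.connect(HOST, PORT).sync();
            f.channel().closeFuture().sync();
        } finally {
            group.shutdownGracefully();
        }
    }
}
```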

normanmaurer (Member) commented:

Is this on a real machine or on a VM? Also, can you just use `new IOUringEventLoopGroup(1)` and `new NioEventLoopGroup(1)`?
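
For reference, the no-argument `NioEventLoopGroup()` / `IOUringEventLoopGroup()` constructors used above default to 2 × the number of cores event loop threads (64 on the 32-core host described here, unless overridden via the io.netty.eventLoopThreads system property), so the suggestion pins each transport to a single I/O thread. A sketch of the change, assuming only the constructor argument differs from the servers above:

```java
import io.netty.channel.EventLoopGroup;
import io.netty.channel.nio.NioEventLoopGroup;
import io.netty.incubator.channel.uring.IOUringEventLoopGroup;

// Sketch: single-threaded event loop groups so each transport runs on one I/O thread.
public class SingleThreadGroups {
    public static void main(String[] args) {
        // For the io_uring server:
        EventLoopGroup ioUringBoss = new IOUringEventLoopGroup(1);
        EventLoopGroup ioUringWorker = new IOUringEventLoopGroup(1);

        // For the NIO server:
        EventLoopGroup nioBoss = new NioEventLoopGroup(1);
        EventLoopGroup nioWorker = new NioEventLoopGroup(1);

        // Pass these to ServerBootstrap.group(bossGroup, workerGroup) exactly as in
        // the servers above; everything else stays the same.
        ioUringBoss.shutdownGracefully();
        ioUringWorker.shutdownGracefully();
        nioBoss.shutdownGracefully();
        nioWorker.shutdownGracefully();
    }
}
```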


HowHsu commented Jul 8, 2021

Hi Norman,
We tried again with *EventLoopGroup(1), and it works! io_uring is about 5 times faster than NIO and epoll in terms of packet rate. But we also found that the io_uring server was using about 1300% CPU while NIO/epoll used just 100%. We traced this to the io-wq threads (the kernel's io_uring worker threads). I then set a very large ioSqeAsyncThreshold, such as 50000000, and ran the test again; this time io_uring also used just 100% CPU, and the packet rate was similar to (even a little bit worse than) the NIO/epoll one.

franz1981 (Contributor) commented:

Given that IOSQE_ASYNC can have very unexpected behaviour, e.g. issuing an unbounded number of threads to asynchronously complete requests (unless this was addressed in some recent-ish kernel patch, IIRC), maybe it would be better to leave the number of handled fds before using IOSQE_ASYNC unbounded by default, limiting it only if necessary. WDYT @normanmaurer?
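
To make the trade-off being discussed concrete: a very simplified sketch of the threshold behaviour described in this thread is below (this is not the transport's actual implementation). With an effectively unbounded threshold the IOSQE_ASYNC flag is never set, which avoids the io-wq worker CPU explosion reported above, at the cost of never punting submissions to kernel workers.

```java
// Simplified sketch of the behaviour discussed in this thread, NOT the actual
// io_uring transport code: submissions are only flagged IOSQE_ASYNC once the
// number of handled file descriptors crosses a configurable threshold.
final class IosqeAsyncPolicy {
    private final int iosqeAsyncThreshold;

    IosqeAsyncPolicy(int iosqeAsyncThreshold) {
        this.iosqeAsyncThreshold = iosqeAsyncThreshold;
    }

    boolean useIosqeAsync(int handledFds) {
        // A huge threshold (e.g. 50_000_000, as used above) effectively disables
        // IOSQE_ASYNC; an unbounded default would behave the same way.
        return handledFds >= iosqeAsyncThreshold;
    }
}
```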
