[BUG?]memory accumulation causes oom #556

Open
stxrlong opened this issue Mar 30, 2024 · 2 comments

Comments

@stxrlong

stxrlong commented Mar 30, 2024

My program always hits an OOM problem during a specific period of time. After some simple research, I found that the write IO was very heavy during this period (concurrency could reach 80k-100k QPS) and memory usage kept growing; of course, the memory is reclaimed after the async Redis object is released.
I use the async feature of redis++, which is a nice lib and very easy to use. Because of the above issue, I spent some time studying the source code of redis++ and hiredis. I didn't find the root cause of the memory accumulation, but I found two points in redis++, one of which greatly alleviates the memory build-up.
I tried some memory-leak detection tools such as valgrind/asan, but they all throttle the high concurrency, so they were not usable here.

  1. We can release the formatted command right after calling the hiredis async function, provided no error occurred, because hiredis has already copied the bytes into the context's output buffer. This point noticeably alleviates the memory build-up:
void _handle(redisAsyncContext &ctx, HiredisAsyncCallback callback) {
    if (redisAsyncFormattedCommand(&ctx,
                callback, this, _cmd.data(), _cmd.size()) != REDIS_OK) {
        throw_error(ctx.c, "failed to send command");
    }

    // hiredis has copied the command into the context's output buffer,
    // so release _cmd here to reduce the peak memory usage; the moved-to
    // local is destroyed when it goes out of scope.
    FormattedCommand cmd = std::move(_cmd);
}
  2. Release the context when hiredis invokes the disconnect callback:
void AsyncConnection::_fail_events(std::exception_ptr err) {
    // we'd better release the context
    if(_ctx) {
        redisAsyncFree(_ctx);
        _ctx = nullptr;
    }

    _err = err;

    _state = State::BROKEN;

    // Must call _clean_up after `_err` has been set.
    _clean_up();
}

Note that _ctx is no longer a unique_ptr at this point (it comes from _ctx = ctx.release();), so the custom deleter struct no longer applies here.
However, by the time _fail_events is called, hiredis may have already released the context via redisAsyncDisconnect, in which case calling redisAsyncFree again would be a double free; see the sketch after this list.

  3. In the end, I think the memory accumulation may be caused by the reconnect logic that runs when redis++ actively detects Redis connection issues while sending commands.
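
As a side note on point 2, here is a minimal sketch of the ownership hazard I mean. It is not redis++ code: the MyConnection struct and the attach/fail_events/on_disconnect helpers are made up for illustration, and only the hiredis calls (redisAsyncSetDisconnectCallback, redisAsyncFree, the context's user data field) are real. The idea is to only call redisAsyncFree while we still own a live context, and to drop the raw pointer once hiredis' disconnect callback has fired, because hiredis frees the context itself after that callback returns:

#include <hiredis/async.h>

struct MyConnection {
    redisAsyncContext *ctx = nullptr;
};

// hiredis invokes this while tearing the connection down; the context memory
// is freed by hiredis after this callback returns, so only forget the pointer.
void on_disconnect(const redisAsyncContext *ac, int /*status*/) {
    auto *conn = static_cast<MyConnection *>(ac->data);
    if (conn != nullptr) {
        conn->ctx = nullptr;
    }
}

void attach(MyConnection &conn, redisAsyncContext *ac) {
    conn.ctx = ac;
    ac->data = &conn;  // the data field is reserved for user code by hiredis
    redisAsyncSetDisconnectCallback(ac, on_disconnect);
}

// Only call redisAsyncFree while the pointer is still set, i.e. the
// disconnect callback has not already run for this context.
void fail_events(MyConnection &conn) {
    if (conn.ctx != nullptr) {
        redisAsyncFree(conn.ctx);
        conn.ctx = nullptr;
    }
}

Whether this maps cleanly onto redis++'s AsyncConnection is an open question; the sketch only demonstrates the ownership rule behind the double-free concern above.
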
@stxrlong changed the title from "memory accumulation causes oom" to "[BUG?]memory accumulation causes oom" on Mar 31, 2024
@sewenew
Owner

sewenew commented Apr 8, 2024

Thanks for your report!

In the end, I think the memory accumulation may be caused by the reconnect logic that runs when redis++ actively detects Redis connection issues while sending commands.

Did you have a network issue, or some special configuration, that caused the Redis connection to break from time to time? If the network is healthy, do you still have the OOM problem?

Sorry for the late reply. Too busy these days...

Regards

@stxrlong
Author

stxrlong commented Apr 13, 2024

Thanks, you are very kind. Yes, that's right. I wrote a test case that runs the zadd command on a single thread. When the concurrency reaches 20k, the memory jumps to about 800 MB instantly. If no exception is thrown, the memory returns to normal (about 14 MB); but if we receive exceptions, the memory stays at about 300 MB even after all responses have arrived. The approximate test case is as follows:

#include "sw/redis++/async_redis.h"
#include <thread>

int main(int argc, char**argv){
    int qps = 20000;
    std::string key = "sortedkey";

    sw::redis::AsyncRedis redis;
    ... // connnection
    std::thread t([qps, key, &redis]{
        std::atomic_int32_t res = 0, excpt = 0;
        std::string v;
        for(auto i=0; i<qps; ++i){
            v += std::to_string(i);
            redis.zadd(key, v, i).then(boost::lanuch::sync, [](auto&&fut){
                ++res;
                try{
                    fut.get()
                }catch(...){
                    ++excpt;
                }
            });
        }

        while(true){
            if(res== qps)
                break; 
            std::this_thread::sleep_for(std::chrono::seconds(1);
        }
    });

    ... // thread join
    ... // while true
}

I'm sorry I haven't found the root cause of this problem yet; I will study it again during the May Day holiday.
