Memory Leak in Ruby gem as of 3.15.x #8421

Closed
splittingred opened this issue Mar 22, 2021 · 11 comments · Fixed by #8429

@splittingred

splittingred commented Mar 22, 2021

What version of protobuf and what language are you using?
Version: 3.15.6
Language: Ruby

What operating system (Linux, Windows, ...) and version?

macOS Catalina, Linux (Debian Buster).

What runtime / compiler are you using (e.g., python version or gcc version)

N/A

What did you do?

We have noticed a memory leak in the google-protobuf gem as of 3.15, when used in a gRPC server. Rolling back to 3.14 fixes the issue. See an example graph:

[Image: memory usage graph (Screen Shot 2021-03-22 at 5 11 58 PM)]

The second section of the graph shows when google-protobuf 3.15 was deployed; the short section after it shows memory returning to normal once we reverted to 3.14.

To rule out other libraries, I wrote a quick benchmark that makes a small client->server unary call. It uses benchmark-memory to compare Protobuf 3.14 against Protobuf 3.15. With 1,000 requests, protobuf 3.15 retained a significant number of objects (with increased memory usage), while protobuf 3.14 showed no such leakage:

$ ./run.sh
Running Protobuf 3.14 test
------------------
Installing gems...
Fetching gem metadata from https://rubygems.org/.........
Resolving dependencies...
Using memory_profiler 0.9.14
Using benchmark-memory 0.1.2
Using bundler 1.17.3
Using google-protobuf 3.14.0 (universal-darwin)
Using googleapis-common-protos-types 1.0.6
Using grpc 1.36.0 (universal-darwin)
Bundle updated!
Starting grpc server...
Letting server boot...
Beginning benchmark...
Calculating -------------------------------------
              client     3.976M memsize (     0.000  retained)
                        30.000k objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
Calculating -------------------------------------
              client     3.976M memsize (     0.000  retained)
                        30.000k objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
Running Protobuf 3.15 test
------------------
Installing gems...
Fetching gem metadata from https://rubygems.org/.........
Resolving dependencies...
Using memory_profiler 0.9.14
Using benchmark-memory 0.1.2
Using bundler 1.17.3
Using google-protobuf 3.15.6 (universal-darwin)
Using googleapis-common-protos-types 1.0.6
Using grpc 1.36.0 (universal-darwin)
Bundle updated!
Starting grpc server...
Letting server boot...
Beginning benchmark...
Calculating -------------------------------------
              client     4.616M memsize (   160.040k retained)
                        46.001k objects (     4.001k retained)
                         6.000  strings (     0.000  retained)
Calculating -------------------------------------
              client     4.605M memsize (   155.480k retained)
                        45.716k objects (     3.887k retained)
                         6.000  strings (     0.000  retained)
Benchmarks successful! Shutting down server...
Server shutdown, benchmark finished successfully.

You can run this test and see the code for it here: https://github.com/splittingred/protobuf-315-leak
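
For readers who want to see what "retained" means in the output above without installing any gems, here is a rough stdlib-only sketch. It is not the benchmark from the linked repository: `retained_slots` and `LEAKED` are hypothetical names, and the two workloads are stand-ins for a gRPC call rather than real protobuf code. The idea is to force a GC before and after a block and compare the number of live heap slots.

```ruby
# Hypothetical helper: reports how many heap slots are still live after the
# block has run, forcing a full GC on both sides so that only objects which
# survived collection are counted.
def retained_slots
  GC.start
  before = GC.stat(:heap_live_slots)
  yield
  GC.start
  GC.stat(:heap_live_slots) - before
end

REQUESTS = 1_000
LEAKED = [] # stands in for a cache that never releases its entries

# Each "request" builds a string and drops it immediately.
clean = retained_slots { REQUESTS.times { |i| "payload-#{i}" * 4 } }

# Each "request" builds a string and parks it in a long-lived array.
leaky = retained_slots { REQUESTS.times { |i| LEAKED << "payload-#{i}" * 4 } }

puts "clean workload retained about #{clean} slots"
puts "leaky workload retained about #{leaky} slots"
```

The exact numbers vary from run to run, but the leaky workload should retain roughly one slot per request while the clean one stays near zero, which is the same shape as the 3.14 versus 3.15 results.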

What did you expect to see

No memory leak or runaway usage.

Anything else we should know about your project / environment

This occurs both on macOS and in a Dockerized Debian Buster container environment.

@splittingred splittingred changed the title Memory Leak in Ruby 3.15.x Memory Leak in Ruby gem as of 3.15.x Mar 22, 2021
@haberman haberman self-assigned this Mar 23, 2021
@haberman haberman added the ruby label Mar 23, 2021
@haberman
Member

Thanks for the report. Can you confirm which version of Ruby you are using?

@splittingred
Author

Hi @haberman! This happened on 2.6.6 in Debian Buster; however, I can confirm it also happens on 2.4.x:

➜ rvm current
ruby-2.4.6
➜  ./run.sh
Running Protobuf 3.14 test
------------------
Installing gems...
Fetching gem metadata from https://rubygems.org/.........
Resolving dependencies...
Using memory_profiler 0.9.14
Using benchmark-memory 0.1.2
Using bundler 1.17.3
Using google-protobuf 3.14.0 (universal-darwin)
Using googleapis-common-protos-types 1.0.6
Using grpc 1.36.0 (universal-darwin)
Bundle updated!
Starting grpc server...
Letting server boot...
Beginning benchmark...
Calculating -------------------------------------
              client     3.224M memsize (     0.000  retained)
                        30.000k objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
Calculating -------------------------------------
              client     3.224M memsize (     0.000  retained)
                        30.000k objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
Running Protobuf 3.15 test
------------------
Installing gems...
Fetching gem metadata from https://rubygems.org/.........
Resolving dependencies...
Using memory_profiler 0.9.14
Using benchmark-memory 0.1.2
Using bundler 1.17.3
Fetching google-protobuf 3.15.6 (universal-darwin)
Installing google-protobuf 3.15.6 (universal-darwin)
Using googleapis-common-protos-types 1.0.6
Using grpc 1.36.0 (universal-darwin)
Bundle updated!
Starting grpc server...
Letting server boot...
Beginning benchmark...
Calculating -------------------------------------
              client     3.864M memsize (   160.040k retained)
                        46.001k objects (     4.001k retained)
                         6.000  strings (     0.000  retained)
Calculating -------------------------------------
              client     3.840M memsize (   150.600k retained)
                        45.411k objects (     3.765k retained)
                         6.000  strings (     0.000  retained)
Benchmarks successful! Shutting down server...
Server shutdown, benchmark finished successfully.

But not Ruby 2.7.0:

➜ rvm current
ruby-2.7.0
➜ ./run.sh
Running Protobuf 3.14 test
------------------
Installing gems...
Fetching gem metadata from https://rubygems.org/.........
Resolving dependencies...
Using memory_profiler 0.9.14
Using benchmark-memory 0.1.2
Using bundler 1.17.3
Using google-protobuf 3.14.0 (universal-darwin)
Using googleapis-common-protos-types 1.0.6
Using grpc 1.36.0 (universal-darwin)
Bundle updated!
Starting grpc server...
Letting server boot...
Beginning benchmark...
Calculating -------------------------------------
              client     2.272M memsize (     0.000  retained)
                        29.000k objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
Calculating -------------------------------------
              client     2.272M memsize (     0.000  retained)
                        29.000k objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
Running Protobuf 3.15 test
------------------
Installing gems...
Fetching gem metadata from https://rubygems.org/.........
Resolving dependencies...
Using memory_profiler 0.9.14
Using benchmark-memory 0.1.2
Using bundler 1.17.3
Using google-protobuf 3.15.6 (universal-darwin)
Using googleapis-common-protos-types 1.0.6
Using grpc 1.36.0 (universal-darwin)
Bundle updated!
Starting grpc server...
Letting server boot...
Beginning benchmark...
Calculating -------------------------------------
              client     2.432M memsize (     0.000  retained)
                        33.000k objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
Calculating -------------------------------------
              client     2.432M memsize (     0.000  retained)
                        33.000k objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
Benchmarks successful! Shutting down server...
Server shutdown, benchmark finished successfully.

@splittingred
Author

May have spoken a bit too soon. I've added CircleCI build results for 2.4->2.7 on the repo I listed above that runs the tests:

https://app.circleci.com/pipelines/github/splittingred/protobuf-315-leak?branch=main

You can see the individual results for each Ruby version there. The impact on 2.7 is much smaller, but it is still there.

@haberman
Member

Thanks, knowing that 2.7 is much better will help narrow this down a lot.

@haberman
Member

haberman commented Apr 3, 2021

Fix is released in https://github.com/protocolbuffers/protobuf/releases/tag/v3.15.7.

@splittingred
Author

Hi @haberman - thanks for the fix. This still seems to be happening on 3.15.7, however:

https://app.circleci.com/pipelines/github/splittingred/protobuf-315-leak/6/workflows/3d8ebf9f-0c4c-4d66-8b10-11937d856b97/jobs/25

On Ruby version: ruby 2.6.7p197 (2021-04-05 revision 67941) [x86_64-linux]
------------------------------------
Running Protobuf 3.14 test
------------------------------------
Installing gems...
Fetching gem metadata from https://rubygems.org/.........
Resolving dependencies...
Using bundler 1.17.3
Fetching memory_profiler 0.9.14
Fetching google-protobuf 3.14.0 (x86_64-linux)
Installing memory_profiler 0.9.14
Fetching benchmark-memory 0.1.2
Installing benchmark-memory 0.1.2
Installing google-protobuf 3.14.0 (x86_64-linux)
Fetching googleapis-common-protos-types 1.0.6
Installing googleapis-common-protos-types 1.0.6
Fetching grpc 1.36.0 (x86_64-linux)
Installing grpc 1.36.0 (x86_64-linux)
Bundle updated!
Starting grpc server...
Letting server boot...
Beginning benchmark...
Calculating -------------------------------------
              client    39.760M memsize (     0.000  retained)
                       300.000k objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
Calculating -------------------------------------
              client    39.760M memsize (     0.000  retained)
                       300.000k objects (     0.000  retained)
                         4.000  strings (     0.000  retained)
------------------------------------
Running Protobuf 3.15.6 test
------------------------------------
Installing gems...
Fetching gem metadata from https://rubygems.org/.........
Resolving dependencies...
Using memory_profiler 0.9.14
Using bundler 1.17.3
Using benchmark-memory 0.1.2
Fetching google-protobuf 3.15.7 (x86_64-linux)
Installing google-protobuf 3.15.7 (x86_64-linux)
Using googleapis-common-protos-types 1.0.6
Using grpc 1.36.0 (x86_64-linux)
Bundle updated!
Starting grpc server...
Letting server boot...
Beginning benchmark...
Calculating -------------------------------------
              client    46.160M memsize (     1.600M retained)
                       460.000k objects (    40.001k retained)
                         6.000  strings (     0.000  retained)
Calculating -------------------------------------
              client    46.160M memsize (     1.600M retained)
                       460.000k objects (    40.001k retained)
                         6.000  strings (     0.000  retained)
------------------------------------
Running Protobuf 3.15.7 test
------------------------------------
Installing gems...
Fetching gem metadata from https://rubygems.org/.........
Resolving dependencies...
Using memory_profiler 0.9.14
Using bundler 1.17.3
Using google-protobuf 3.15.7 (x86_64-linux)
Using benchmark-memory 0.1.2
Using googleapis-common-protos-types 1.0.6
Using grpc 1.36.0 (x86_64-linux)
Bundle updated!
Starting grpc server...
Letting server boot...
Beginning benchmark...
Calculating -------------------------------------
              client    46.160M memsize (     1.600M retained)
                       460.000k objects (    40.001k retained)
                         6.000  strings (     0.000  retained)
Calculating -------------------------------------
              client    46.160M memsize (     1.600M retained)
                       460.000k objects (    40.001k retained)
                         6.000  strings (     0.000  retained)
Benchmarks successful! Shutting down server...
Server shutdown, benchmark finished successfully.

I've updated https://github.com/splittingred/protobuf-315-leak to test against 3.14, 3.15.6, and 3.15.7.

@haberman
Member

haberman commented Apr 7, 2021

Hi @splittingred, thanks for the update and sorry this hasn't been fully resolved yet.

I have some good news: I found a far more straightforward memory leak that has been present since 3.15. The issue affects all versions of Ruby equally, and the fix is very simple. We should release this fix very soon.

However, I want to offer a slight bit of caution: even this fix will not take your benchmark to "0 retained" for Ruby <2.7. Versions of Ruby prior to 2.7 have a deficient implementation of ObjectSpace::WeakMap that makes it impossible for us to keep 0 objects retained. Even with the fix, I expect your test case to show some amount of memory retained.

The extra retained memory should not grow without bound though. In particular, the graph you posted in your original message, where the memory grows without bound, should no longer happen.
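
For readers unfamiliar with ObjectSpace::WeakMap, the following is a minimal sketch of the general weak-cache pattern it enables. It is an illustration only and makes no claim about how the gem's C extension is written: `WrapperCache` and its methods are hypothetical names. As far as I know, the Ruby 2.7 release notes say that `ObjectSpace::WeakMap#[]=` began accepting special objects (such as Integers) as keys or values; whether that is the deficiency meant above is an assumption on my part. The sketch uses plain objects as keys so it runs on older Rubies too.

```ruby
# Hypothetical cache that maps a key object to a wrapper object without
# keeping either one alive on its own. ObjectSpace::WeakMap is part of
# core Ruby, so no require is needed.
class WrapperCache
  def initialize
    @map = ObjectSpace::WeakMap.new
  end

  # Returns the cached wrapper for key, or stores and returns the block's
  # result on a miss. WeakMap compares keys by object identity.
  def fetch(key)
    @map[key] || (@map[key] = yield)
  end

  # True while the map still holds an entry for key.
  def cached?(key)
    @map.key?(key)
  end
end

cache = WrapperCache.new
key = Object.new

first = cache.fetch(key) { Object.new }
second = cache.fetch(key) { Object.new }

# While `first` is strongly referenced here, both lookups return the same
# wrapper. Once nothing else references the wrapper, the GC is free to
# drop the entry; when that happens is not deterministic.
puts first.equal?(second)
```

A cache like this only avoids retaining objects if the weak map can hold every key the caller needs, which is why limitations in the map itself can surface as retained memory.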

@bruno-

bruno- commented Apr 8, 2021

The fix for this was merged via #8461, but for some reason GitHub did not auto-close this issue.

@danmayer

danmayer commented Apr 8, 2021

This is hitting a large number of our apps as well; we are currently restarting them every 9 hours to avoid OOM. I see the fix is in and there is an RC release. Would you recommend trying the RC release, or will a mainline release be cut soon?

@haberman
Member

haberman commented Apr 8, 2021

@danmayer 3.15.8 just hit RubyGems, which contains the fix: https://rubygems.org/gems/google-protobuf/versions/3.15.8-x86_64-linux
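
For anyone auditing many apps, a small sketch like the one below can flag versions in the range discussed in this thread. The range (3.15.0 up to, but not including, 3.15.8) is inferred from the comments above ("present since 3.15", fix shipped in 3.15.8) and is an assumption, not an official advisory; `affected_by_reported_leak?` is a hypothetical helper name. It relies only on `Gem::Version` and `Gem::Requirement` from RubyGems.

```ruby
require "rubygems"

# Range inferred from this thread; treat it as an assumption.
AFFECTED_RANGE = Gem::Requirement.new(">= 3.15.0", "< 3.15.8")

# Hypothetical helper: true if the given google-protobuf version string
# falls inside the range discussed in this issue.
def affected_by_reported_leak?(version)
  AFFECTED_RANGE.satisfied_by?(Gem::Version.new(version))
end

%w[3.14.0 3.15.6 3.15.7 3.15.8].each do |version|
  status = affected_by_reported_leak?(version) ? "in the affected range" : "outside the affected range"
  puts "google-protobuf #{version}: #{status}"
end
```

This only compares version numbers. It says nothing about whether a given benchmark still shows retained objects on a fixed version.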

@splittingred
Author

splittingred commented Apr 15, 2021

Hi @haberman - just to follow up here, I ran this against our benchmark, and it was essentially unchanged from 3.15.7 to 3.15.8.

Calculating -------------------------------------
              client    46.160M memsize (     1.600M retained)
                       460.000k objects (    40.001k retained)
                         6.000  strings (     0.000  retained)
Calculating -------------------------------------
              client    46.160M memsize (     1.600M retained)
                       460.000k objects (    40.001k retained)
                         6.000  strings (     0.000  retained)

Furthermore, these numbers scale linearly with request counts:


10,000 Requests:
Calculating -------------------------------------
              client    46.160M memsize (     1.600M retained)
                       460.000k objects (    40.001k retained)
                         6.000  strings (     0.000  retained)
Calculating -------------------------------------
              client    46.158M memsize (     1.599M retained)
                       459.950k objects (    39.981k retained)
                         6.000  strings (     0.000  retained)
20,000 Requests:
Calculating -------------------------------------
              client    92.320M memsize (     3.200M retained)
                       920.000k objects (    80.001k retained)
                         6.000  strings (     0.000  retained)
Calculating -------------------------------------
              client    92.320M memsize (     3.200M retained)
                       920.000k objects (    80.001k retained)
                         6.000  strings (     0.000  retained)
40,000 Requests:
Calculating -------------------------------------
              client   184.640M memsize (     6.400M retained)
                         1.840M objects (   160.001k retained)
                         6.000  strings (     0.000  retained)
Calculating -------------------------------------
              client   184.640M memsize (     6.400M retained)
                         1.840M objects (   160.001k retained)
                         6.000  strings (     0.000  retained)

You can see the results here: https://app.circleci.com/pipelines/github/splittingred/protobuf-315-leak

I've also updated the repository I used for testing here: https://github.com/splittingred/protobuf-315-leak

Would you expect that to be the case? Given the linear memory increase here, I'm a bit concerned about running protobuf 3.15.x on live code to see how it performs in the wild, and all my internal load testing shows the same 1:1 scaling.
