
remove runtime.SetFinalizer #905

Closed
wants to merge 1 commit into from

Conversation

shell-skrimp commented Jul 20, 2023

I believe this fixes #904

I ran the exact same tests as in that issue (more, actually) and memory usage is ~200 MB, down from 1100 MB.

I did some research on runtime.SetFinalizer and it should not be used for "large" objects, has no guarantee of being called, will not free memory properly if not used correctly, and is considered a "hack" for code that doesn't clean up properly (closing connections, etc.). It also has a significant performance hit when the GC runs, since it tells the GC to keep track of the object, but that doesn't really make sense when there are destroy/shutdown/close functions, right?
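For context, here is a minimal sketch of the general pattern under discussion: explicit destroy/close as the primary cleanup path, with runtime.SetFinalizer registered only as a fallback. The handle type and method names are illustrative, not go-ceph's actual API.

```go
package main

import "runtime"

// handle stands in for an object that wraps memory allocated outside the
// Go heap (in go-ceph's case, C-allocated rados/rbd objects).
type handle struct {
	freed bool
}

// newHandle registers a finalizer purely as a just-in-case cleanup path.
func newHandle() *handle {
	h := &handle{}
	runtime.SetFinalizer(h, func(h *handle) { h.Destroy() })
	return h
}

// Destroy is the explicit cleanup path callers are expected to use.
// Clearing the finalizer avoids a double free and removes the GC's
// extra bookkeeping for this object.
func (h *handle) Destroy() {
	if h.freed {
		return
	}
	h.freed = true
	// ...release the native resource here...
	runtime.SetFinalizer(h, nil)
}

func main() {
	h := newHandle()
	defer h.Destroy()
	_ = h
}
```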

In the bug I reported, I had only done about 30 concurrent copies of a 2Gi image, which resulted in 1.1G of RSS; the memory was never freed. With this change I have done 90+ concurrent copies of a 2Gi image and memory usage stays around 210M RSS, so I believe this PR fixes the issue.

phlogistonjohn (Collaborator) left a comment


> I did some research on runtime.SetFinalizer and it should not be used for "large" objects,

I have never heard this before. References please.

> has no guarantee of being called,

Agreed. However, it is a best-effort attempt to reclaim C memory in some cases.

> will not free memory properly if not used correctly,

Probably true but I need more information on what is or isn't proper in the context of your issue.

> and is considered a "hack" for code that doesn't clean up properly (closing connections, etc.).

Repeating myself, this is generally true but we have it as a just-in-case mechanism for freeing memory allocated by the C/C++ based code that go-ceph calls into.

> In the bug I reported, I had only done about 30 concurrent copies of a 2Gi image, which resulted in 1.1G of RSS; the memory was never freed. With this change I have done 90+ concurrent copies of a 2Gi image and memory usage stays around 210M RSS, so I believe this PR fixes the issue.

I'm extremely puzzled as to why a memory cleanup mechanism would cause your problem when it should only be triggered in the cases where other objects otherwise don't get freed.

I appreciate your effort to try and help improve go-ceph, but I think I need a little convincing that this is the correct change. First off, you've eliminated a lot of runtime.SetFinalizer calls in a single patch; it would be nice to narrow things down to exactly which calls might be causing your problem. Furthermore, it will be much more convincing to show some sort of heap profiling before and after the changes, showing exactly what kind of leaks occur when the finalizer is in place. With that data, we may find we can improve our use of the finalizer rather than eliminating it entirely.
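For the before/after evidence requested here, one possible approach (a sketch, not go-ceph code; the workload comment is a placeholder) is to dump Go heap profiles around the copy workload and compare them with go tool pprof. Note that memory allocated by the C libraries will not show up in Go's heap profile, so C-side tooling may be needed as well.

```go
package main

import (
	"os"
	"runtime"
	"runtime/pprof"
)

// writeHeapProfile forces a GC so the profile reflects live allocations,
// then writes the heap profile to path for later inspection with
// `go tool pprof`.
func writeHeapProfile(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	runtime.GC()
	return pprof.WriteHeapProfile(f)
}

func main() {
	_ = writeHeapProfile("before.pprof")
	// ...run the concurrent image-copy workload here...
	_ = writeHeapProfile("after.pprof")
}
```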

Comment on lines -74 to -76
```go
if err := c.ensureConnected(); err != nil {
	return
}
```
phlogistonjohn (Collaborator) commented:

What does this change have to do with the rest of the patch?

shell-skrimp (Author) commented:

// freeConn releases resources that are allocated while configuring the
// connection to the cluster. rados_shutdown() should only be needed after a
// successful call to rados_connect(), however if the connection has been
// configured with non-default parameters, some of the parameters may be
// allocated before connecting. rados_shutdown() will free the allocated
// resources, even if there has not been a connection yet.

Doing the check would result in a memory leak when there is no connection, based on what the source documentation says.
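A hedged sketch of the shape being described, using stand-in names rather than go-ceph's real types: freeConn calls the shutdown routine unconditionally, because per the quoted comment it is safe (and necessary) even when no connection was ever established.

```go
package main

import "fmt"

// conn and radosShutdown are illustrative stand-ins for go-ceph's
// connection type and librados' rados_shutdown; not the real API.
type conn struct{ connected bool }

func radosShutdown(c *conn) { fmt.Println("config-time allocations released") }

// freeConn without an "is connected?" guard: skipping the shutdown call
// for a configured-but-unconnected cluster would leak those allocations.
func (c *conn) freeConn() {
	radosShutdown(c)
}

func main() {
	c := &conn{connected: false} // configured, but rados_connect never ran
	c.freeConn()                 // still releases the allocated resources
}
```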

shell-skrimp (Author) commented:

> I have never heard this before. References please.

https://lemire.me/blog/2023/05/19/the-absurd-cost-of-finalizers-in-go/, https://utcc.utoronto.ca/~cks/space/blog/programming/GoFinalizerCostsNotes

> Agreed. However, it is a best-effort attempt to reclaim C memory in some cases.

Why not just communicate to end users that they should call close/destroy/shutdown where appropriate?

> Repeating myself, this is generally true but we have it as a just-in-case mechanism for freeing memory allocated by the C/C++ based code that go-ceph calls into.

Yeah, but it doesn't work and it's unreliable. From the runtime.SetFinalizer documentation (see the sketch after these excerpts):

> Finalizers are run in dependency order.

> If a cyclic structure includes a block with a finalizer, that cycle is not guaranteed to be garbage collected and the finalizer is not guaranteed to run, because there is no ordering that respects the dependencies.

> The finalizer is scheduled to run at some arbitrary time after the program can no longer reach the object to which obj points. There is no guarantee that finalizers will run before a program exits, so typically they are useful only for releasing non-memory resources associated with an object during a long-running program.

> It is not guaranteed that a finalizer will run if the size of *obj is zero bytes.

> It is not guaranteed that a finalizer will run for objects allocated in initializers for package-level variables.

> Note that because finalizers may execute arbitrarily far into the future after an object is no longer referenced, the runtime is allowed to perform a space-saving optimization that batches objects together in a single allocation slot. The finalizer for an unreferenced object in such an allocation may never run if it always exists in the same batch as a referenced object.

> A single goroutine runs all finalizers for a program, sequentially.
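To make the cycle caveat above concrete, here is a small self-contained sketch (illustrative only, not go-ceph code): two objects referencing each other each get a finalizer, and after the references are dropped and a GC runs, the finalizers typically never execute because no dependency-respecting order exists.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

// node forms a two-element cycle; each element gets a finalizer.
type node struct {
	peer *node
	name string
}

func main() {
	a := &node{name: "a"}
	b := &node{name: "b"}
	a.peer, b.peer = b, a

	runtime.SetFinalizer(a, func(n *node) { fmt.Println("finalized", n.name) })
	runtime.SetFinalizer(b, func(n *node) { fmt.Println("finalized", n.name) })

	// Drop the only external references to the cycle.
	a, b = nil, nil

	runtime.GC()
	time.Sleep(100 * time.Millisecond) // give the finalizer goroutine a chance
	fmt.Println("exiting; the finalizers above usually never printed")
}
```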

Development

Successfully merging this pull request may close these issues:
Potential memory leak on rbd image copy