Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable TraceOn will cause OOM #1771

Open
mingyang91 opened this issue Feb 12, 2023 · 5 comments
Open

Enable TraceOn will cause OOM #1771

mingyang91 opened this issue Feb 12, 2023 · 5 comments
Assignees

Comments

@mingyang91
Copy link

mingyang91 commented Feb 12, 2023

Re-produce:

package main

import (
	"context"
	"github.com/minio/minio-go/v7"
	"github.com/minio/minio-go/v7/pkg/credentials"
	log "github.com/sirupsen/logrus"
	"net/http"
	_ "net/http/pprof"
	"os"
	"time"
)

func main() {
	var cliOpts = minio.Options{Creds: credentials.NewEnvMinio(), Secure: false}
	cli, err := minio.New("127.0.0.1:9000", &cliOpts)
	if err != nil {
		log.Errorf("connection error %v", err)
		return
	}
	cli.TraceOn(os.Stdout)
	opts := minio.PutObjectOptions{DisableMultipart: true}
	info, err := cli.FPutObject(context.Background(), "oom-poc", "random.bin", "/.../random.bin", opts)
	if err != nil {
		log.Errorf("put error %v", err)
		return
	}
	log.Infof("put success %v", info)
	log.Println(http.ListenAndServe("localhost:6060", nil))
	time.Sleep(1 * time.Hour)
	return
}

If I turn on TraceOn(...), it can consume a lot of memory, resulting in an OOM error, when I PutObject with a large file. I think this is because when we use the DumpRequestOut function, it loads the entire request body into memory, even though only the HTTP header is needed. Check out this line of code in the dump.go file on:
https://github.com/golang/go/blob/master/src/net/http/httputil/dump.go#L157-L167

Do you think we could use a more stream-based approach instead of the heavy-duty DumpRequestOut to avoid the memory consumption peak?

@mingyang91
Copy link
Author

Even if this is just for debugging purposes, as a widely used public library, maybe we could write a simple function that just prints the meta data of the HTTP request, instead of the whole enchilada?

@harshavardhana
Copy link
Member

We do not read the body at all @mingyang91

minio-go/api.go

Line 451 in 230e538

reqTrace, err := httputil.DumpRequestOut(req, false)

The issue is because of something else here. what is the size of the file

	info, err := cli.FPutObject(context.Background(), "oom-poc", "random.bin", "/.../random.bin", opts)

and how much memory do you have? can you share the profile?

@mingyang91
Copy link
Author

mingyang91 commented Feb 13, 2023

This is a massive test file created with head -c 4294967296 </dev/urandom > random.bin. After executing this line, the program ends up hogging over 10GB of memory.

I found out that the issue stems from DumpRequestOut which employs http.Transport and RoundTrip to run a full mock of the HTTP write process. According to L157-L161, it allocates memory exceeding req.ContentLength and loads the entire body into dump, even though the second parameter body is set to false, meaning it only captures the HTTP header and discards the body.

@mingyang91
Copy link
Author

I came across a similar issue on aws/aws-sdk-go-v2#1618, and @prembhaskal submitted a fix and pull request to the standard library (golang/go#51662).
But, the fix hasn't been merged yet.

@shtripat
Copy link
Contributor

shtripat commented Jul 4, 2023

With latest MinIO and minio-go I was able to load a 4GiB object successfully to the bucket. If TraceOn enabled, CPU usage goes to ~30% while PUT /oom-poc/random.bin HTTP/1.1 call but put call doesn't fail. Without TraceOn set it works as expected without any high CPU usage.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants