-
Notifications
You must be signed in to change notification settings - Fork 702
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Repository.ResolveRevision is very slow #824
Comments
@pgeorgi The opening of the repository may be improved by the changes included in #799. Would you be able to test that please? Now for ref/revision resolution,
Depending on how many operations you are doing, just by resolving to a full hash first may provide some gains (as per above). Similar tests using Code used for the checks above: package main
import (
"encoding/hex"
"fmt"
"os"
"path/filepath"
"time"
"github.com/go-git/go-billy/v5/osfs"
"github.com/go-git/go-git/v5"
. "github.com/go-git/go-git/v5/_examples"
"github.com/go-git/go-git/v5/plumbing"
"github.com/go-git/go-git/v5/plumbing/cache"
"github.com/go-git/go-git/v5/storage/filesystem"
)
func main() {
url := "https://github.com/go-git/go-git.git"
directory, _ := os.MkdirTemp("", "gogit-test")
defer os.RemoveAll(directory)
Info("git clone %s %s", url, directory)
fs := osfs.New(directory)
dot := osfs.New(filepath.Join(directory, git.GitDirName))
storer := filesystem.NewStorage(dot, cache.NewObjectLRUDefault())
r, err := git.Clone(storer, fs, &git.CloneOptions{
URL: url,
})
CheckIfError(err)
fullHash := "56c4bf4ca9789505db7a6eefb910a7259c1fcb79"
ref := "56c4bf"
t("get commit from full hash", func() string {
h := plumbing.NewHash(fullHash)
commit, err := r.CommitObject(h)
CheckIfError(err)
return commit.Hash.String()
})
t("get commit from partial hash", func() string {
prefix, err := hex.DecodeString(ref)
CheckIfError(err)
hs, err := storer.HashesWithPrefix(prefix)
CheckIfError(err)
commit, err := r.CommitObject(hs[0])
CheckIfError(err)
return commit.Hash.String()
})
t("get commit from revision", func() string {
rev, err := r.ResolveRevision(plumbing.Revision(ref))
CheckIfError(err)
return rev.String()
})
}
func t(msg string, f func() string) {
t1 := time.Now()
h := f()
t2 := time.Now()
Info(fmt.Sprintf("%s: in %s", msg, t2.Sub(t1)))
fmt.Println(h)
} |
I'm using go-git in a tool that parses two git repos, one containing test results referencing the other. Those references use commit hashes, sometimes in the short format, which finally works in go-git 5.8 (thanks!)
Before 5.8, I worked around some of the problems by calling out into the git executable, and I was thrilled to remove that environmental dependency. See https://review.coreboot.org/c/coreboot/+/68959/3 for the changes I made.
I expected speed to remain approximately the same, or maybe even improve a little because it's one process managing the git repo, not thousands of processes, each starting from scratch to parse the data.
Sadly,
ResolveRevision
is very slow in my scenario, increasing runtime from 4 seconds to 92 seconds.I have a reproducer that is somewhat large (pulling two repos) but it's 100% what I was doing, so I'll start by presenting it here. I can look into creating a smaller test case if necessary.
I also added some cpu profiling code, closely following the cpuprofile example in https://go.dev/blog/pprof and ran that to see what is going on. That change can be found at https://review.coreboot.org/c/coreboot/+/76924
Running
pprof
on cpuprofile output shows:If I understand that data correctly, the approach of creating new objects for every object in a pack file in
go-git/plumbing/format/idxfile/idxfile.go
Line 290 in 36477e8
runtime.newobject
andruntime.mallocgc
.Anything I could improve in my code to avoid this behavior? Anything that needs to be done in go-git? Or does the go compiler need smarter handling for that pattern? For reference, I'm using
go version go1.20.7 linux/amd64
The text was updated successfully, but these errors were encountered: