plumbing: Optimise memory consumption for filesystem storage #799

Merged
merged 2 commits into from
Nov 6, 2023

Commits on Oct 28, 2023

  1. *: Improve BenchmarkPlainClone

    The changes aim to make that specific benchmark more
    reliable for setting a baseline which can later be
    used to compare against future changes on the most
    basic feature of go-git: plain cloning repositories.
    pjbgf committed Oct 28, 2023
    814abc0
  2. plumbing: Optimise memory consumption for filesystem storage

    Previously, as part of building the index representation, the resolveObject
    func would create an interim plumbing.MemoryObject, which would then be
    saved into storage via storage.SetEncodedObject. This meant that objects
    would be unnecessarily loaded into memory, only to then be saved to disk.
    
    The changes streamline this process by:
    - Introducing the LazyObjectWriter interface, which enables the write
      operation to take place directly against the filesystem-based storage.
    - Leveraging multi-writers to process the input data once, while targeting
      multiple writers (e.g. hasher and storage).
    
    An additional change relates to the caching of object info children within
    Parser.get. The cache is now skipped when a seekable filesystem is being
    used.
    
    The impact of the changes can be observed when using seekable filesystem
    storages, especially when cloning large repositories.
    
    The stats below were captured by adapting the BenchmarkPlainClone test
    to clone https://github.com/torvalds/linux.git:
    
    pkg: github.com/go-git/go-git/v5
    cpu: Intel(R) Core(TM) i9-10885H CPU @ 2.40GHz
                  │  /tmp/old   │             /tmp/new              │
                  │   sec/op    │   sec/op    vs base               │
    PlainClone-16   41.68 ± 17%   48.04 ± 9%  +15.27% (p=0.015 n=6)
    
                  │   /tmp/old    │               /tmp/new               │
                  │     B/op      │     B/op       vs base               │
    PlainClone-16   1127.8Mi ± 7%   256.7Mi ± 50%  -77.23% (p=0.002 n=6)
    
                  │  /tmp/old   │              /tmp/new              │
                  │  allocs/op  │  allocs/op   vs base               │
    PlainClone-16   3.125M ± 0%   3.800M ± 0%  +21.60% (p=0.002 n=6)
    
    Notice that, on average, memory consumption per operation is over 75%
    smaller. The time per operation increased by 15%, although the increase may
    actually be smaller in long-running applications, due to the decreased GC
    pressure and the resulting lower garbage collection costs.
    
    Signed-off-by: Paulo Gomes <pjbgf@linux.com>
    pjbgf committed Oct 28, 2023
    1c361ad