Git.readBlob is too slow. #1841

Open
zFitness opened this issue Nov 10, 2023 · 9 comments

@zFitness

💬 Git.readBlob is too slow.

I have a program that does a lot of cross-branch file reading and writing.
I ran a test using nodegit and isomorphic-git on a list of 613 files.

  test("isomorphic-git", async () => {
    const oid = await git1.resolveRef({
      fs: fs,
      dir: projectDir,
      ref: feature,
    });
    let i = 0;
    console.time("isomorphic-git");
    for (let file of fileList) {
      await git1.readBlob({
        fs: fs,
        dir: projectDir,
        oid,
        filepath: file.replace(projectDir, ""),
      });
      i++;
    }
    console.timeEnd("isomorphic-git");
    console.log(i);
  }, 1111111);

  test("node-git", async () => {
    const c = await nodegit.getCommit(feature);
    const gitfs = await nodegit.getFs(c);
    let i = 0;
    console.time("node-git");
    for (let file of fileList) {
      await gitfs.readFile(file);
      i++;
    }
    console.timeEnd("node-git");
    console.log(i);
  }, 1111111);

result:

console.time
    isomorphic-git: 13424 ms
console.time
    node-git: 132 ms

Although node-git is very fast here, it is a native module.
So I would like to know how to speed up isomorphic-git.
Thanks! This is an awesome project!

@zFitness
Author

    ✓ isomorphic-git (13353 ms)
    ✓ node-git (132 ms)
    ✓ fs (51 ms)
    ✓ child_process (11518 ms)

@zFitness
Author

isomorphic-git seems to be even slower than running git through child_process.

@jcubic
Contributor

jcubic commented Nov 10, 2023

I don't think we can do anything to make it faster. This is a limitation of Node.js and the file system layer. NodeGit uses the libgit2 library, which is written in C; that is probably why it is faster.

But if you want to contribute and increase the performance of the library you're welcome to do so.

@zFitness
Author

ok, thanks

@zFitness
Author

zFitness commented Nov 11, 2023

Hello, I found that passing the cache parameter improves the speed:

await git1.readBlob({
  fs: fs,
  dir: projectDir,
  oid,
  // A plain object created once (e.g. `const cache = {}`) and reused for every call.
  cache,
  filepath: file.replace(projectDir, ""),
});
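
For readers following along, here is a minimal sketch of how the cache object might be shared across a batch of reads: it is created once and passed to every `readBlob` call, so whatever isomorphic-git stores on it can be reused rather than rebuilt per file. The helper name `readAllBlobs` and its injected `git` argument (the imported isomorphic-git module) are assumptions made for illustration; they are not part of the library or of the code in this issue.

```javascript
// Hypothetical helper: reads many blobs from one commit while sharing a cache.
// `git` is the isomorphic-git module, passed in so the snippet is self-contained.
async function readAllBlobs(git, { fs, dir, oid, filepaths }) {
  // One cache object for the whole batch; the library keeps its
  // intermediate results on it between calls.
  const cache = {};
  const blobs = [];
  for (const filepath of filepaths) {
    // readBlob resolves to an object that carries the file content in `blob`.
    const { blob } = await git.readBlob({ fs, dir, oid, filepath, cache });
    blobs.push(blob);
  }
  return blobs;
}
```

The important detail is that `cache` is declared outside the loop; creating a fresh object per call would likely give no benefit.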

I have now implemented an exists function that checks whether a file or directory exists under a branch.
Could you help me optimize it? Thanks @jcubic

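      // Returns true if `path` is a file or a directory in the commit `oid`.
      exists: async (path: string) => {
        // Make the path relative to the repository root, without a leading slash.
        const filepath = path.replace(this.gitBasePath, "").replace(/^\//, "");
        try {
          // First try the path as a file (blob).
          await git.readBlob({ fs: fs, dir: this.gitBasePath, oid, filepath });
          return true;
        } catch (e) {
          try {
            // Not a blob: try the path as a directory (tree).
            await git.readTree({ fs: fs, dir: this.gitBasePath, oid, filepath });
            return true;
          } catch (e) {
            return false;
          }
        }
      },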
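
One possible direction, offered as a sketch rather than a tested optimization: as far as I can tell from the `readObject` documentation, it accepts a `filepath` and returns whatever object sits at that path, blob or tree, so a single lookup could replace the `readBlob` then `readTree` pair. The helper name `existsAt`, the injected `git` argument, and the assumption that a missing path surfaces as an error with `code === "NotFoundError"` are all mine and should be checked against the installed version.

```javascript
// Hypothetical helper: checks whether a file or directory exists at a commit.
// `git` is the isomorphic-git module, passed in so the snippet is self-contained.
async function existsAt(git, { fs, dir, oid, filepath, cache }) {
  try {
    // One lookup covers both files and directories; the shared `cache`
    // is forwarded so repeated checks can reuse earlier work.
    await git.readObject({ fs, dir, oid, filepath, cache });
    return true;
  } catch (err) {
    // Assumption: a path that does not exist is reported as "NotFoundError".
    if (err && err.code === "NotFoundError") return false;
    // Anything else (I/O failure, corrupt repository) is not hidden.
    throw err;
  }
}
```

Unlike the original, this version rethrows unexpected errors instead of reporting them as "does not exist", which makes real failures easier to notice.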

@scolladon
Contributor

Hi @zFitness

I also experienced slow performance with a huge repository.
After profiling the library with my implementation, the issue appeared to be in a small function that was called a very large number of times.
The fix has been integrated in v1.25.3.

Could you retry and see whether it also fixes your use case?

@zFitness
Author

OK, I will retry.

@zFitness
Author

It may be a little faster with the new version, but using the cache parameter makes a much bigger difference:

    ✓ isomorphic-git-cache (835 ms)
    ✓ isomorphic-git (12870 ms)

@scolladon
Contributor

Nice, that probably means your use case did not depend on the fix.
The cache works pretty well for you!
