Implement an Elisp binding for libgit2 #2959
Comments
@tarsius If you have any |
That's awesome to hear! I suspect you also have some experience with the new module support. I intend to get started with this soon, but could definitely use some help. |
I haven't yet looked into the module support, but this would be a great way to learn it. Count me in! |
@jwiegley I'm not current on emacs-devel, but have we thought at all about how modules will be distributed? Is package.el planning support for them? I ask since they impact the structure of the implementation and I've been thinking about starting that project up again personally (now that magithub is stable-ish). |
I would love to see some data to back up that claim. It sounds right to me, but it would be a shame for you to spend precious time on an optimization that might not bear fruit as anticipated. Perhaps we could start with just a minimal implementation of the libgit2 bindings and do some benchmarks to prove the concept. I'm guessing you've already thought this through... But it would be nice to see the data. I can help out if there's any dividing and conquering that can be done. |
This is a good question, and maybe we'll be the first ones to answer it for future package authors too. I just don't know yet. :) |
@mgalgs If someone can show me a set of commands that are being used by magit, and which are presumed to be slow, I can tell you how libgit2 might affect the performance there and why. It's possible too that we could use more caching, and more low-level Git commands (for example, direct tree manipulation) to defer going the libgit2 route. |
... because we call it a lot. So it would be a good idea to implement support for that first.
The problem isn't that certain git commands are slow on Windows, but that starting subprocesses is slow per se and Magit starts many.
We already do a lot of caching. Identical calls (same arguments and directory) during a single refresh (i.e. after every Magit command) get the value from a cache. I don't think there is much room for improvement here. Well there is--see #2982--but that goes much further than just a stupid cache.
There have been some reports that e.g. rebasing can be slow on Windows, I think. But we cannot do much about that--I certainly don't want to reimplement every Git command that is still implemented as a shell script. However, a few months ago a similar (but much less severe) instance of "starting a subprocess is slow" was fixed, but only on macOS/Darwin. I am hoping that something similar can be done on Windows. Unfortunately I never got around to asking the right people for help. We should dig up the old discussions and then bring those to their attention. The issue on macOS was that "the wrong fork" was being used, and since that had been done for a very long time on macOS, the same thing might very well be true on Windows also. But even if that gives us (not just Magit, but any package that uses many subprocesses) an amazing performance boost, I would still like to be able to use libgit from elisp. |
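The per-refresh cache described above (identical calls with the same arguments and directory are answered from a cache for the duration of one refresh) can be sketched as follows. This is only an illustration in Python, not Magit's actual Elisp implementation; the names `GitCache`, `refresh`, `call`, and `run_git` are hypothetical.

```python
import subprocess
from contextlib import contextmanager


def run_git(directory, args):
    """Spawn a real git subprocess; this is the expensive step the cache avoids."""
    completed = subprocess.run(
        ["git", *args], cwd=directory, capture_output=True, text=True, check=False
    )
    return completed.stdout


class GitCache:
    """Hypothetical per-refresh cache keyed on (directory, arguments)."""

    def __init__(self, runner=run_git):
        self._runner = runner
        self._entries = None  # None means no refresh is in progress

    @contextmanager
    def refresh(self):
        """Enable caching for the duration of a single refresh."""
        self._entries = {}
        try:
            yield self
        finally:
            self._entries = None  # drop everything once the refresh ends

    def call(self, directory, *args):
        if self._entries is None:
            # Outside a refresh nothing is cached, so results are never stale.
            return self._runner(directory, args)
        key = (directory, args)
        if key not in self._entries:
            self._entries[key] = self._runner(directory, args)
        return self._entries[key]
```

Inside `with cache.refresh():`, repeated `cache.call("/repo", "rev-parse", "HEAD")` calls spawn only one subprocess. Discarding the cache at the end of each refresh is what keeps it simple, and also why it cannot help across refreshes.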
For anyone who's curious, I've implemented a type of benchmark in this gist. What I've done is I've redirected |
More useful metrics would have to correlate these data with the timestamps of actual magit commands, but here are my own numbers (using the approach and code above) after using magit to review some history. First column is number of calls, second column is total time taken by that command.
|
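The two columns mentioned above (number of calls and total time per command) can be collected with a small wrapper around whatever function actually runs the command. The sketch below is in Python rather than the Elisp used in the gist, and `CallStats`, `timed`, and `report` are hypothetical names chosen for illustration.

```python
import time
from collections import defaultdict


class CallStats:
    """Tally how often each named command runs and how long it takes in total."""

    def __init__(self):
        self.count = defaultdict(int)
        self.total = defaultdict(float)

    def timed(self, name, func, *args, **kwargs):
        """Run func, charging its wall-clock time to the command called name."""
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Record the call even if func raised an exception.
            self.count[name] += 1
            self.total[name] += time.perf_counter() - start

    def report(self):
        """Return lines of 'calls total-seconds name', slowest total first."""
        names = sorted(self.total, key=self.total.get, reverse=True)
        return [f"{self.count[n]:6d} {self.total[n]:10.4f} {n}" for n in names]
```

Sorting by total time rather than call count makes it easier to see whether a few slow commands or many cheap ones dominate a refresh.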
Be careful with libgit2 as I don't think it implements file locks in a compatible way with Git itself. For example if |
Based on issues like libgit2/libgit2#2902, I think the developers both think about these sorts of issues, and would be open to bug reports about them. |
Maybe but in general I think libgit2 development is lagging behind Git development. See: |
I can certainly believe that. |
I started an experimental module for this https://github.com/ubolonton/magit-libgit2 It currently advises Some notes:
|
@ubolonton I am taking a two week break, but am excited to look at this when I get back. |
Hi everyone, I was interested in working on this a bit. From what I see there are two cases of "prior art":
I played around with my own module here: https://github.com/TheBB/libegit2 Now since this is a relatively ambitious project, I'm wary to invite further fracturing, but in my defense, (a) I'm not comfortable with rust, (b) @ksjogo's repo seems abandoned, and (c) I had fun anyway. I'm aiming for a thin wrapper. If you're familiar with PyQt, it's possible to read Qt's C++ documentation and translate directly to Python. That's the level I'd like to aim for: you can read libgit2's C documentation and use it directly from Emacs with no go-between. I haven't yet tried to get magit to play with this module. What's the current status on your side? Should I continue working on this? |
That's exactly what I was hoping for and would have done eventually if you didn't beat me to it. But it would probably have taken me much longer than someone more familiar with C. So far this is pretty incomplete but it is very promising that you have already outlined your plans on what you do or don't intend to implement and that you have added documentation that allows others to contribute. I think I am going to add this to Emacs.g very soon - not just to the
I have only played with it a tiny bit. But that already confirms that this is easily installable (when using I am already quite sure that this is what I am going to use in Magit (*). If you would like to do that too, then I would like to welcome this project into the Beside the need for greater coverage, I think the most important tasks ahead are:
(*) I don't want to discourage other efforts though. (By the way, @ubolonton sorry for not getting back to you.) But I do favor this implementation not least because @TheBB has maintained other important Emacs projects before and because, as I said, his approach is pretty much what I had hoped for. It also has less of a proof-of-concept feel to it. |
Pinging some people who might be interested in contributing to this effort - @jwiegley @vermiculus @mgalgs @chriscool. |
(I've added some useful resources to my initial post above.) |
Great!
That's fair enough.
I'd be happy to.
If I can't use
If the accepted route for packages with compiled components is still the way pdf-tools does it, there's going to be some trying and failing to get that to work. :-s |
Would there still be interest in the 'external git RPC server' model? I had the idea the other day to write an RPC server with an interface where you just send it, for example, |
@deifactor you'd be implementing (part of) the git CLI on top of libgit2 which is probably valuable regardless of magit. The libgit2 folks are probably interested in something like that. In the context of magit, however, calling libgit2 via FFI is probably (slightly?) more efficient than RPC. Plus there's some amount of text parsing that could probably be skipped since I assume libgit2's data structures are more convenient to work with than git's text output. (But this is just that, an assumption; I've never actually used libgit2. Also, being a C API, much of the convenience may be offset by C's clunkiness.) In any case, if you're motivated to work on the RPC approach rather than the libegit2 (FFI) approach... Sure, why not? :) |
Oh obviously it'd be better from magit's POV to use a libgit2 approach (directly or via emacs-ffi). But Rust has a better FFI story than Emacs... and I also like it more. :D Plus, like I said, I assume it'd be easier to swap since you'd only modify the commands that actually call git; the tradeoff being, as you said, that you can't do nicer parsing things. In any case, I'll update if I ever make enough progress to be worth talking about. |
I'm not a magit maintainer so I can't comment whether an RPC-based backend would be welcome, but at the very least your work could become a demonstration that the libgit2 approach (via FFI or RPC) is worthwhile. Or you might discover that this is not the (only) bottleneck. (For instance, GitExtensions on Windows also invokes lots of git.exe subprocesses, yet it's much faster than magit.) |
(After following the link I thought this was a belated April Fools' prank, not just an awful example, and decided to ignore it.)
If that's something you would like to work on, then you should. I cannot guarantee that Magit will use it if you write it, but if you don't write it then it certainly won't. ;P If something like this already existed I would certainly experiment with it, but since it doesn't exist I haven't really thought about it since it last came up. Coming to the conclusion that using it would be the right thing to do but being unable to do so because it doesn't actually exist would have been frustrating, and there is so much more to do still. But again, this sounds interesting and if you write it I would enjoy experimenting with it, but where that would lead, we won't know until we are there. |
Yeah I definitely wouldn't expect a hard 'yes we will definitely use this' without any actual code in hand, I just wanted to make sure you hadn't already decided 'magit will not use this even if it exists' for whatever reason. |
What's the status of this work? Magit on Windows is still unbearable :( by default taking 2s to refresh the status buffer (Surface Laptop4, on a basic git repo). Still, I am hoping that with |
I am also still hoping that I will eventually start using That and the fact that both new, little or medium sized, feature requests for Magit and my numerous other packages keep coming in, and that there are also many other much more interesting features to work on that I have also been putting off for years. In other words, I still plan to do it eventually but when, I do not know. |
I've experimented with libgit and magit and I think for most git commands it's not necessary to switch to libgit. However the performance of commands like EDIT: git blame is very buggy in libgit and would have to be fixed |
@tarsius Maybe I can give a hand? Do you have some good pointers to start with / take a look at? |
Thanks @brotzeit ! Funny that the official website says |
It is, generally. libgit2's blame implementation is an exception. It was hurriedly ported from git and is not up to the quality bar of the rest of the library. |
I would take The problem with Where libgit would definitely make a difference is when refreshing the status buffer. All those numerous calls to What I believe is going to make a huge difference is implementing support for not refreshing everything all the time, and support for inserting (e.g. log and diff) sections asynchronously. The reason I haven't done that yet is that so many requests for other things keep coming in all the time. Once more I am almost done working through the backlog of "not so important things, that should still be done eventually, and since each one of them doesn't take that long by itself, I might as well do it now" issues, and can soon focus on more interesting and impactful problems. But there are many of those too, and since I have doubts about the impact of using libgit compared to the other mentioned changes that would improve performance, working on libgit is not a priority. |
It is very much like the language server protocol. Interesting. |
More and more companies are trying to actually use only Git, instead of relying on both Git and libgit2. For example for GitHub in: https://github.blog/2023-07-27-scaling-merge-ort-across-github/ they say: "Previously, we used libgit2 to tick these boxes: it was faster than Git’s default merge strategy and it didn’t require a working directory." "Two years ago, Git learned a new merge strategy, merge-ort. As the author details on the mailing list, merge-ort is fast, correct, and addresses many shortcomings of the older default strategy. Even better, unlike merge-recursive, it doesn’t need a working directory." "It was clear that GitHub needed to upgrade to merge-ort. We split this effort into two parts: first deploy merge-ort for merges, then deploy it for rebases." By the way, I am working on upstreaming |
This feels rather offtopic, but regardless, that's a solid maybe. GitHub may be moving over to merge-ort, but libgit2 spent nearly a decade handling your pull requests. 🤷 It's an interesting data point that GitLab wants to get rid of libgit2, but there are still plenty of people using libgit2 - whether by itself or combined with git - and more new applications are using it daily, for myriad reasons. |
Some JSON-RPC server process makes the most sense to me if it's so hard to achieve async with git processes. It can also save the round trip in the case of tramp. This de-abstraction endeavor with a language binding won't solve the synchronous nature of git, as objects in libgit are not thread-safe. We could use libgit in such an RPC server process, though. I don't know if such an experimental merge request would be considered by the maintainers, but I'm happy to attempt a demo commit. I don't know which part is too slow for users, but I think it's also worth considering contributing to git upstream to get a nice interface that magit and others can use more efficiently. |
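To make the JSON-RPC idea a bit more concrete, here is a minimal sketch of the request handling such a server process would need, following the JSON-RPC 2.0 message shape (`jsonrpc`, `method`, `params`, `id`). Everything specific in it is an assumption for illustration: the method name `head`, the stub handler, and the one-JSON-document-per-line framing are hypothetical, and no real libgit2 call is made.

```python
import json


def _error(request_id, code, message):
    """Build a JSON-RPC 2.0 error response."""
    return json.dumps(
        {"jsonrpc": "2.0", "id": request_id, "error": {"code": code, "message": message}}
    )


def handle_request(line, methods):
    """Decode one request, dispatch it to a handler, and encode the response."""
    try:
        request = json.loads(line)
    except json.JSONDecodeError:
        return _error(None, -32700, "Parse error")
    if not isinstance(request, dict):
        return _error(None, -32600, "Invalid Request")
    request_id = request.get("id")
    handler = methods.get(request.get("method"))
    if handler is None:
        return _error(request_id, -32601, "Method not found")
    params = request.get("params", [])
    try:
        # JSON-RPC allows positional (array) or named (object) parameters.
        result = handler(**params) if isinstance(params, dict) else handler(*params)
    except Exception as exc:  # report handler failures instead of crashing the server
        return _error(request_id, -32000, str(exc))
    return json.dumps({"jsonrpc": "2.0", "id": request_id, "result": result})


# Hypothetical stub; a real server would answer from libgit2 or git here.
METHODS = {"head": lambda repo: {"repo": repo, "head": "stub"}}
```

A long-lived process that reads such lines from standard input would pay the process startup cost once instead of once per git command, which is the main appeal of this model in the discussion above.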
Having a POC would be nice, I would certainly try it out. Merging would be unlikely until it goes way beyond just a POC, but that's where we would have to start. |
@kohnish any updates on the JSON-RPC server? |
@kohnish what language are you planning to use? Seems C++ has a good support: https://github.com/jsonrpcx/json-rpc-cxx |
@tarsius are there any recommendations on what languages / library licenses should be used for executables? What is the standard for the programs that Emacs packages depend on? Are they supposed to be easily compiled on any platform, on demand, from Emacs? |
Rust seems popular nowadays. C and Python might be good options too. C++ probably too.
Yes.
Offering that is okay, but having such a complex build process, that most users would end up having to use the provided binaries, is not. |
I wasn't at the stage of implementing yet. I prefer an emacs-libvterm-like distribution where, in the best case scenario, it manages to compile from source without the internet. The challenge for me is to read the magit source and come up with a single message to emacs to draw all git-blame results in the same way as it is now. During development, it could be some stubbed implementation in any language. I'm also not too familiar with either the emacs IPC or UI APIs. Magit-blame draws the buffer so wonderfully. I'm thinking of looking into git blame, because that's the only thing that is very slow or never ends over tramp. |
Do you have any language preference for the server JSON-RPC implementation? @kohnish |
Not really. As long as it compiles fast and runs relatively fast, like c and go. |
Throwing my 2c in here: in my experience, Rust has been the only non-.NET, compiled language that really hasn't been a bear to configure for building on Windows (provided a working MSVC, which has a well-greased installation process with Visual Studio). If one of the goals is to have a straightforward process for building from source on-demand, Rust would be a good choice for cross-platform compatibility. After all, the performance issues being addressed are most evident on Windows. But of course the best choice will be something in which it actually gets written and maintained ;-) |
This description was taken from #2956. I intend to replace it with a more in-depth description at a later time.
Magit is slow and part of fixing that involves the use of libgit2, "a portable, pure C implementation of the Git core methods provided as a re-entrant linkable library with a solid API, allowing you to write native speed custom Git applications in any language which supports C bindings." Unfortunately nobody has written that for Elisp yet and since improving performance is a top priority now, I'll do it.
This will be named just libgit.el (or libgit2.el) and be pretty basic, i.e. just expose the functions provided by libgit2 to Elisp.
Older discussions: #2539, #2442 (comment), #1327 (comment). (Yes, this goes back a while, but note that doing this is only even possible since Emacs v25.1, which was released in September 2016.)
Some resources:
And of course...