cli: add `jj fix` proof of concept #3647

martinvonz · 2024-05-08T07:08:34Z

Checklist

If applicable:

I have updated CHANGELOG.md
I have updated the documentation (README.md, docs/, demos/)
I have updated the config schema (cli/src/config-schema.json)
I have added tests to cover my changes

arxanas

I don't know if there's context for this PR or if you're seeking review from anyone 🙂.

arxanas · 2024-05-11T23:41:00Z

cli/src/commands/fix.rs

+    tx.finish(ui, format!("fixed {} commits", root_commits.len()))
+}
+
+fn format_files<'a>(


comment: Ultimately, we presumably want to program against an interface dealing with commits as an intermediary between jj fix and the formatter, which I guess would eventually arrive as part of jj run?

For reference, here's what the git-branchless interface currently looks like (it's used in git test but also in git submit for the Phabricator backend): https://github.com/arxanas/git-branchless/blob/e87ab50603fe40e1285bba574d9d58e7b9c88baa/git-branchless-test/src/lib.rs#L1204-L1215.

(One thing I would change about the interface is to not necessarily pass in an entire slice of Commits, and use only a lazily-evaluated revset of some sort, for the case of searching commits. It's also actually relevant for formatting commits, if your objective is to find the first badly-formatted commit and apply the formatting as a patch, rather than formatting all commits in a stateless manner — both are meaningful strategies.)

That interface only supports external commands, but I assume that jj also wants to support in-process commands/functions. In either case, I think there's not too much information that we need to pass to the command/function, which would form the interface of a "command" that can be run as part of jj run/jj fix:

The ID of the commit being processed.

The IDs of ancestors of that commit which are also roots of the commit set being formatted.

If literally running a command in a workspace, then a handle to the workspace.

For convenience, perhaps also the following, even though they can be computed directly from the IDs of all the involved commits?

The set of files changed compared to parents.

The set of files changed compared to roots.

The approach as written in this PR deduplicates requests to format the same file, but I would encourage to not make that the direct interface, as it's not general enough to handle certain classes of fix. If the formatter function is actually running in-process, then it shouldn't be too hard to recover the caching at a different level of abstraction instead?

I have some questions

(One thing I would change about the interface is to not necessarily pass in an entire slice of Commits, and use only a lazily-evaluated revset of some sort, for the case of searching commits. It's also actually relevant for formatting commits, if your objective is to find the first badly-formatted commit and apply the formatting as a patch, rather than formatting all commits in a stateless manner — both are meaningful strategies.)

Can you explain your idea a bit more, as with run we try to cache commits somewhere, either on disk or depending on backend somewhere else.

That interface only supports external commands, but I assume that jj also wants to support in-process commands/functions. In either case, I think there's not too much information that we need to pass to the command/function, which would form the interface of a "command" that can be run as part of jj run/jj fix:

The ID of the commit being processed.

The IDs of ancestors of that commit which are also roots of the commit set being formatted.

If literally running a command in a workspace, then a handle to the workspace.

For convenience, perhaps also the following, even though they can be computed directly from the IDs of all the involved commits?

The set of files changed compared to parents.

The set of files changed compared to roots.

This looks like a really good idea, but how should that interface look? Should there be some Context which is passed into the inner run impl?

The approach as written in this PR deduplicates requests to format the same file, but I would encourage to not make that the direct interface, as it's not general enough to handle certain classes of fix. If the formatter function is actually running in-process, then it shouldn't be too hard to recover the caching at a different level of abstraction instead?

jj fix is specialized for formatting individual files (does not support formatters that need to inspect multiple files) and can therefore avoid materializing working copies. Maybe we can rewrite it using the jj run infrastructure once we have some virtual file system support so we don't need to materialize the working copies. Does that sound okay?

It's not about working copies specifically, but rather about organizing around commits rather than files:

organizing around commits will be better for the eventual jj run integration

there are purely in-memory reasons to operate on commits for jj fix

some formatters are designed to operate specifically on changed sections of code (particularly useful for transition periods between not having and having a formatter, or upgrading the formatter style)

there are some meaningful operations to do on commit messages as well, like inserting a change ID or word wrapping (I guess?)

error reporting is almost certainly going to want to be per-commit, not per-file — you want to know which commit to go to in order to fix a syntax error

Therefore I recommend that even the current iteration be designed around commits.

There's a lot to unpack here, but I can say much of it has come up already in our internal use of hg fix, and my plans for jj fix aren't much different from what we converged on there. I don't yet see a way of totally unifying fix and run that seems worth it to me, but I do want jj run --fix -r <rev> <command> to work as an alternative to jj fix for commands that need more context than formatters.

The point about error reporting is interesting, because there can be multiple root commits where a syntax error would need to be fixed. So you can't just pick a good one to associate the error message with. hg fix just prints the error once for every commit where it occurs. Once we add support for formatting changed line ranges, there will be less possible de-duplication of work across commits, but it will still be quite beneficial to some users if we can do O(changes) work instead of O(changes*commits) for chains of commits with disjoint sets of modified files.

some formatters are designed to operate specifically on changed sections of code (particularly useful for transition periods between not having and having a formatter, or upgrading the formatter style)

@hooper and I talked a bit about this the other day. I think we can do it by still fully formatting both sides of the diff and then keep only hunks that overlap with hunks in the diff. Consequences:

Works independent of formatter (no need for formatter to support partial formatting, no need for jj fix to provide a way to pass in line ranges)

Wastefully formats entire files

If formatting results in a large chunk of the file changing (e.g. reflowing), then this algorithm will make the reformatting apply to a large chunk too, for better or worse (I don't know how formatters that support partial formatting treat this case)

arxanas · 2024-05-11T23:44:15Z

cli/src/commands/fix.rs

+
+    let formatted = format_files(tx.repo().store().as_ref(), &file_ids)?;
+
+    tx.mut_repo().transform_descendants(


comment: One thing to consider is how to recover from failures. My initial expectation was that I would want the entire process to fail if processing any commit failed. But, so far in my experience running git test fix, if processing one or more commits fails —

I've still actually always wanted to apply any fixes for commits whose ancestors did not fail.

In some but not all cases, I've wanted to apply fixes for commits that passed, even if some ancestor failed.

I don't think I've ever wanted to apply a fix if the command for it returned a non-zero exit status, even if it did successfully format some files on disk.

Good point. I suspect @hooper has thought about that but I don't think I have. I think what you say makes sense. I suppose the effect is that.

The approach that has worked well for us in hg fix is to only use the output if the exit code is zero, and to print any stderr lines with a prefix that indicates which commit/file caused it. Part of this problem is just meeting the formatting tools where they are instead of trying to negotiate some new contract with them.

martinvonz · 2024-05-11T23:50:54Z

I don't know if there's context for this PR or if you're seeking review from anyone 🙂.

Sorry that I didn't provide context. I wrote this as a starting point for @hooper who is going to finish this PR (or duplicate it).

martinvonz force-pushed the push-ypqvxpyoowtm branch 2 times, most recently from 666a710 to 9247a5f Compare May 8, 2024 07:14

cli: add jj fix proof of concept

de459a7

martinvonz force-pushed the push-ypqvxpyoowtm branch from 9247a5f to de459a7 Compare May 8, 2024 16:58

arxanas reviewed May 11, 2024

View reviewed changes

PhilipMetzger mentioned this pull request May 15, 2024

FR: Generalized hook support #3577

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

cli: add `jj fix` proof of concept #3647

cli: add `jj fix` proof of concept #3647

martinvonz commented May 8, 2024

arxanas left a comment

arxanas May 11, 2024

PhilipMetzger May 12, 2024

martinvonz May 14, 2024

arxanas May 16, 2024

hooper May 16, 2024

martinvonz May 16, 2024

arxanas May 11, 2024

martinvonz May 12, 2024

hooper May 16, 2024

martinvonz commented May 11, 2024


		let formatted = format_files(tx.repo().store().as_ref(), &file_ids)?;

		tx.mut_repo().transform_descendants(

cli: add jj fix proof of concept #3647

Are you sure you want to change the base?

cli: add jj fix proof of concept #3647

Conversation

martinvonz commented May 8, 2024

Checklist

arxanas left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

martinvonz commented May 11, 2024

cli: add `jj fix` proof of concept #3647

cli: add `jj fix` proof of concept #3647