Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rebase: move_commits: do not remove parents of target commits which are outside the target set #3605

Closed
wants to merge 4 commits into from

Conversation

bnjmnt4n
Copy link
Collaborator

Checklist

If applicable:

  • I have updated CHANGELOG.md
  • I have updated the documentation (README.md, docs/, demos/)
  • I have updated the config schema (cli/src/config-schema.json)
  • I have added tests to cover my changes

new_parents.push(parent_id.clone());
}
}
new_parents
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems a bit odd that the target root is special cased (even if it was a merge.)

For example, rebasing d::e off a|b removes a|b from d, but not from e. Isn't it better to exclude all ancestors of the target roots?

    create_commit(&test_env, &repo_path, "a", &[]);
    create_commit(&test_env, &repo_path, "b", &[]);
    create_commit(&test_env, &repo_path, "c", &[]);
    create_commit(&test_env, &repo_path, "d", &["a", "b"]);
    create_commit(&test_env, &repo_path, "e", &["d", "a", "b"]);
    insta::assert_snapshot!(get_log_output(&test_env, &repo_path), @r###"
    @      e
    ├─┬─╮
    ◉ │ │  d
    ╰─┬─╮
      │ ◉  b
      ◉ │  a
      ├─╯
    ◉ │  c
    ├─╯
    ◉
    "###);

    test_env.jj_cmd_ok(&repo_path, &["rebase", "-r", "d::e", "-d", "c"]);
    insta::assert_snapshot!(get_log_output(&test_env, &repo_path), @r###"
    @      e
    ├─┬─╮
    │ │ ◉  b
    │ ◉ │  a
    │ ├─╯
    ◉ │  d
    ◉ │  c
    ├─╯
    ◉
    "###);

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I can see where you're coming from, it might also makes sense to treat all the revisions passed as an independent set and move it directly to the destination whilst removing any links to other commits. Here's the thread in Discord where I discussed this: https://discord.com/channels/968932220549103686/1234553998703460423.

The example described shows that with the current behavior, rebasing -r s:: would differ from -s s:

❯ jj l
@    vwysomoy benjamin@dev.ofcr.se 1 second ago 5331ba13
├─╮  (empty) d
○ │  kurkunwm benjamin@dev.ofcr.se 30 seconds ago 8d3b09e0
│ │  (empty) c
○ │  svotuppn benjamin@dev.ofcr.se 34 seconds ago 001cebf5
├─╯  (empty) b
○  ksnsnuxr benjamin@dev.ofcr.se 43 seconds ago e55b9272
│  (empty) a
◆  zzzzzzzz root() 00000000

❯ jj rebase -r s:: -d ks
Skipping rebase of commits:
  svotuppn 001cebf5 (empty) b
  kurkunwm 8d3b09e0 (empty) c
Rebased 1 commits onto destination
Working copy now at: vwysomoy ac20feeb (empty) d
Parent commit      : kurkunwm 8d3b09e0 (empty) c

❯ jj l
@  vwysomoy benjamin@dev.ofcr.se 1 second ago ac20feeb
│  (empty) d
○  kurkunwm benjamin@dev.ofcr.se 1 minute ago 8d3b09e0
│  (empty) c
○  svotuppn benjamin@dev.ofcr.se 1 minute ago 001cebf5
│  (empty) b
○  ksnsnuxr benjamin@dev.ofcr.se 1 minute ago e55b9272
│  (empty) a
◆  zzzzzzzz root() 00000000

❯ jj undo
Working copy now at: vwysomoy 5331ba13 (empty) d
Parent commit      : kurkunwm 8d3b09e0 (empty) c
Parent commit      : ksnsnuxr e55b9272 (empty) a

❯ jj rebase -s s -d ks
Skipping rebase of commit svotuppn 001cebf5 (empty) b

❯ jj l
@    vwysomoy benjamin@dev.ofcr.se 1 minute ago 5331ba13
├─╮  (empty) d
○ │  kurkunwm benjamin@dev.ofcr.se 1 minute ago 8d3b09e0
│ │  (empty) c
○ │  svotuppn benjamin@dev.ofcr.se 1 minute ago 001cebf5
├─╯  (empty) b
○  ksnsnuxr benjamin@dev.ofcr.se 1 minute ago e55b9272
│  (empty) a
◆  zzzzzzzz root() 00000000

I thought that might be a little inconsistent, but as @martinvonz pointed out we can also explicitly make the behavior different for -r and -s/-b.

Any other opinions on the desired behavior for this?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To @yuja 's comment, I think of it in terms of a more realistic example:

@    f
├─╮
│ ○  e
│ ○  d
○ │  c
○ │  b
├─╯
○  a
◆

There's a main branch b and c and a feature branch d and e that got merged as f. If we were to rebase c and f to some unrelated part of the commit graph, I would expect f to still have a as an ancestor (through e and d).

I don't think the logic should change if a were instead a direct parent of f (that is, e and d are missing).

Regarding @bnjmnt4n 's question, I did not read the Discord thread, but think that -r s:: should have the same behavior to -s s if possible.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The example described shows that with the current behavior, rebasing -r s:: would differ from -s s:

I see. Even though it's inconsistent, I agree that making -r s:: identical to -s s is practically useful.
Perhaps, that's because -r specifies nodes to rebase, whereas -s specifies edges to break. We need to deduce edges from -r sets, so the ambiguity is unavoidable.

Can you include the reason behind this change in the commit message?

// parents, except the roots of the set, which persist their original
// parents.
// Compute the parents of all commits in the connected target set,
// allowing only commits in the target set as parents.
// If a commit in the set has a parent which is not in the set, but has
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Which set is "the set" (target commits or connected target commits)?

// parents, except the roots of the set, which persist their original
// parents.
// Compute the parents of all commits in the connected target set,
// allowing only commits in the target set as parents.
// If a commit in the set has a parent which is not in the set, but has
// an ancestor which is in the set, then the commit will have that ancestor
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"an ancestor which is in the set" -> "this parent has an ancestor which is a target commit", I think

Also, I think this algorithm needs to be more precisely defined. Consider this graph:

@    f
├─╮
│ ○    e
│ ├─╮
│ │ ○  c
○ │ │  d
├─╯ │
○   │  b
├───╯
○  a
◆

Let's say everything except e is a target commit. Let's look at f.

If a commit in the set has a parent which is not in the set

Yes, f is a commit in the (target commits) set and it has a parent e which is not in the (target commits) set.

but has an ancestor which is in the set

Yes, its ancestor a is in the (target commits) set.

then the commit will have that ancestor as a parent.

I don't think the intention is to have a as a parent, since it already has d, b, and c as parents. (Or maybe only d and c, I'm not sure, but in any case I don't think that a should be a parent.)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A better description is "If a commit in the target set has a parent which is not in the target set but has an ancestor in the target set, the parent's closest ancestors which are in the target set will be added to the commit's parents."

In the case of your example, f would have parents d (direct parent in the target set), as well as b and c (e's parents which are in the target set).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Closest ancestors" is not well-defined either. Let me add a commit z (not a target commit) before f:

@    f
○    z
├─╮
│ ○    e
│ ├─╮
│ │ ○  c
○ │ │  d
├─╯ │
○   │  b
├───╯
○  a
◆

So a commit in the target set f has a parent that is not in the target set z. What are the parent's (that is, z's) closest ancestors? It could arguably be (1) d b c (as you describe) or (2) d c (b is excluded because d is indisputably closer than b).

After some thought, the best way I could describe both would be:
(1) any ancestor of f that is a target commit and whose ancestry path (from f) does not go through another target commit
(2) the heads of the (the intersection of the ancestors of f and the target set)

Of the two, I prefer (2) because it reuses terminology from "Revset language", which means that it's easier to understand and test. (Putting myself in the user's shoes, this is a really contrived example, so I think either is reasonable, as long as we are consistent.) But I think this means rewriting the code (as far as I can tell, the code both before and after this PR implements (1)). I can't think of an efficient algorithm to do (2) off the top of my head, unfortunately. (Git does something similar during fetch negotiation, but the output is one big set of "bottom" commits; here, we need one set of "bottom" commits per target commit, so we cannot directly use Git's algorithm.)

}
target_commits_internal_parents.retain(|id, _| target_commit_ids.contains(id));

// Compute the roots of `target_commits`.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same as using roots(), I presume? Maybe worth mentioning it here in the comment that we're not using that because it is more efficient to reuse our previous calculation.

new_parents.push(parent_id.clone());
}
}
new_parents
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To @yuja 's comment, I think of it in terms of a more realistic example:

@    f
├─╮
│ ○  e
│ ○  d
○ │  c
○ │  b
├─╯
○  a
◆

There's a main branch b and c and a feature branch d and e that got merged as f. If we were to rebase c and f to some unrelated part of the commit graph, I would expect f to still have a as an ancestor (through e and d).

I don't think the logic should change if a were instead a direct parent of f (that is, e and d are missing).

Regarding @bnjmnt4n 's question, I did not read the Discord thread, but think that -r s:: should have the same behavior to -s s if possible.

cli/tests/test_rebase_command.rs Show resolved Hide resolved
// 3. Keep other parents outside the target set if they are not descendants of the
// new children of the target set.
else {
let mut new_parents = vec![];
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps, new_parents can be precomputed in a way that skipped ancestor edges are also preserved?

    create_commit(&test_env, &repo_path, "a", &[]);
    create_commit(&test_env, &repo_path, "b", &["a"]);
    create_commit(&test_env, &repo_path, "c", &["a"]);
    create_commit(&test_env, &repo_path, "d", &["a"]);
    create_commit(&test_env, &repo_path, "e", &["c", "b"]);
    create_commit(&test_env, &repo_path, "f", &["e"]);
    insta::assert_snapshot!(get_log_output(&test_env, &repo_path), @r###"
    @  f
    ◉    e
    ├─╮
    │ ◉  b
    ◉ │  c
    ├─╯
    │ ◉  d
    ├─╯
    ◉  a
    ◉
    "###);

    // TODO: edge f->e->b is lost
    test_env.jj_cmd_ok(&repo_path, &["rebase", "-r", "c|f", "-d", "d"]);
    insta::assert_snapshot!(get_log_output(&test_env, &repo_path), @r###"
    @  f
    ◉  c
    ◉  d
    │ ◉  e
    ╭─┤
    │ ◉  b
    ├─╯
    ◉  a
    ◉
    "###);

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting case. I don't think I would actually expect the dependency on b to be preserved.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My thought process for the implementation is that when selecting the target commits to rebase, we only want to preserve ancestry information for those commits, not any other commits. However, I'm happy to change the implementation if a consensus is reached. My personal use of this feature has largely been limited to rebasing linear stacks of commits to before/after other locations, so maybe others with different usecases might want to make suggestions as well.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, in my mental model, rebase -r c|f is roughly equivalent to rebase -r c::f and abandon e'. So I expect the f->b edge is preserved so long as rebase -r c::f works that way.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just commenting why I think the f->e->b edge should be preserved:

  1. disconnected -r targets are rebased as if they were all connected
  2. in order to achieve that, internal edges across skipped nodes are preserved
  3. so it seems better to apply the same rule to external edges

In practice, I wouldn't try this kind of weird rebase, and I wouldn't mind if f->e->b was removed. My point is that it's easier to reason about the behavior if we have a consistent model (such as rebase -r c|f is equivalent to rebase --keep -r c::f + abandon c|f + abandon e'.)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the other reasonable behavior is to drop this PR and let -s x and -r x:: be different.

That sounds also good to me.

It's weird to me if rebase -r c|f -d a preserves the edge but rebase -r f -d a does not. It doesn't seem that f should depend on b any more because we're also rebasing c.

Since rebase -r c|f -d a is not identical to rebase -r c -d a && rebase -r f -d a, I won't be surprised if -r c|f preserve the f->e->b edge, but -r f doesn't.

Yet another option is to make -s not preserve external edges. I copied it from Mercurial but I'm not sure how useful that behavior is.

I occasionally do rebase -s one branch whose descendants are merged with the other branches. In this case, the current rebase -s behavior is convenient.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on this discussion, should I close this PR? I can open a separate one which moves the move_commits function into the rewrite module.

Also, is #3607 still desired? (I was working initially working on this PR to make the behavior of rebase -r/-s consistent before migrating rebase -s to the move_commits function to add support for --before/after.)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, my inclination is to drop this PR and let jj rebase -s foo and jj rebase -r foo:: behave differently. Sorry to have wasted your time :(

Also, is #3607 still desired?

I think it's enough to support -A/-B with -r, and it might get complicated to support it with -s and its different behavior for merges into the set.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apart from --insert-before/after, I was thinking of rebase --keep as well. I was planning to migrate the existing rebase -s/-b to use move_commits, then from there modify move_commits to allow keeping the original target commits as well. WDYT about that? (I don't think think it will be very complicated to add a flag to persist external merge parents for -s/-b.)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That all sounds good to me. Thanks!

} else if let Some(parents) =
connected_target_commits_internal_parents.get(parent_id) {
new_parents.extend(parents.iter().cloned());
} else if !new_children.iter().any(|new_child| {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be an error, but I'm not sure if we handle other loop cases differently.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have this code for handling loops already:

// If the new parents include a commit in the target set, replace it with the
// commit's ancestors which are outside the set.
// e.g. `jj rebase -r A --before A`
let new_parent_ids: Vec<_> = new_parent_ids
.iter()
.flat_map(|parent_id| {
if let Some(parent_ids) = target_commits_external_parents.get(parent_id) {
parent_ids.iter().cloned().collect_vec()
} else {
[parent_id.clone()].to_vec()
}
})
.collect();
// If the new children include a commit in the target set, replace it with the
// commit's descendants which are outside the set.
// e.g. `jj rebase -r A --after A`
let new_children: Vec<_> = if new_children
.iter()
.any(|child| target_commit_ids.contains(child.id()))
{
let target_commits_descendants: Vec<_> =
RevsetExpression::commits(target_commit_ids.iter().cloned().collect_vec())
.union(
&RevsetExpression::commits(target_commit_ids.iter().cloned().collect_vec())
.children(),
)
.evaluate_programmatic(mut_repo)?
.iter()
.commits(mut_repo.store())
.try_collect()?;

I'm not sure it's worth having that code because it also seems like strange behavior to support but I don't feel strongly. My impression was that @bnjmnt4n also didn't feel strongly either way.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be an error, but I'm not sure if we handle other loop cases differently.

In this case, I feel like the user's intent in calling rebase --before/--after is to move the commits to the new location, making whatever changes to the commit graph are necessary, so this was acceptable to me. --before/--after allows you to move the root of the target set before its original parents, so I thought moving the other commits of the target set before any of its original parents is acceptable.

cli/src/commands/rebase.rs Show resolved Hide resolved
I think they were adding too much noise to commit diffs. Only the tests
focused on skipping rebasing will include the commit and change IDs,
other tests will omit them.
This is in preparation for shifting of `move_commits` function to
`jj_lib::rewrite`.
…h are outside the target set

This ensures consistency between the commands `jj rebase -r a::` and `jj
rebase -s a`.
@bnjmnt4n bnjmnt4n force-pushed the bnjmnt4n/push-nwrmulnsxpou branch from d46b7b7 to 3ca78be Compare May 2, 2024 08:06
@bnjmnt4n bnjmnt4n closed this May 15, 2024
@bnjmnt4n bnjmnt4n deleted the bnjmnt4n/push-nwrmulnsxpou branch May 15, 2024 20:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants