Using git to prepare your PR to have a clean history

spark-c edited this page Sep 18, 2021 · 4 revisions

Tests are a valuable asset in any project, and so is a clean version history. Proper, clean commits make changes easier to review and understand. Git is an awesome tool that lets you shape your history the way you want it. GitHub PRs are a fantastic way to have changes reviewed before merging. A PR simply reflects the history of its branch, so each time the branch's commit history changes, the PR is updated.

The following tips may help if you are not proficient in Git, but they are not a comprehensive tutorial; really knowing Git remains the most reliable safeguard. Still, these steps can guide you towards a clean history!

There are multiple blog entries explaining Git -- I find the following two posts quite compelling, as they show a modern way to do things properly. Of course, StackOverflow is another great resource.

Note that these actions rewrite history(!!) and it's easy to get lost. With Git, there are multiple ways to prevent or fix errors; for example, consider creating a backup branch before starting something that may go wrong.
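As a minimal sketch of the backup-branch idea, the script below builds a throwaway repository, saves the current state in a branch, deliberately damages the history, and then restores it. The branch name backup-before-rewrite, the file name, and the author identity are all hypothetical and chosen only for illustration.

```shell
set -e
# Throwaway repository so nothing real is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo one > notes.txt
git add notes.txt
git commit -q -m "Added new awesome feature"
echo two >> notes.txt
git commit -q -am "fix npe stuff"

# Save the current state under a second branch name before rewriting anything.
git branch backup-before-rewrite
before=$(git rev-parse HEAD)

# A rewrite we then regret: throw away the latest commit.
git reset -q --hard HEAD~1

# Recover by pointing the current branch back at the backup.
git reset -q --hard backup-before-rewrite
after=$(git rev-parse HEAD)
```

Once the rewritten history looks right, the backup can be deleted with git branch -D backup-before-rewrite.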

The most common ways to rewrite history are amending, rebasing interactively, and squashing.

amending

You can amend a commit when the follow-up changes don't add semantic value to the PR on their own. For example, a history like this in your own PR branch can be avoided:

* 5a03bf5 - fix another bug
* ccc50cc - fix npe stuff
* 18d8ace - Added new awesome feature

If, instead of being committed separately, each of the two "fix" changes had been staged and then amended into the feature commit with the following command:

git commit --amend --no-edit
# --no-edit tells Git not to edit the message of the previous commit

You will end up with a single, meaningful commit!

* 98a826e - Added new awesome feature
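The amend workflow above can be sketched end to end in a scratch repository. This is only an illustration under assumptions: the file name feature.txt, its contents, and the author identity are made up, and the two "fixes" are simulated by appending lines.

```shell
set -e
# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Added new awesome feature"

# First follow-up fix: stage it, then fold it into the existing commit.
echo "npe fix" >> feature.txt
git add feature.txt
git commit -q --amend --no-edit

# Second follow-up fix, handled the same way.
echo "another fix" >> feature.txt
git add feature.txt
git commit -q --amend --no-edit

# The branch still holds one commit, now containing all three changes.
git log --oneline
```

Each amend replaces the previous commit with a new one, so the commit hash changes every time even though the message stays the same.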

rebasing interactively

If a change has to be made afterward, for whatever reason (review, a bug fix, adding missing stuff, javadoc, etc.), the history may become chaotic:

* a2d6ee9 - Review fixes
* eed23a1 - Complete awesome feature with better error message
* 84aaad2 - wip
* 24fe90a - Merge from somewhere else
* 5a03bf5 - fix another bug
* ccc50cc - fix npe stuff
* 18d8ace - Added new awesome feature

Multiple commits like this just create noise in the PR and don't help anyone understand the change, especially when the history has to be consulted later. To clean up that history, use Git's interactive rebase.

git rebase --interactive HEAD~7

It will open an editor listing the 7 chosen commits, oldest first (the reverse of the order shown by git log):

pick 18d8ace Added new awesome feature
pick ccc50cc fix npe stuff
pick 5a03bf5 fix another bug
pick 24fe90a Merge from somewhere else
pick 84aaad2 wip
pick eed23a1 Complete awesome feature with better error message
pick a2d6ee9 Review fixes

For each line (one commit per line) you can apply a different action:

pick 18d8ace Added new awesome feature
fixup ccc50cc fix npe stuff
fixup 5a03bf5 fix another bug
pick 24fe90a Merge from somewhere else
squash 84aaad2 wip
squash eed23a1 Complete awesome feature with better error message
pick a2d6ee9 Review fixes

In the above list, I replaced pick with other actions that Git will apply while replaying the commits. The fixup action melds a commit into the previous one and discards the fixup commit's own message. The squash action also melds a commit into the previous one, but opens an editor so you can write the combined commit message. This will result in the following history:

* 72d6ac9 - Review fixes
* abe934f - Better error messages for awesome feature
* 73a6abe - Added new awesome feature
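For experimenting without an editor popping up, the same kind of rebase can be scripted. The sketch below is an assumption-laden illustration, not the workflow described above: it uses a shorter, three-commit history, and it relies on Git's GIT_SEQUENCE_EDITOR environment variable to run a small hypothetical helper script (todo-editor.sh) that rewrites the todo list, keeping the first pick and turning the later ones into fixup.

```shell
set -e
# Throwaway repository with one feature commit and two noisy fix commits.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

git commit -q --allow-empty -m "Initial commit"
for msg in "Added new awesome feature" "fix npe stuff" "fix another bug"; do
  echo "$msg" >> feature.txt
  git add feature.txt
  git commit -q -m "$msg"
done

# Hypothetical helper that stands in for the human editing the todo list:
# line 1 stays "pick", every later "pick" becomes "fixup".
editor="$repo/.git/todo-editor.sh"
cat > "$editor" <<'EOF'
#!/bin/sh
sed '2,$s/^pick /fixup /' "$1" > "$1.new" && mv "$1.new" "$1"
EOF
chmod +x "$editor"

# Git calls the helper with the path of the todo file as its argument.
GIT_SEQUENCE_EDITOR="$editor" git rebase -q --interactive HEAD~3

git log --oneline
```

Because only fixup is used here, no commit-message editor is needed; a squash line would additionally open the regular editor for the combined message.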

just squashing

Sometimes, however, having a single commit is just the right thing to do. With the following history:

* a2d6ee9 - Review fixes
* eed23a1 - Complete awesome feature with better error message
* 84aaad2 - wip
* 24fe90a - Merge from somewhere else
* 5a03bf5 - fix another bug
* ccc50cc - fix npe stuff
* 18d8ace - Added new awesome feature

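You can collapse everything into a single commit like so. The soft reset moves the branch back 7 commits while keeping all of their changes staged, and the commit then records them in one go: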

git reset --soft HEAD~7
git commit --message="Adds awesome feature"
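Here is a small, self-contained sketch of that squash in a scratch repository. It assumes a shortened three-commit history on top of an initial commit; the file name and author identity are hypothetical.

```shell
set -e
# Throwaway repository with an initial commit plus three commits to squash.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Author"
git config user.email "author@example.com"

git commit -q --allow-empty -m "Initial commit"
for msg in "Added new awesome feature" "fix npe stuff" "fix another bug"; do
  echo "$msg" >> feature.txt
  git add feature.txt
  git commit -q -m "$msg"
done

# Move the branch back three commits; --soft leaves their changes staged.
git reset --soft HEAD~3

# Record everything that is staged as one new commit.
git commit -q --message="Adds awesome feature"

git log --oneline
```

Note that HEAD~N must exist, so the branch needs at least one commit older than the ones being squashed.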

pushing the branch with modified history

If everything is okay, force-push the branch. This is required because the history has changed, and Git normally refuses to push a rewritten history to the remote. Using --force (or its short form -f) tells Git that we intend to rewrite history, and allows the operation.

The following command specifies the remote name and the branch name explicitly; this matters when using a Git version prior to 2.0.0, whose default push behaviour could push every matching branch rather than just the current one (see the push.default config).

git push -f origin pr_that_improve_stuffs
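To see why the force is needed, the following sketch simulates a remote with a local bare repository, pushes a branch, rewrites its only commit, and then compares a plain push with a forced one. The directory layout, file name, and author identity are assumptions made for the demonstration; only the branch name comes from the command above.

```shell
set -e
# A bare repository plays the role of the remote; "local" is the working clone.
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.name "Example Author"
git config user.email "author@example.com"
git remote add origin "$work/remote.git"
git checkout -q -b pr_that_improve_stuffs

echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Added new awesome feature"
git push -q origin pr_that_improve_stuffs

# Rewrite the commit that has already been pushed.
echo "fix" >> feature.txt
git add feature.txt
git commit -q --amend --no-edit

# A plain push is refused because it is not a fast-forward.
if git push -q origin pr_that_improve_stuffs 2>/dev/null; then
  plain_push=accepted
else
  plain_push=rejected
fi
echo "plain push: $plain_push"

# The forced push replaces the remote branch with the rewritten history.
git push -q -f origin pr_that_improve_stuffs
```

Git also provides --force-with-lease, which refuses to overwrite the remote branch if someone else has pushed to it in the meantime; it may be a safer choice on shared branches.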

Now you should see changes in the PR.