Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

when sorting, break ties with lexical comparison #206

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

krancour
Copy link

@krancour krancour commented Apr 25, 2023

The manner in which comparisons are evaluated is correct. This PR doesn't seek to change that. 1.24 is equivalent to 1.24.0, however, when sorting lists that contain both these values, those two may appear in either order in relation to one another.

This PR breaks ties (only when sorting) through a lexical comparison of the original strings, thereby enabling deterministic sort results.

Signed-off-by: Kent <kent.rancourt@gmail.com>
@krancour krancour changed the title when sorting break ties with lexical comparison when sorting, break ties with lexical comparison Apr 25, 2023
@mattfarina
Copy link
Member

I looked at this with testing and I noticed that the order of the output is based on the order of the input.

For example, if I have an input of the following:

raw := []string{
	"5.6.0-alpha1",
	"1.2.3",
	"1.3",
	"1.0",
	"1.3.0",
	"2",
	"0.4.2",
	"5.6-alpha1",
}

The output ordering in the present codebase would be:

[
    0.4.2
    1.0
    1.2.3
    1.3
    1.3.0
    2
    5.6.0-alpha1
    5.6-alpha1
]

The first on the input list is the first on the output list.

Why change from that?

@krancour
Copy link
Author

@mattfarina thanks for the response.

I looked at this with testing and I noticed that the order of the output is based on the order of the input.

I believe you may have lucked into that result. Admittedly, my best efforts to engineer a counter-example are failing, but your package doesn't offer any guarantees of the behavior you are observing. I don't know how it could since you do not implement the sorting algorithm yourself. You only implement comparison operations used in sorting.

Go currently uses the PDQ sort -- a variation on a quicksort -- and I cannot see where that algorithm would guarantee the behavior you are observing either. Even if it coincidentally did, it's only as of fairly recently that Go uses PDQ and who's to say that their default sorting algorithm cannot change again in the future?

Beyond this, I would strongly assert that the whole point of sorting is to impose a deterministic order on unordered data and that the original order of the input should have no bearing on the result. In other words, why shouldn't sorting [1.1, 1.1.0] and [1.1.0, 1.1] yield the same result? To emphasize this further, consider that the order of the input cannot be controlled in all cases.

I can also point to an actual occurrence of the present behavior causing a bug in the wild:

argoproj-labs/argocd-image-updater#375

I can only speculate what conditions may have precipitated that. Perhaps the initial order of the input was itself not consistent between runs, but as I said, it defeats the point of sorting if the order you start with matters.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants