Clarifying Lexical Comparison for Identifier Precedence (Again) #970

BenjaminHolland opened this issue Aug 31, 2023 · 2 comments

@BenjaminHolland

I've looked at #832 and #561, and I'm still not sure how to resolve a comparison like

1.0.0-alpha.1
1.0.0-1.alpha

Is it safe to assume that the comparison here will be alphanumeric, even though one field in each comparison is numeric?

@hoelzeli

hoelzeli commented Sep 3, 2023

I think so, yes. 11.4 is the relevant section:

[...] Precedence for two pre-release versions with the same major, minor, and patch version MUST be determined by comparing each dot separated identifier from left to right until a difference is found [...]

So the pre-release part of each version is split into identifiers like so (written as Python lists, with numeric identifiers shown as integers and the rest as strings):
1.0.0-alpha.1 --> prerelease identifiers: ["alpha", 1]
1.0.0-1.alpha --> prerelease identifiers: [1, "alpha"]

The comparisons would therefore be:

  1. "alpha" <--> 1
  2. 1 <--> "alpha"

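The second comparison is never reached: the first one already differs, and the non-numeric "alpha" has higher precedence than the numeric 1 (section 11.4.3). So 1.0.0-alpha.1 has higher precedence than 1.0.0-1.alpha.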
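
The rules in section 11.4 can be sketched in Python roughly as follows. This is only an illustration, not a reference implementation: the function names are made up for this example, and it assumes both inputs are valid SemVer strings with the same major, minor, and patch version and a pre-release part.

```python
def split_prerelease(version):
    """Return the pre-release identifiers, with digit-only ones as int."""
    without_build = version.split("+", 1)[0]      # build metadata is ignored
    _core, _, prerelease = without_build.partition("-")
    if not prerelease:
        return []
    return [int(p) if p.isascii() and p.isdigit() else p
            for p in prerelease.split(".")]


def compare_identifiers(a, b):
    """Return -1, 0, or 1 for a single pair of identifiers."""
    a_num, b_num = isinstance(a, int), isinstance(b, int)
    if a_num != b_num:
        # Numeric identifiers always rank below non-numeric ones.
        return -1 if a_num else 1
    # Both numeric (integer order) or both strings (code point order,
    # which is ASCII order for valid identifiers).
    return (a > b) - (a < b)


def compare_prerelease(v1, v2):
    """Compare two pre-release versions that share major.minor.patch."""
    ids1, ids2 = split_prerelease(v1), split_prerelease(v2)
    for a, b in zip(ids1, ids2):
        result = compare_identifiers(a, b)
        if result != 0:
            return result                          # first difference decides
    # All shared identifiers are equal: the longer list ranks higher.
    return (len(ids1) > len(ids2)) - (len(ids1) < len(ids2))


print(compare_prerelease("1.0.0-alpha.1", "1.0.0-1.alpha"))  # 1
```

A result of 1 means the first argument has the higher precedence, so 1.0.0-1.alpha sorts before 1.0.0-alpha.1.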

@jwdonahue
Contributor

jwdonahue commented Sep 14, 2023

So first we have section 9 of the spec:

A pre-release version MAY be denoted by appending a hyphen and a series of dot separated identifiers immediately following the patch version. Identifiers MUST comprise only ASCII alphanumerics and hyphens [0-9A-Za-z-]. Identifiers MUST NOT be empty. Numeric identifiers MUST NOT include leading zeroes. Pre-release versions have a lower precedence than the associated normal version. A pre-release version indicates that the version is unstable and might not satisfy the intended compatibility requirements as denoted by its associated normal version. Examples: 1.0.0-alpha, 1.0.0-alpha.1, 1.0.0-0.3.7, 1.0.0-x.7.z.92, 1.0.0-x-y-z.--.

Then there's section 11 (item 4):

Precedence for two pre-release versions with the same major, minor, and patch version MUST be determined by comparing each dot separated identifier from left to right until a difference is found as follows:

  1. Identifiers consisting of only digits are compared numerically.
  2. Identifiers with letters or hyphens are compared lexically in ASCII sort order.
  3. Numeric identifiers always have lower precedence than non-numeric identifiers.

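Fortunately, the ASCII codes for the digits [0-9] are lower than the codes for [A-Z] and [a-z], so as long as your string consists of ASCII or UTF-8 characters in those ranges, a plain character comparison of the first fields settles this pair:

'a' > '1' (i.e., 97 > 49), and you're done.

So there's no need to ever convert a numeric identifier to a scalar like int or long; it's just wasteful. Two caveats apply to the general case. When both fields are all digits, compare their lengths first: leading zeroes are not allowed, so the longer field is the larger number, and only equal-length fields can be compared character by character. And an all-digit field always ranks below a field containing a letter or hyphen, regardless of the first characters (the hyphen, code 45, sorts below the digits).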
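
A minimal sketch of comparing two identifiers purely as strings, with no integer conversion. It assumes both are valid SemVer identifiers (ASCII only, no leading zeroes on numeric ones), and the function name is hypothetical. It checks the digit-only status and the length before comparing characters, because a bare left-to-right character comparison would rank "10" below "9" and "1a" below "2".

```python
def compare_identifier_strings(a, b):
    """Return -1, 0, or 1 without converting either identifier to int."""
    a_num = a.isascii() and a.isdigit()
    b_num = b.isascii() and b.isdigit()
    if a_num != b_num:
        # A digit-only identifier always ranks below a non-numeric one.
        return -1 if a_num else 1
    if a_num and len(a) != len(b):
        # No leading zeroes, so more digits means a larger number.
        return -1 if len(a) < len(b) else 1
    # Same kind and, for numbers, same length: code point order is enough.
    return (a > b) - (a < b)


print(compare_identifier_strings("alpha", "1"))  # 1
print(compare_identifier_strings("10", "9"))     # 1
print(compare_identifier_strings("2", "1a"))     # -1
```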

I do always split the string on the dots and make sure the first three fields are pure numeric characters and the remaining fields are in the expected character ranges, before doing any comparisons. While not explicitly stated in the spec, the SemVer rules do not apply to non-SemVer strings. Such comparisons require consideration of implicit and explicit semantics, if any, of the non-SemVer string and may also require attention to cultures.
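
One way to sketch that validation step is a single regular expression built from the identifier rules in section 9, rather than splitting by hand. The pattern and the `is_semver` name are assumptions made for this example, not the expression published with the spec.

```python
import re

NUMERIC = r"(?:0|[1-9][0-9]*)"                      # no leading zeroes
# Pre-release identifier: a number, or anything with a letter or hyphen.
PRE_ID = r"(?:0|[1-9][0-9]*|[0-9]*[A-Za-z-][0-9A-Za-z-]*)"
BUILD_ID = r"[0-9A-Za-z-]+"

SEMVER_RE = re.compile(
    rf"{NUMERIC}\.{NUMERIC}\.{NUMERIC}"              # major.minor.patch
    rf"(?:-{PRE_ID}(?:\.{PRE_ID})*)?"                # optional pre-release
    rf"(?:\+{BUILD_ID}(?:\.{BUILD_ID})*)?"           # optional build metadata
)


def is_semver(text):
    """Return True if the whole string looks like a SemVer version."""
    return SEMVER_RE.fullmatch(text) is not None


print(is_semver("1.0.0-x-y-z.--"))  # True
print(is_semver("1.0.0-01"))        # False
```

Only strings that pass a check like this should be handed to the precedence comparison; anything else needs its own rules.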

