feat: Add specification for SemVer Ranges #584

Open
wants to merge 1 commit into
base: master

isaacs
Contributor

@isaacs isaacs commented Jun 22, 2020

This is just a first draft, and I'm sure there's still a lot of explanatory prose that could be really helpful. Some open questions:

  • @steveklabnik Does this reflect what the semver crate is doing with range parsing?
  • Other implementors: how far is this spec from what you're doing with semver ranges? Would it be onerous or disruptive to implement in this way? Does anything conflict with your system's behavior today, or are there other features you've implemented that should go in this spec, so we can drive to consistency? (I seem to recall that ~ in particular is subtly different between rubygems/bundler and npm?)
  • Does this belong in semver.org, or should it be its own standalone thing, like semver-ranges.org or something? (I'd definitely like it to be owned by this group, whether it's on this specific domain name or another one.)
  • What questions does it raise for you that you'd like answered? What questions do you expect from others in your communities?

(Note: originally posted incorrectly at semver/semver.org#280, moved over here where it's more appropriate.)

@isaacs added the consensus seeking (The discussion is not over yet), extend (Brand new ideas/rules to add to the specification), and RFC (Request for comments state for next version) labels Jun 22, 2020
ranges.md Outdated
License
-------

[Creative Commons ― CC BY 3.0](http://creativecommons.org/licenses/by/3.0/)
Contributor

Suggest linking to creative commons using https

ranges.md Outdated
* `>` Greater than
* `>=` Greater than or equal to
* `=` Equal. If no operator is specified, then equality is assumed,
so this operator is optional, but _may_ be included to differentiate it
Contributor

Should this be MAY ?

Contributor Author

Yes. Thanks. (Switching between different styles in different rfc repos.)

ranges.md Outdated

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD",
"SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be
interpreted as described in [RFC 2119](http://tools.ietf.org/html/rfc2119).
Contributor

tools.ietf.org should be linked to using https

ranges.md Outdated
2.0.0](https://semver.org/spec/v2.0.0.html)

1. A SemVer Range is a set of 1 or more `Comparator Sets`, joined by `||`.
A SemVer Version String is included by the SemVer Range if it is
Contributor

On line 84, String is lowercase, and the definitions on 73-75 only include SemVer and SemVer Version, implying that String is not part of the official term and should be lowercase.

ranges.md Outdated
resolved with the relevant operator attached as described above
in this section.

1. SemVer Build metadata (that is, identifiers prefixed by `+`) are not
Contributor

Is metadata singular or plural? If singular, it should be is not, but since data is plural metadata is arguably a plural. Not sure

Contributor Author

"Data" and "metadata" are mass nouns, so singular form is always appropriate. This is a valid suggestion.

ranges.md Outdated
1. A Comparator Set is a set of 1 or more `Comparators`, joined by 1 or
more space characters. (That is, `' '`, ASCII value 32.)

1. SemVer Version strings with a PRERELEASE version MUST be excluded from a
Contributor

I would suggest waiting until after the Comparator is defined below to specify the interaction with pre release versions, so the reader understands what the set is composed of and how it interacts with the version strings before examining that interaction in greater detail

Contributor Author

Yeah, I'd considered just saying that it's part of the Comparator definition, but that's not quite accurate.

For example, the Comparator Set { >=1.2.3-a, <2.0.0 } would match 1.2.3-b, even though the Comparator <2.0.0 would not include it on its own. So this logic does properly live in the Comparator Set definition.

Maybe Comparators should just be described before describing Comparator Sets at all?
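
To make the set-level rule concrete, here is a minimal Python sketch of the behaviour described in this reply. It is an illustration only, not code from node-semver or the draft: the helper names (`parse`, `precedence_key`, `set_includes`) and the `include_prerelease` flag are hypothetical, and comparators are passed as already-parsed `(operator, version)` pairs.

```python
import operator
import re

_VERSION = re.compile(r"^([0-9]+)\.([0-9]+)\.([0-9]+)(?:-([0-9A-Za-z.-]+))?$")
_OPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
        ">": operator.gt, ">=": operator.ge}


def parse(text):
    """Split 'MAJOR.MINOR.PATCH[-PRERELEASE]' into (core tuple, prerelease identifiers)."""
    match = _VERSION.match(text)
    if match is None:
        raise ValueError(f"not a SemVer version: {text!r}")
    major, minor, patch, pre = match.groups()
    identifiers = tuple(pre.split(".")) if pre else ()
    return (int(major), int(minor), int(patch)), identifiers


def precedence_key(text):
    """Sort key: releases rank above prereleases, numeric identifiers below alphanumeric ones."""
    core, pre = parse(text)
    idents = tuple((0, int(p), "") if p.isdigit() else (1, 0, p) for p in pre)
    return (core, not pre, idents)


def set_includes(comparators, version, include_prerelease=False):
    """Hypothetical Comparator Set check with the prerelease exclusion applied per set."""
    key = precedence_key(version)
    if not all(_OPS[op](key, precedence_key(bound)) for op, bound in comparators):
        return False
    core, pre = parse(version)
    if not pre or include_prerelease:
        return True
    # A prerelease is only admitted when some comparator in the same set
    # names a prerelease on the same MAJOR.MINOR.PATCH tuple.
    return any(parse(bound)[0] == core and bool(parse(bound)[1])
               for _, bound in comparators)
```

With the set `[(">=", "1.2.3-a"), ("<", "2.0.0")]` this admits `1.2.3-b` but rejects `1.5.0-b`, while `[("<", "2.0.0")]` on its own rejects `1.2.3-b`, which is why the check cannot live in a single Comparator.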

ranges.md Outdated
primitive ::= ( '<' | '>' | '>=' | '<=' | '=' ) partial
partial ::= xr ( '.' xr ( '.' xr qualifier ? )? )?
xr ::= 'x' | 'X' | '*' | nr
nr ::= '0' | ['1'-'9'] ( ['0'-'9'] ) *
Contributor

Instead of nr's current definition, I suggest repeating the same definitions of digit and digits used in the SemVer specification:

<digits> ::= <digit>
           | <digit> <digits>

<digit> ::= "0"
          | <positive digit>

<positive digit> ::= "1" | "2" | "3" | "4" | "5" | "6" | "7" | "8" | "9"

and then specifying that a number is <digit> | <positive digit> <digits>

Contributor Author

Yeah, good call, it should probably use the same names and bnf style as the SemVer spec.

Also, looking at this more closely, it's not quite accurate. It allows 1.x.2, which isn't a valid range. (Numbers cannot come after the X identifier.)
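
As a rough illustration of the two constraints discussed here (numbers without leading zeros, and no number after an X identifier), a small Python validator might look like the following. The function name `is_valid_partial` is made up for this sketch, and qualifiers (prerelease and build parts) are deliberately left out.

```python
import re

# A number is "0" or a positive digit followed by any digits (no leading zeros).
_NUMBER = re.compile(r"0|[1-9][0-9]*")
_WILDCARDS = {"x", "X", "*"}


def is_valid_partial(text):
    """Return True if text is a 1- to 3-part partial version with no number after a wildcard."""
    parts = text.split(".")
    if not 1 <= len(parts) <= 3:
        return False
    seen_wildcard = False
    for part in parts:
        if part in _WILDCARDS:
            seen_wildcard = True
        elif seen_wildcard or _NUMBER.fullmatch(part) is None:
            # Either a number follows a wildcard, or the part is not a valid number.
            return False
    return True
```

Under this sketch `1.2.x` and `1.x` are accepted, while `1.x.2` and `01.2.3` are rejected.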

@isaacs
Contributor Author

isaacs commented Jun 25, 2020

Updated to address @DannyS712's comments.

Contributor

@DannyS712 DannyS712 left a comment

Only reviewed the bnf notation


<caret> ::= "^" <partial>

<operator> ::= "<" | ">" | ">=" | "<=" | "=" | ""
Contributor

@DannyS712 DannyS712 Jun 25, 2020

Without re-reading the above, it's confusing why an empty string is a valid operator. Normally I would suggest the opposite, but perhaps define <no operator> as "" and list that as one of the alternatives, instead of "" directly, so there is context.

Contributor Author

An empty string is equivalent to using =. For example, depending on some-module@1.2.3 is equivalent to some-module@=1.2.3.

This is important when you have ranges that list individual versions along with other parts. For example, a vulnerability fixed in version 1.2.3, but then regressed in 1.2.4, and fixed properly in 1.2.5, could express the fixed version set as 1.2.3 || >1.2.4.
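
A short Python sketch of that example, assuming full `MAJOR.MINOR.PATCH` versions only (no prereleases or partial versions). The names `parse_comparator` and `range_includes` are hypothetical; the point is only that a missing operator is read as `=` and that `||` joins alternative Comparator Sets.

```python
import operator
import re

_COMPARATOR = re.compile(r"^(<=|>=|<|>|=)?([0-9]+)\.([0-9]+)\.([0-9]+)$")
_OPS = {"<": operator.lt, "<=": operator.le, "=": operator.eq,
        ">": operator.gt, ">=": operator.ge}


def parse_comparator(text):
    """Return (operator, (major, minor, patch)); a missing operator means '='."""
    match = _COMPARATOR.match(text)
    if match is None:
        raise ValueError(f"unsupported comparator: {text!r}")
    op = match.group(1) or "="
    return op, tuple(int(match.group(i)) for i in (2, 3, 4))


def range_includes(range_text, version_text):
    """True if any '||'-separated Comparator Set has all comparators satisfied."""
    # A bare version parses as '=' plus its triple; only the triple is needed here.
    version = parse_comparator(version_text)[1]
    for comparator_set in range_text.split("||"):
        comparators = [parse_comparator(c) for c in comparator_set.split()]
        if all(_OPS[op](version, bound) for op, bound in comparators):
            return True
    return False
```

For the vulnerability example, `range_includes("1.2.3 || >1.2.4", v)` holds for `1.2.3` and `1.2.5` but not for `1.2.4`.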

ranges.md

The characters `I`, `J`, `K`, `L`, `M`, and `N` are all presumed to be
positive integers without a leading zero, as allowed by the SemVer MAJOR,
MINOR, and PATCH sections.
Contributor

The SemVer MAJOR/MINOR/PATCH sections allow for just 0, which isn't technically positive (and since it's a zero by itself, arguably it's a leading zero?).

Contributor Author

Good catch. I'll update to clarify.

Contributor

0 by itself is not a leading zero. To be leading, it must have digits following it.

Member

non-negative integer, then?

ranges.md Outdated

<comparator set> ::= <hyphen> | <simple set> | ""

<logical or> ::= <optional space> "||" <optional space>
Contributor

Rather than needing to define an <optional space> rule, suggest:
<logical or> ::= [<space>] "||" [<space>]

Contributor Author

I tried to limit the BNF here to the minimal set of BNF grammar used in the Semantic Versions specification. I don't have a strong opinion about that, but I think it would be good to be clear that this is BNF and not EBNF (like the Range grammar in the node-semver module I lifted this from).

ranges.md Outdated

<x identifier> ::= "x" | "X" | "*"

<space> ::= " "
Contributor

Suggest renaming this to <spaces> since it can be more than one

ranges.md Outdated
<space> ::= " "
| " " <space>

<optional space> ::= "" | <space>
Contributor

and this to <optional spaces>

ranges.md Outdated
| "y" | "z"
```

SemVer Implementations Supporting SemVer Range Evaluation
Contributor

I don't think having a list of implementations supporting range evaluation is something that should be included in the specification itself

Contributor Author

Hmm, that's probably a good point. Maybe we can add a top level thing with a list of implementations and the feature level that they support. I'll take it out of this, though.

@isaacs
Contributor Author

isaacs commented Oct 6, 2020

@semver/maintainers Can I get some eyes on this? In particular, does it match the semantics in use in other semver implementations? Or are there changes you'd like to see?

It'd be nice to have an actual spec to point to, rather than just using node-semver as the functional spec for ranges.

@steveklabnik
Member

I've been trying to find the time, but it's been hard. It's still on my list, sorry :(

@indirect
Member

indirect commented Oct 8, 2020

RubyGems is generally on the same page here, but there are some specific differences:

Prereleases

  1. Any letter, anywhere in the version, causes the version to be a PRERELEASE.
  2. The - character is transformed into .pre. in the canonical, stored version number, as in: Gem::Version.new("1.0.1-1") => Gem::Version.new("1.0.1.pre.1").
  3. The prerelease-mode operators do not stop at the next "final" release number. For instance, if prereleases are being included, then >= 1.0.0.pre.1 could be satisfied by 1.0.1.pre.1. This is the opposite of the behavior described above.

Operators

  1. As far as I can tell, <>= specifiers all work the same.
  2. We don't have ^ or ~, and instead have a single ~> operator. It allows the last given number to increase, for example: ~>2 is equivalent to >=2, while ~>2.0 is equivalent to >=2.0, <3.0, and ~>2.2.2 is equivalent to >=2.2.2, <2.3.0.

Partial Versions
As far as I can tell, they work the same.

Hyphen ranges
Not supported at all.

@Seldaek
Contributor

Seldaek commented Oct 8, 2020

Here are some notes from differences I see between the spec and the behavior in Composer.

       For example, the Range `>=1.0.0-alpha` would include the
       Version `1.0.0-beta` but _not_ the Version `1.0.1-beta`.

In Composer, >=1.0.0-alpha includes all versions above, including 1.0.1-beta. We do have a separate way to filter these out though by letting users choose a per-package minimum stability requirement. If I write for example ^2.1 it will resolve to >=2.1.0 <3-dev, but by default exclude all alpha/beta/RCs, if I write ^2.1@alpha it will allow alpha versions within the entire >=2.1.0-alpha <3-dev range. We consider dev to be the lowest stability and support prerelease identifiers using these stabilities + numbers but not just any random string.

Tilde versions ~I.J.K

Composer implemented ~ the same way Bundler's ~> works IIRC, which is to say we allow the last defined digit to be increased. e.g. ~I.J is >=I.J.0 <(I+1).0.0 but ~I.J.K is >=I.J.K <I.(J+1).0.
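
The "last defined digit may increase" rule stated here can be sketched in Python as below. This is an illustration of the two forms given in this comment only (`~I.J` and `~I.J.K`); the function names are hypothetical, and nothing here is taken from Composer's or Bundler's source. The two- and three-segment `~>` examples in the RubyGems comment above follow the same pattern.

```python
def last_digit_tilde_bounds(spec):
    """Return (inclusive lower, exclusive upper) triples for '~I.J' or '~I.J.K'."""
    parts = [int(p) for p in spec.lstrip("~").split(".")]
    if len(parts) == 2:
        major, minor = parts
        # ~I.J is >=I.J.0 <(I+1).0.0
        return (major, minor, 0), (major + 1, 0, 0)
    if len(parts) == 3:
        major, minor, patch = parts
        # ~I.J.K is >=I.J.K <I.(J+1).0
        return (major, minor, patch), (major, minor + 1, 0)
    raise ValueError(f"only ~I.J and ~I.J.K are covered by this sketch: {spec!r}")


def tilde_includes(spec, version):
    """True if a plain 'I.J.K' version falls inside the bounds for spec."""
    lower, upper = last_digit_tilde_bounds(spec)
    candidate = tuple(int(p) for p in version.split("."))
    return lower <= candidate < upper
```

So `~1.2` admits `1.9.0` but not `2.0.0`, while `~1.2.3` admits `1.2.9` but not `1.3.0`.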

Partial versions like I / I.J

We treat those as if they were zero-padded, so I is equal to writing I.0.0, if you want to match a subset you have to explicitly write I.* which IMO is more sensible/expected but YMMV :)

Conclusion

I don't think any of these are "fixable", we can not bring this in line with the spec without breaking BC in major ways for all the existing packages/constraints, so overall I don't think it'd be worth it.

Whether this should be included in semver.org I am not sure.. it might lead to confusion as people will expect things to just work if a package maintainer says it supports semver. If it contained a comprehensive explanation of every package manager's quirks and derivations it might be helpful but that's a ton of work to compile.

Maybe more reasonable would be to list it as a recommendation for future implementors and then link to existing implementations' pages explaining how they work, perhaps including a constraint "eval" tool like https://semver.mwl.be/#!?package=symfony%2Fsymfony&version=%5E4.3&minimum-stability=beta

@steveklabnik
Member

I don't think any of these are "fixable", we can not bring this in line with the spec without breaking BC in major ways for all the existing packages/constraints, so overall I don't think it'd be worth it.

Personally, my stance here is not that we should fix our implementations, but that we should:

  1. define what we do have in common
  2. enumerate the differences and leave it up to "implementation defined", spec-wise.

This allows for greater interoperability, because you can do things like "the rust semver package can have a flag to get npm semantics so that you can build npm-compatible tools."

@isaacs
Contributor Author

isaacs commented Oct 8, 2020

@indirect @Seldaek Thank you for reviewing this! I know it's none of our jobs and we all have more than enough "real" work to do :)

I will dig into the specifics in more detail soon. But I agree with @steveklabnik's comment here:

Personally, my stance here is not that we should fix our implementations, but that we should:

  1. define what we do have in common
  2. enumerate the differences and leave it up to "implementation defined", spec-wise.

I'm not sure it is wise to have implementations referenced too heavily in a specification, but I think there is a way to thread this needle easily enough.

Practically, to accomplish this, I intend to:

  1. Identify what is in common among the major SemVer implementations represented by this group, and outline those as MUST directives in the specification.
  2. Identify each of the differences with MAY (or perhaps SHOULD) directives, with section numbers that are easily referenced.
  3. Where possible, update to say something along the lines of "Implementations MAY support the (whatever) operator or behavior, and if they do, it MUST behave thus and so".

Then over in node-semver or the semver crate, we can say something like "supports 1.4, 2.3, and 6.7 of the optional SemVer Range Specification by default, supports 2.4 and 2.7 if you pass the foobarBaz: true flag", and so on, so at least we're all speaking more or less the same language. (Maybe this declaration of support or lack thereof should be a requirement for calling an implementation spec compliant? "Conformant implementations MUST clearly document their support or lack of support for sections ..." or something?)

At the very least, I think it seems possible for us to avoid situations where we are using the same operator for different semantics.

We don't have ^ or ~, and instead have a single ~> operator. It allows the last given number to increase, for example: ~>2 is equivalent to >=2, while ~>2.0 is equivalent to >=2.0, <3.0, and ~>2.2.2 is equivalent to >=2.2.2, <2.3.0.

I am ashamed to admit this is the first succinct explanation I've ever seen for how ~> works in rubygems. 😊 We used ~ instead (without the >) specifically because it was clearly behaving differently in some cases, and no one who stepped up to write a PR or issue could ever clearly tell me how it was supposed to work. I'll add ~> to the spec separate from ~ (and probably also add it to node-semver when I get the chance).

of the Comparators in the Comparator Set, and one or more of the
following conditions are met:

1. The implementation has provided the user with an option to

Hm. Now you have to be implementation aware to interpret version ranges. Will that not break tools? I feel this is a pretty huge change.

Contributor Author

Just means that any given range evaluation implementation can still be considered spec-compliant if it has an optional includePrereleases mode, or if it always includes prereleases and is clearly documented as such, but that the default expectation should be that prereleases are not included except in the cases listed.

Contributor Author

That is, the tool is the "implementation" here. If a tool includes prereleases, it has to put that in the readme. If it's optional, and off by default, fine. If it doesn't support it, fine. But it should include prereleases in the cases described, regardless.

@jwdonahue
Contributor

jwdonahue commented Jan 23, 2021

Any official range specification for SemVer should not exclude the use of the full spectrum of mathematical operators. At the very least, standard segment/interval notation should be allowed. I would point out that allowing set/range operators on each field is more explicit than the prefix operators commonly in use. In fact, you should define those prefix operators using proper set/range notation.

1.[0,4] | 6.[0,].[,-][,+] Alternatively: 1.[0,5) | 6.[0,][,-][,+]

Says "in the range of major version 1, minor version 0..4 or 6, patch version > 0, with or without prerelease, with or without build meta". Since no tag values are present, the normal precedence rules apply for the prerelease and build meta tags.

1.5.[,][-beta,] Alternate: 1.5.[][-beta,]

Says "in the major range of version 1, minor version 5, any patch level, anything higher than the beta prerelease".

[1.0.0,2.0.0)()[,+] or [1.0.0,2.0.0](-,+]

Shorthand for any major version 1.y.z (2.0.0 is excluded), no prerelease, any build meta tag. The cool thing about this format is you can easily add additional tag types in the future:

[][-,][+,][!,]...

Or any other content that might show up in a version string. In other words, the range notation should not just be for SemVer.

There are tools in the ecosystem with tens of thousands of users, that you may not be aware of. I was involved in writing an internal tool for the Microsoft Windows build system that used similar syntax.


So where you define prefix operators:

  • < less than, [,x) or [,x[.
  • <= less than or equal to, [,x].
  • = equals, [x], {x} or just x.
  • > greater than, (x,] or ]x,].
  • >= greater than or equal to, [x,].

Where x is the version triple.
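
To show how the prefix operators map onto this notation, here is a small Python sketch that checks membership in a whole-version interval. It only covers the bracketed forms listed above with `x` as a full version triple; the per-field forms such as `1.[0,4]`, the tag brackets, and the bare `x` spelling are not handled, and the name `in_interval` is invented for the example.

```python
def _triple(text):
    """Parse 'I.J.K' into a tuple of ints."""
    return tuple(int(p) for p in text.split("."))


def in_interval(spec, version):
    """True if version lies in an interval such as '[1.0.0,2.0.0)' or '[x]'."""
    opener, body, closer = spec[0], spec[1:-1], spec[-1]
    candidate = _triple(version)
    if "," not in body:
        # '[x]' or '{x}': plain equality.
        return candidate == _triple(body)
    low_text, high_text = body.split(",", 1)
    if low_text:
        low = _triple(low_text)
        # '[' includes the lower endpoint; '(' and ']' exclude it.
        if candidate < low or (candidate == low and opener != "["):
            return False
    if high_text:
        high = _triple(high_text)
        # ']' includes the upper endpoint; ')' and '[' exclude it.
        if candidate > high or (candidate == high and closer != "]"):
            return False
    return True
```

An empty endpoint is treated as unbounded, so `[,x)` behaves like `<x` and `[x,]` like `>=x`.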

@isaacs
Contributor Author

isaacs commented Jan 26, 2021

@jwdonahue That is not a bad idea in isolation, and perhaps even a better idea than what exists in the package management ecosystems today.

However, the goal of this range specification is not to design a maximally expressive syntax for version ranges. The goal is to standardize the way that existing major package managers express version ranges. Those package managers are represented by the team that owns this org and is thankfully too busy to make ecosystem-breaking changes to this spec on a regular basis. 😅

Already, this needs to be pared down quite a bit (as suggested by the comments from @indirect, @steveklabnik, and @Seldaek). We are paving cowpaths, and trying to cut an absolute minimum of new trail.

@jwdonahue
Contributor

Well, you could at least use standard notations to define how all those ad hoc schemes behave.

In a software library ecosystem, it is useful to define a specific range of
versions which is known to satisfy a dependency.

In an ideal world, where all publishers of in the ecosystem are following

Suggested change
In an ideal world, where all publishers of in the ecosystem are following
In an ideal world, where all publishers in the ecosystem are following

Comment on lines +158 to +162
1. A "tilde version" of the form `~I.J.K` will match any version with a
MAJOR version equal to `I`, a MINOR version equal to `J`, and a
PATCH version equal to or greater than `K`. These are equivalent to
`>=I.J.K <I.J.(K+1)` or `>=I.J.K <I.J.(K+1)-0` if PRERELEASE
versions are being included.

The lowering to >=/< shown is inconsistent with how the text defines "tilde version". For example the text suggests that the comparator ~1.2.3 should match the version 1.2.4 because I and J are equal and K is equal or greater, yet >=1.2.3 <1.2.4 does not match 1.2.4.

Suggested change
1. A "tilde version" of the form `~I.J.K` will match any version with a
MAJOR version equal to `I`, a MINOR version equal to `J`, and a
PATCH version equal to or greater than `K`. These are equivalent to
`>=I.J.K <I.J.(K+1)` or `>=I.J.K <I.J.(K+1)-0` if PRERELEASE
versions are being included.
1. A "tilde version" of the form `~I.J.K` will match any version with a
MAJOR version equal to `I`, a MINOR version equal to `J`, and a
PATCH version equal to or greater than `K`. These are equivalent to
`>=I.J.K <I.(J+1).0` or `>=I.J.K <I.(J+1).0-0` if PRERELEASE
versions are being included.
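
The inconsistency is easy to check mechanically. The Python sketch below compares the drafted lowering with the suggested one on plain `(major, minor, patch)` tuples; prerelease handling (the `-0` suffix) is left out, and the function names are made up for illustration.

```python
def draft_tilde_bounds(i, j, k):
    """Lowering as drafted: ~I.J.K -> >=I.J.K <I.J.(K+1)."""
    return (i, j, k), (i, j, k + 1)


def suggested_tilde_bounds(i, j, k):
    """Lowering from the suggested change: ~I.J.K -> >=I.J.K <I.(J+1).0."""
    return (i, j, k), (i, j + 1, 0)


def in_bounds(bounds, version):
    """True if lower <= version < upper, comparing tuples field by field."""
    lower, upper = bounds
    return lower <= version < upper
```

For `~1.2.3`, the suggested bounds admit `(1, 2, 4)` as the prose requires, while the drafted bounds admit only `(1, 2, 3)` itself.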

Comment on lines +172 to +173
1. `^0.J.K` will match any version in the range `>=0.J.K
<0.J.(K+1)-0`.

This is inconsistent with the text defining "caret version".

According to "allows changes that do not increment the first non-zero portion of the SemVer (MAJOR,MINOR,PATCH) tuple", one would expect the comparator ^0.1.2 to match the version 0.1.3 because the first non-zero portion has not been incremented, yet >=0.1.2 <0.1.3-0 does not match 0.1.3.

Suggested change
1. `^0.J.K` will match any version in the range `>=0.J.K
<0.J.(K+1)-0`.
1. `^0.J.K` will match any version in the range `>=0.J.K
<0.(J+1).0-0`.

Comment on lines +175 to +176
1. `^I.J.K` will match any version in the range `>=I.J.K
<I.(J+1).0-0`.

This is also inconsistent with the definition. One would expect ^1.2.3 to match the version 1.3.0 because the first non-zero portion has not been incremented, but >=1.2.3 <1.3.0-0 does not match 1.3.0.

Suggested change
1. `^I.J.K` will match any version in the range `>=I.J.K
<I.(J+1).0-0`.
1. `^I.J.K` will match any version in the range `>=I.J.K
<(I+1).0.0-0`.
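
Taken together, the two caret suggestions amount to "bump the first non-zero portion and zero everything after it". A minimal Python sketch under that reading follows. The `^0.0.K` branch is an assumption extrapolated from the "first non-zero portion" wording, not something quoted in these review comments, and the `-0` prerelease suffix is ignored.

```python
def caret_bounds(major, minor, patch):
    """Return (inclusive lower, exclusive upper) tuples for ^MAJOR.MINOR.PATCH."""
    lower = (major, minor, patch)
    if major > 0:
        upper = (major + 1, 0, 0)      # ^I.J.K -> <(I+1).0.0
    elif minor > 0:
        upper = (0, minor + 1, 0)      # ^0.J.K -> <0.(J+1).0
    else:
        upper = (0, 0, patch + 1)      # assumed: ^0.0.K -> <0.0.(K+1)
    return lower, upper


def caret_includes(spec, version):
    """True if the version tuple lies within the caret bounds for the spec tuple."""
    lower, upper = caret_bounds(*spec)
    return lower <= version < upper
```

With these bounds `^1.2.3` admits `1.3.0` and `^0.1.2` admits `0.1.3`, matching the expectations spelled out in the comments.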

greater than the specified portions in the partial version.
Thus:

1. `>I.J` is eqivalent to `>=I.(J+1).0`, or `>=I.(J+1).0-0` if

Suggested change
1. `>I.J` is eqivalent to `>=I.(J+1).0`, or `>=I.(J+1).0-0` if
1. `>I.J` is equivalent to `>=I.(J+1).0`, or `>=I.(J+1).0-0` if

(Same typo in many other places)

Comment on lines +154 to +156
1. The empty string, asterisk `*`, capital `X` and lowercase `x` are
equivalent to `>=0.0.0`, or `>=0.0.0-0` if PRERELEASE versions are
being included.

The other "<major x>" syntax forms (like *.* and x.*.X) presumably fall into this as well. It would be good to phrase this item in a way that includes them. Otherwise they only exist as syntax in the grammar but with no semantics attached anywhere.

Comment on lines +181 to +183
1. Partial versions are versions of the form, `I` or `I.J` rather than
`I.J.K`, and behave the following impacts when used in SemVer
Ranges.

The other syntax forms for partial versions (I.*, I.*.*, I.J.*) should be mentioned somewhere here. Otherwise we've never attached any semantics to that syntax.

Member

@steveklabnik steveklabnik left a comment

Took a long time, but here's a first-pass review. Also second the stuff @dtolnay said :)

* `>=` Greater than or equal to
* `=` Equal. If no operator is specified, then equality is assumed,
so this operator is optional, but MAY be included to differentiate it
from a SemVer Version string.
Member

This is one area where Cargo differs from Node, and so I'd prefer this latter part be relegated to elsewhere where we talk about implementation defined things.

@JC3

JC3 commented Dec 12, 2021

@JC3

Possibly (I'm not convinced on this one though) define that a leading v is allowed (or at least, should be accepted but should not be used in new applications). Many implementations strip this out, and many don't -- which is a hint that v support is desired and so a consistent definition would at least ensure that all implementations behave the same when encountering one.

The spec has been agnostic on prefix and postfix strings since 1.0.0. It defines only the version syntax, not the text it might be embedded in.

Makes sense thanks!

Just thinking "aloud": Only counterpoint I can think of is that when certain prefixes are typically included without whitespace separation, realistically they tend to become integral parts of regexes in parsers, so they end up being closely tied to parsing. Otoh that's once again countered by the fact that proper semvers start with digits so it's pretty easy to skip nondigit prefixes in a spec-agnostic way.

So... Yep, still makes sense to not mention the v in the spec, ha. 👍 I'll scratch that one off my list.

@Stargateur

Stargateur commented May 11, 2022

I think there is a fundamental misconception here: this mixes up SemVer's concepts of precedence and compatibility. npm does this too, and so does Cargo (I'm working on fixing it). The actual range operators are >, >=, <, <=, and =, while the compatibility operators are ^ and ~. Range operators should follow SemVer's precedence rules, which are clearly defined in rule 11 with a complete example: 1.0.0-alpha < 1.0.0-alpha.1 < 1.0.0-alpha.beta < 1.0.0-beta < 1.0.0-beta.2 < 1.0.0-beta.11 < 1.0.0-rc.1 < 1.0.0. Those rules make total sense to me. They clearly state which version is more recent and which is older. That is the very purpose of the precedence rules.
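
For reference, the precedence ordering quoted from rule 11 can be reproduced with a short Python sort key. This is a sketch that ignores build metadata and does not validate identifiers; `precedence_key` and `RULE_11_CHAIN` are names chosen for this example.

```python
import re

_VERSION = re.compile(r"^([0-9]+)\.([0-9]+)\.([0-9]+)(?:-([0-9A-Za-z.-]+))?$")

# The example chain from SemVer 2.0.0 rule 11, lowest precedence first.
RULE_11_CHAIN = [
    "1.0.0-alpha", "1.0.0-alpha.1", "1.0.0-alpha.beta", "1.0.0-beta",
    "1.0.0-beta.2", "1.0.0-beta.11", "1.0.0-rc.1", "1.0.0",
]


def precedence_key(text):
    """Build a sort key that orders versions by SemVer precedence."""
    match = _VERSION.match(text)
    if match is None:
        raise ValueError(f"not a SemVer version: {text!r}")
    major, minor, patch, pre = match.groups()
    identifiers = pre.split(".") if pre else []
    # Numeric identifiers compare as integers and rank below alphanumeric ones;
    # a shorter identifier list ranks below a longer one with the same prefix.
    idents = tuple((0, int(p), "") if p.isdigit() else (1, 0, p) for p in identifiers)
    # A version without a prerelease ranks above any prerelease of the same core.
    return ((int(major), int(minor), int(patch)), not identifiers, idents)
```

Sorting the chain in any order with `sorted(versions, key=precedence_key)` returns it in the order shown in the rule.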

Then we have the compatibility operators. These MUST be handled completely differently. SemVer clearly defines what is considered a breaking change: a MAJOR change (MINOR for 0.*.*). The only thing that is not defined is compatibility for pre-releases.

A pre-release version indicates that the version is unstable and might not satisfy the intended compatibility requirements as denoted by its associated normal version.

This means we should expect a pre-release to have no compatibility except with itself. (In my opinion)

Here you are trying to use comparison operators as compatibility operators, which is a bad way to use them. 99% of SemVer users SHOULD use the =, ^, or ~ operators. Use of >, >=, <, and <= is very specific: the use case is mostly to allow a range of non-compatible versions. That is, even if 1.0.0 and 2.0.0 are in theory not compatible, if the breaking changes do not affect your particular usage, you can opt in to both versions using >= 1.0.0, <= 2.0.0.

You are trying to exclude pre-releases from precedence; putting them in their own space only creates exceptions to the rules. We should just make it clear that precedence rules are NOT compatibility rules.

@ljharb
Contributor

ljharb commented May 11, 2022

It’s not just about that; you never accidentally want prereleases included.

@Stargateur

Stargateur commented May 11, 2022

@ljharb You didn't include them by accident: you used a range operator, which SHOULD follow the precedence rules, so you explicitly asked for them (these rules are clearly in the SemVer specification, as I said in my previous comment). If you want a range operator that also respects compatibility, you should create a new operator, not break the math operators and contradict logic. In my opinion ^ is enough (which is why I'm working to clearly state that a pre-release should never have any compatible version other than itself). You could add new operators that exclude pre-releases and otherwise follow the same logic as the range operators. But changing the existing math operators is clearly a bad idea. Teaching users to avoid >=, >, <=, and < (unless they have a very good reason) and to prefer ^, which lets a resolver pick the most suitable compatible version, is WAY better than introducing exceptions into the precedence rules because people misuse range operators in npm (or anywhere else).

Contributor

ljharb commented May 11, 2022

What’s in the spec’s non-normative prose isn’t quite as important as the expectations of the massive ecosystems using these range operators with semver and prereleases.

We should be teaching users that prereleases may only be used explicitly, and never by accident - otherwise, prereleases serve no purpose, since they’re full releases - which is what this PR does.

Stargateur commented May 11, 2022

What’s in the spec’s non-normative prose isn’t quite as important as the expectations of the massive ecosystems using these range operators with semver and prereleases.

That's not an argument. Before Semver there were plenty of versioning systems; after Semver there are still a lot of versioning systems (but fewer, I think?). Here, instead of fixing improper usage of range operators in npm projects (or elsewhere), you try to make it a standard. You try to impose rules based on npm usage without taking into consideration what is best for the future. This can only lead to what people always do: "I will have my own versioning system".

As my last point, Semver's very first and clear purpose was "a simple set of rules"; that is what made Semver a success. This proposal is not simple at all: it introduces exceptions into the precedence rules, and in my book exceptions are generally a bad thing.

Contributor

ljharb commented May 11, 2022

This usage of >= etc is quite proper.

Stargateur commented May 11, 2022

I explained why not in my first comment, #584 (comment):

99% of Semver users SHOULD use the =, ^ or ~ operators. Use of >, >=, <, <= is very specific; the use case is mostly to allow a range of non-compatible versions. That is, even if in theory 1.0.0 and 2.0.0 are not compatible, when your particular usage is not affected by the breaking change you can opt in to both versions using >= 1.0.0, <= 2.0.0.

Please give arguments; do not just say what you think. I take a lot of my time to answer you, and I'm not a native English speaker. I will not answer any more if you just contradict without argument. Please add at least some examples of what you consider proper usage of range operators (or improper; put another way, which usage of range operators is not handled by the compatibility operators).

Contributor Author

isaacs commented May 11, 2022

The argument for the semantics proposed in this spec, which npm and cargo have used for many years with generally good results, is this: Semver ranges are only about dependency resolution. As such, even the "math" style operators (>, >=, etc) are designed to best serve the needs of a user who is resolving a dependency.

Philosophically, it is wrong to attempt to talk about the correctness of the "semantics" of a grammar or language apart from the community of people who are actually using that language to communicate. You say that it's wrong for >1.0.0 to not match 1.0.1-alpha.9 by default. But why is it wrong? Semver range operators are not strictly about precedence; they are about compatibility. They are not governed by the rules of mathematics, but by the rules of human language.

If we were to say that only the non-math range operators exclude prereleases, who is served? Certainly not the user depending on >=1.0.0 <1.2.0, who will be very surprised to have their dependency satisfied by an incompatible 1.2.0-alpha.0.
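
To make that surprise concrete, here is a minimal sketch that evaluates >=1.0.0 <1.2.0 using SemVer precedence alone, with no prerelease exclusion. The helper names (`parse`, `precedence_key`, `in_range`) are made up for this illustration and do not come from npm, cargo, or any other library.

```python
def parse(version):
    """Split 'MAJOR.MINOR.PATCH[-PRERELEASE][+BUILD]' into (core, prerelease ids)."""
    version = version.split("+", 1)[0]  # build metadata never affects precedence
    core, _, pre = version.partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    return (major, minor, patch), (pre.split(".") if pre else [])


def precedence_key(version):
    """Sort key that orders versions by SemVer precedence."""
    core, pre = parse(version)

    def ident_key(ident):
        # Numeric identifiers compare as numbers and rank below alphanumeric ones.
        return (0, int(ident), "") if ident.isdigit() else (1, 0, ident)

    # A release (no prerelease tag) ranks above every prerelease of the same core.
    return (core, 0 if pre else 1, [ident_key(i) for i in pre])


def in_range(version, lower, upper):
    """True when lower <= version < upper, judged by precedence alone."""
    return precedence_key(lower) <= precedence_key(version) < precedence_key(upper)


print(in_range("1.1.5", "1.0.0", "1.2.0"))          # an ordinary release inside the range
print(in_range("1.2.0-alpha.0", "1.0.0", "1.2.0"))  # the prerelease of the excluded version
print(in_range("1.2.0", "1.0.0", "1.2.0"))          # the upper bound itself
```

Under pure precedence the second call is accepted, because 1.2.0-alpha.0 sorts below 1.2.0. That is exactly the match the user writing <1.2.0 did not intend.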

If we say that any range with a prerelease must be specified exactly, or NEVER match, well, this is inconvenient and counter to the aims of a beta tester opting in to a rapidly updating prerelease in active development, who depends on ^1.0.0-beta.2 and can't easily pull in the fixes that land on 1.0.0-beta.7.

If you are going to say that one is more "correct" than the other, you should be prepared to show that the more "correct" behavior is closer to the intent expressed by those who would use the range in that way. That is what "correct" means in a context such as this.

This specification text lacks this explanation, which I think is a shortcoming, and I can see how, lacking that explanation, this would not be clear. It should more clearly describe the aims and reasoning. Thank you for highlighting the issue.

Stargateur commented May 11, 2022

Semver range operators are not strictly about precedence; they are about compatibility

Here we disagree: <, <=, >, >= are universally seen as math comparison operators. Also, Semver clearly uses the < symbol in its examples for the precedence rules. A new user of Semver could reasonably expect that comparison operators are for ordering. Why does the example say A < B, yet when I write > A it doesn't match B?!? It's counterintuitive.

If we were to say that only the non-math range operators exclude prereleases, who is served? Certainly not the user depending on >=1.0.0 <1.2.0, who will be very surprised to have their dependency satisfied by an incompatible 1.2.0-alpha.0.

First, >=1.0.0 <1.2.0 should be written ^1.0.0 <1.2.0; problem solved.

I totally agree, comparison operators would be pretty much useless, and I don't see that as a problem; they should not be used. They are very bad for defining compatible dependency requirements; they are a hack around the Semver compatibility rules. That's why you have a lot of trouble with them.

You should propose a better solution for this. I have worked a lot on thinking of a better solution, and so far I think an OK solution is to use the "or" feature: let's say I want version 1 or 2 of something because I can handle them both; I would write ^1 || ^2. This would say "take the best version that is compatible with either version 1 or version 2". This would naturally avoid any pre-release and is totally explicit.

The problem with this solution is that if you want to handle 50 versions you will need to write ^1 || ^2 || ^3 || ^4..., but I think it's very unlikely that such a requirement exists in practice. Being able to handle two major releases is already rare enough; I would be surprised if you could find me a real case where something accepts a range of more than 2 or 3 major releases (and I would probably argue that there is a problem somewhere). If ranges are REALLY a required feature, I would add a new operator, something like |> 1, that could mean "take only stable releases greater than 1", but as I said I think it's overkill, especially with the complex rules you describe in your RFC.

We could advise writing >1, <3.0.0-0; that would still include new pre-releases of version 2, but that's very unlikely to happen; it's not a perfect solution. We could also add a NO-PRE-RELEASE operator, as in >1, <3, NO-PRE-RELEASE; this is at least very explicit and clear.

If we say that any range with a prerelease must be specified exactly, or NEVER match, well, this is inconvenient and counter to the aims of a beta tester opting in to a rapidly updating prerelease in active development, who depends on ^1.0.0-beta.2 and can't easily pull in the fixes that land on 1.0.0-beta.7.

This is exactly where a range could be useful: when compatibility is unclear and the user wants to define it by hand, as in >1.0.0-beta.2, <1.0.0-gamma. But if you want users to have complex pre-release versioning, I think you had better consider defining compatibility rules between pre-releases. But again, personally I think it's overkill. Pre-releases are snapshot previews; it's strange to expect them to be bug-free or stable, so having rules that let pre-releases receive fixes is hoping for too much, and it also pushes pre-releases toward something much bigger than most maintainers want. Some pre-releases receive compatible updates, but I think it's still better for a user to jump to them explicitly. The precedence rules help users see that a pre-release has been updated.

If you are going to say that one is more "correct" than the other, you should be prepared to show that the more "correct" behavior is closer to the intent expressed by those who would use the range in that way. That is what "correct" means in a context such as this.

Agreed, my messages are opinionated; I'm sorry. I try to include facts and stay as objective as possible. I may have totally failed.

This specification text lacks this explanation, which I think is a shortcoming, and I can see how, lacking that explanation, this would not be clear. It should more clearly describe the aims and reasoning. Thank you for highlighting the issue.

Happy if I could help. I will also say that "I, J, K, L, M, and N" made my motivation to read this RFC go down very quickly; I suggest better naming, such as MAJOR MINOR PATCH REQ_MAJOR REQ_MINOR REQ_PATCH, or something else that aids understanding.

Contributor Author

isaacs commented May 11, 2022

Here we disagree: <, <=, >, >= are universally seen as math comparison operators.

I am sorry, your argument is invalid. They are used as range specifiers today, and have been for over a decade, as a central component of software ecosystems used by tens of millions of people.

You can make the case that it would be better if they were seen as math comparisons, but they are self-evidently not universally seen that way. (Even by people who claim to see them that way! Their demonstrated behavior shows that they often have different expectations than they think they have!)

I totally agree, comparison operators would be pretty much useless, and I don't see that as a problem; they should not be used.

So you say they should not be used that way. I don't agree or disagree, because I do not care much about "should". The fact is they are used that way.

Stargateur commented May 12, 2022

I am sorry, your argument is invalid. They are used as range specifiers today, and have been for over a decade, as a central component of software ecosystems used by tens of millions of people.

I will fall back on my previous comment:

That's not an argument. Before Semver there were plenty of versioning systems; after Semver there are still a lot of versioning systems (but fewer, I think?). Here, instead of fixing improper usage of range operators in npm projects (or elsewhere), you try to make it a standard. You try to impose rules based on npm usage without taking into consideration what is best for the future. This can only lead to what people always do: "I will have my own versioning system".

There are still other versioning systems in the world. Semver is not the only thing used, and I would argue that npm doesn't use Semver but a custom variant (as do many other tools). That's not a problem, but here you clearly assume this is universal. You assume that since "most" use it that way, we should make it official in Semver. Put another way: since npm did this and npm has millions of users, all people who use Semver should follow the rules created by npm. You have a circular argument that you use to justify your choice to push this particular solution.

#584 (comment) clearly shows this: every tool has its own rules, while an RFC like Semver is meant to be universal and simple. I think that you should make an "NPM semver" instead of trying to impose npm rules on Semver, because Semver is meant to be simple. What you propose here is complicated. It's very hard to understand everything. For example, according to what dtonay told me, >1.0.0-alpha, <1.0.0 matches 1.0.0-beta but >1.0.0-alpha, <1.0 does not, and the same goes for >1.0.0-alpha, <1. Why? And I'm not sure that's correct, because even after reading this RFC twice I still don't fully understand the rules you want to put in place. It's unclear to me, but I don't need to understand it fully, because my main point is that these rules are hard to understand.

Also, again, you never answer all my points. I made a lot of points, but you always say "that's the way it is" or "we do this that way, period", and you never explain why ^1.0.0 <1.2.0 shouldn't be used in place of >=1.0.0 <1.2.0, for example. I propose alternative solutions but you dismiss them without even taking them into consideration.

Finally, I totally agree that pre-releases are a trap with >= and co.; I just don't agree that we should bend the rules about them to make them "work". I think the better solution is elsewhere. I will propose my solutions in a Rust RFC that I think will be nice for the Rust world. I don't know if they will be nice for everyone, and I don't even know if the RFC will be accepted. So I will not make any attempt to impose it on Semver, because a solution for Cargo might not apply to npm or other tools. Cargo can help a lot to fix this problem, and I intend to use it to warn users about comparison operators. Also, the Rust ecosystem almost never uses range operators, and we have ways to introduce backward compatibility in Cargo.

Stargateur commented May 12, 2022

I may have changed my opinion about excluding pre-releases from range operators being broken math; it was maybe too pedantic. But I think it should be simple. So I suggest something like this: the range operators >, >=, <=, < never include any pre-release, and >>, >>=, <<=, << always include any pre-release. This would make this proposal much simpler.

Contributor Author

isaacs commented May 12, 2022

@Stargateur Adding a new operator for "greater/less than (including all prereleases)", is an interesting idea, and would not conflict with anything in current usage that I'm aware of. Especially, it would be convenient in cases where a version range might wish to include a set of prereleases across several version tuples, but otherwise omit prereleases. For example, 1 >>1.2.3-beta.9 || ^2.3.4 to indicate that you want to get all prereleases in the 1.x line, but only proper releases in the 2.x line starting with 2.3.4.

However, how does "> < >= <= never matches any prerelease" work when you have something like >=1.0.0-alpha.2 or ^1.0.0-alpha.2? The current behavior captures user intent fairly well, which is: "I am ok with the prereleases of 1.0.0 starting with 1.0.0-alpha.2, but I do not want prereleases of any other version, as those may be unstable in a way I don't expect".
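
That intent can be sketched as a two-step check: first apply the comparators by precedence, then admit a prerelease only when some comparator carries a prerelease tag on the same MAJOR.MINOR.PATCH tuple. This is a simplified illustration of the behavior described above, not the actual npm or cargo code; `matches`, `OPS`, and the list-of-tuples range format are assumptions made for the example.

```python
import operator

OPS = {">": operator.gt, ">=": operator.ge, "<": operator.lt,
       "<=": operator.le, "=": operator.eq}


def parse(version):
    """Split 'MAJOR.MINOR.PATCH[-PRERELEASE]' into (core, prerelease ids)."""
    core, _, pre = version.partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    return (major, minor, patch), (pre.split(".") if pre else [])


def precedence_key(version):
    """Sort key that orders versions by SemVer precedence."""
    core, pre = parse(version)
    ids = [(0, int(i), "") if i.isdigit() else (1, 0, i) for i in pre]
    return (core, 0 if pre else 1, ids)


def matches(version, comparators):
    """comparators is a list of (operator, version) pairs that must all hold."""
    key = precedence_key(version)
    if not all(OPS[op](key, precedence_key(bound)) for op, bound in comparators):
        return False
    core, pre = parse(version)
    if not pre:
        return True  # ordinary releases only need to pass the comparators
    # Prereleases need an explicit opt-in: a comparator with a prerelease tag
    # on the very same MAJOR.MINOR.PATCH tuple.
    for _, bound in comparators:
        bound_core, bound_pre = parse(bound)
        if bound_pre and bound_core == core:
            return True
    return False


opt_in = [(">=", "1.0.0-alpha.2")]
for candidate in ["1.0.0-alpha.1", "1.0.0-alpha.5", "1.0.0", "1.1.0-beta.1", "1.1.0"]:
    print(candidate, matches(candidate, opt_in))
```

With this rule, later prereleases of 1.0.0 and all later ordinary releases satisfy >=1.0.0-alpha.2, while 1.1.0-beta.1 is rejected even though it is greater by precedence.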

There are still valid use cases where it is important to match any and all prerelease versions, however. Consider a security advisory about a bug that exists in v0, v1, and v2 of a module, and is fixed in v1.3.2 and v2.4.3. It would include a vulnerable_versions field of: "<1.3.2 || 2 <2.4.3". In this case, it is very important that 1.2.7-alpha.32 or even 1.3.2-beta.3 would match it. In such a case, yes, it could be written as <<1.3.2 || 2 <<2.4.3, but the option to implement a "prerelease mode", where all prereleases are treated normally by default, is extremely useful, and makes disastrous mistakes much harder to fall into.
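
A rough sketch of the advisory case, comparing the default prerelease gate with a "prerelease mode" that uses precedence only. The range is expanded by hand into comparator sets; in particular, writing the bare 2 as >=2.0.0-0 is my assumption for this example, and `is_vulnerable`, `set_matches`, and the `include_prerelease` flag are illustrative names rather than any library's API.

```python
import operator

OPS = {">=": operator.ge, "<": operator.lt}

# "<1.3.2 || 2 <2.4.3", hand-expanded into OR-ed sets of AND-ed comparators.
# Assumption: the bare "2" is written as ">=2.0.0-0"; its upper bound is
# omitted because "<2.4.3" is stricter.
ADVISORY = [
    [("<", "1.3.2")],
    [(">=", "2.0.0-0"), ("<", "2.4.3")],
]


def parse(version):
    """Split 'MAJOR.MINOR.PATCH[-PRERELEASE]' into (core, prerelease ids)."""
    core, _, pre = version.partition("-")
    major, minor, patch = (int(part) for part in core.split("."))
    return (major, minor, patch), (pre.split(".") if pre else [])


def precedence_key(version):
    """Sort key that orders versions by SemVer precedence."""
    core, pre = parse(version)
    ids = [(0, int(i), "") if i.isdigit() else (1, 0, i) for i in pre]
    return (core, 0 if pre else 1, ids)


def set_matches(version, comparators, include_prerelease):
    key = precedence_key(version)
    if not all(OPS[op](key, precedence_key(bound)) for op, bound in comparators):
        return False
    core, pre = parse(version)
    if include_prerelease or not pre:
        return True  # prerelease mode: precedence is the only criterion
    # Default mode: a prerelease needs a tagged comparator on the same tuple.
    return any(parse(b)[0] == core and bool(parse(b)[1]) for _, b in comparators)


def is_vulnerable(version, include_prerelease):
    return any(set_matches(version, s, include_prerelease) for s in ADVISORY)


for candidate in ["1.2.7-alpha.32", "1.3.2-beta.3", "1.3.2", "2.4.2", "2.4.3"]:
    print(candidate,
          "default:", is_vulnerable(candidate, include_prerelease=False),
          "prerelease mode:", is_vulnerable(candidate, include_prerelease=True))
```

In the default mode the two prerelease versions are not flagged, which is the disastrous mistake for an advisory; with the flag set they are flagged, without rewriting the range.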

note about status of this spec

The SemVer specification is somewhat stalled by inertia, which is not a bad thing, for reasons I will describe. Because the main semver specification only concerns itself with versions, every dependency community has come to invent their own dialect for identifying dependency ranges.

As noted in some of the feedback comments, some do not have any concept of dependency ranges, specifying just a minimum semver, and effectively always prepending ^ to it. Others have some limited subset, or treat >= as how ^ is specified here. Still others support only MIN - MAX syntax, or some other subset of this proposal.

Changing the dialect that a language community uses for dependency identification is extremely disruptive. It is, in many cases, the most disruptive thing possible, because dependencies that could previously be installed no longer can. Builds break, people get upset, and the people affected by the change are often not in a position to fix it.

For example, imagine a language community where name@MAJOR.MINOR.PATCH is treated as >=MAJOR.MINOR.PATCH <(MAJOR+1).0.0. I depend on foo, and foo depends on bar, and bar depends on baz@4.3.2. However, baz@4.3.2 has a critical bug or malware injection, so the maintainers published a fix as baz@4.3.3. The bar maintainer is long gone. We say "oh great, there's a spec now! let's use it!", so now my build will fetch exactly baz@4.3.2, which is gone and my build breaks (or worse, it's not gone and my build doesn't break, but I'm shipping malware!)
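
The baz scenario can be sketched with two toy resolvers: one for the community's existing dialect, where a bare version is a floor within the same major, and one where a bare version is an exact match. Both functions and their names are hypothetical, and prerelease handling is left out to keep the example short.

```python
def core(version):
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of ints (no prerelease support)."""
    return tuple(int(part) for part in version.split("."))


def resolve_floor(spec, published):
    """Existing dialect: name@X.Y.Z means >=X.Y.Z <(X+1).0.0, highest match wins."""
    low = core(spec)
    high = (low[0] + 1, 0, 0)
    candidates = [v for v in published if low <= core(v) < high]
    return max(candidates, key=core, default=None)


def resolve_exact(spec, published):
    """Dialect after the switch: name@X.Y.Z means exactly X.Y.Z."""
    return spec if spec in published else None


still_published = ["4.3.2", "4.3.3"]  # the bad release was left in the registry
unpublished = ["4.3.3"]               # the bad release was removed

print(resolve_floor("4.3.2", still_published))  # picks up the fixed release
print(resolve_exact("4.3.2", still_published))  # pins the bad release
print(resolve_exact("4.3.2", unpublished))      # nothing satisfies the dependency
```

The dependency declaration in bar never changed; only its interpretation did, which is why a dialect change reaches people who are in no position to fix it.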

Back in early 2015, when npm made the change to omit prerelease versions from range matching by default, it caused a lot of problems for a lot of people, even though everyone could recognize it was an unambiguously good idea. There were quite a few packages that would (for example) act as a proxy to some other module, and be published with a version like (internal binary version)-(wrapper version), so you'd see things like 1.4.3-2.0.1 which was intended as a proper release. Others used the prerelease as a less restrictive form of versioning, so they'd have publishes like 0.0.0-1.4.2.8.build.2014-05-02T14-22-43-321.

While these can all be argued to be "abuses of semver", in fact, we had to make a judgement call and choose the option that harmed the fewest people, while providing an upgrade path that made the transition as easy as possible for anyone affected.

If rubygems or NuGet or crates decided to adopt this specification as-is, it would probably not be a good idea. I merely documented what npm has done, because it's what I'm most familiar with, and it is at least an option that has had some robust testing in practice. But this was always intended as a starting point for a conversation to find the version range grammar that is the overlap of what major language communities are using. As it happens, there is not much overlap, so I don't have high hopes of this landing. It might just be best for each platform to endeavor to fully document what they do, so that users can at least be informed.

@Stargateur

Thanks a lot for the detailed explanation. I totally agree with all you said. This PR helped me a lot to understand the problem with ranges and pre-releases. And your last message confirms my feeling and gives me a very important warning about security concerns. I will definitely incorporate this warning into the security concerns part of my work.
