-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Warn on polymorphic comparisons #9928
base: trunk
Are you sure you want to change the base?
Conversation
Also does anybody have any pointer on the compiler failure? I spent almost 2 hours trying to make it compile unsuccessfully (removing
EDIT: @dra27 helped me found the issue. A large number of modules use |
ba1dfbe
to
3376507
Compare
Cannot we directly use alerts instead? |
3376507
to
47f27a4
Compare
Mh, interesting. I hadn't realized this could be used like this while looking at the code for alerts. Indeed this could be used instead, I'll look into it sooner rather than later. |
@alainfrisch Does that sound better with f5a4024? The only downside I can see so far is that it's not going to allow people to use it out-of-the-box if they want to keep backward compatibility with OCaml < 4.08 and use |
@kit-ty-kate: Thanks for this work 🙂 Please forgive the drive-by bikeshedding, but I'm wary of the term "generic comparison" to describe val equal : 'a typ -> 'a -> 'a -> bool (These are distinct from parametrically-polymorphic functions because they are not restricted to behaving uniformly on all types.) See for instance, @yallop's writings on the subject. Perhaps the feature could be named |
I used to call them polymorphic comparison operators/functions in kit-ty-kate/ocaml-monomorphic, so Now that I think of it maybe I'm really not sure what's the best choice here. I was kind of waiting for some better propositions from other people here (you're the first one!) as well as for a better alert message, maybe something more discretive as well. |
I don't think it would be. These functions are known as being polymorphic equality/comparison. Searching the interwebs for that will lead you to the controversies that explain you why it's good or bad. I think using another name would be counter productive for the end user. |
FWIW, the manual uses the term "structural equality":
I'd default to using the same terminology. I would hesitate to use "polymorphic" since I think it usually is implicitly understood as parametric polymorphic, which is not technically applicable in this case. I have no objection to "generic" as it tends to indicate "not specialized" to me, which is applicable in this case. |
"Generic" has been widely used since the 1980's to mean both parametric polymorphism and ad-hoc polymorphism. ADA had "generics"; C++ templates are "generic"; and Cardelli and Wegner's 1985 survey On Understanding Types, Data Abstraction, and Polymorphism writes
In this broad sense, the use of "generic" in the warning discussed here is perfectly appropriate. The narrower meaning of "generic" as "taking a run-time type description / function dictionary as argument" is more recent and not incompatible, just narrower. That doesn't make it the only meaning. I would avoid "dynamic comparison", because most comparisons are "dynamic" (take place at run-time rather than compile-time). "Polymorphic comparisons" is OK, I guess. "Structural comparison" is by opposition to "pointer comparison", a.k.a. |
@xavierleroy: In my defense – and without wishing to be contrarian – in a field in which much terminology has at some point been overloaded by one paper or another, I think it's worth aiming for more than being correct in a "broad sense", particularly given the wide selection of terms to choose from. Uses of "generic" in the sense of run-time type representations often make an explicit effort to differentiate their usage from the genericity implied by parametric polymorphism. This is especially true in Haskell-oriented literature (e.g. SYB and its descendants) and libraries (e.g. GHC.Generics). In this context, the term "generic" is incompatible with parametric polymorphism and is used for precisely this reason; it creates a meaningful distinction, admittedly using words that are overlapping in their non-technical meanings. While I agree with you that there are other interpretations of the term "generic" that make it a superset of "parametrically polymorphic", I think it is precisely the more narrow meaning of "polymorphic" that is pertinent w.r.t. incorrect use of |
I am more used to "polymorphic comparison operator" than "generic comparison operator". (But I don't think either clearly express the fact that the operator is non-parametric.) The reason why we disallow them is that the notion of equality obtained is imprecise, approximate, compared to the "real" equality predicate that the user has in mind. Maybe we could also say that the parametric |
I'd call these 'representational' comparisons, because they operate on the representation of values rather than on the type structure:
I think that 'representational' makes it clear both why these operations are useful (because they don't need to be defined afresh for each type) and why they come with a warning (because they don't respect type structure or abstraction boundaries). |
I renamed to EDIT: updated the main description of this PR above and its title |
I note that you didn't carry over the compiler change from #1642, which would silence the alert for the built-in operations whenever they are specialized (i.e. replaced by non-generic operations at compile time). Are there any plans to do it? Is there any consensus on this point? |
It might be simpler to start without this specialization refinement? That would require the use |
d115652
to
312ad56
Compare
I've rebased the PR. CI is green and this is ready to review. I've also removed the numerous changes to Makefiles to disable the new alert in favour of regenerating |
I agree, let's start with the simple thing and see how it fares. We can make it more complex later if we decide it's worth the work. |
parsing/builtin_attributes.ml
Outdated
@@ -92,6 +92,16 @@ let alert_attr x = | |||
Some (x, "deprecated", string_of_opt_payload x.attr_payload) | |||
| "ocaml.alert"|"alert" -> | |||
begin match kind_and_message x.attr_payload with | |||
| Some ("polymorphic_comparison" as kind, message) -> | |||
let message = match message with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it really worth special casing this new alert here? I guess this is to avoid duplicating the more verbose message everywhere. But someone explicitly enabling the alert would know what it means, just seeing its name, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I personally think it is worth the special case here to avoid duplication.
But someone explicitly enabling the alert would know what it means, just seeing its name, no?
To me, the rational is that if this is enabled in a medium-size or bigger project with several people and someone is contributing for example, they might not be familiar with what the alert means and might be discouraged by only seeing
File "alert_polymorphic_comparison.ml", line 12, characters 9-16:
12 | let _a = compare
^^^^^^^
Alert polymorphic_comparison: Stdlib.compare
for example, if it there was no explanation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fair enough. What about only replacing an empty payload with the full message? If one later introduce functions parametrized over a comparison function, one might want to provide explicit messages to the "legacy" version, and adding the generic message would feel weird.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not completely convinced by the argument for the special casing. Copying the alert message is certainly boring when writing the PR, but this is an one time issue. I would rather keep the alert as vanilla as possible. If this ends up creating too much maintenance pain, it is probably a sign that we should improve the interface for all alerts (and not only the polymorphic_comparison
one).
|
For comparison functions, the solution used by |
312ad56
to
ad23873
Compare
…of polymorphic comparison functions
ad23873
to
194628b
Compare
Thanks a lot, I missed these ones. I've tagged the functions that use polymorphic compare. I hope I haven't missed any.
I agree, I also said that in the PR description. This PR is still useful as-is and could get the ball rolling for people who want to experiment with it, so we can find where the best place or name is to add those new functions/operators.
yes that would be ideal. For the moment people usually default their implementation of the comparison operators to int. This is the case in
I've updated the PR to not have any special case. I wasn't sure about the default message to use, so I didn't put anything. This can always be improved in the future, but just in case this PR can be merged before the 4.13 branching I prefer not to have to make more choices to avoid points of contention. I'm not sure what's wrong with the hygiene script, I've added the alert to all the functions in MoreLabels too but it seems to still complain. Does anyone have any clue? |
let current = | ||
ref | ||
{ | ||
active = Array.make (last_warning_number + 1) true; | ||
error = Array.make (last_warning_number + 1) false; | ||
alerts = (Misc.Stdlib.String.Set.empty, false); (* all enabled *) | ||
alerts = (disabled_alerts, false); (* all enabled expt disabled_alerts *) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expt disabled_alerts
⇒ except [disabled_alerts]
?
@@ -88,29 +88,35 @@ val add : ('a, 'b) t -> 'a -> 'b -> unit | |||
(Same behavior as with association lists.) *) | |||
|
|||
val find : ('a, 'b) t -> 'a -> 'b | |||
[@@ocaml.alert polymorphic_comparison] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The source file for this interface is stdlib/templates/hashtbl.template.mli
.
My concern is that the PR would leave the stdlib in a state which is not really usable when the alert is enabled. People who use a library that shadows the the comparison operator in order to restrict it to type
I see the argument, but I'm not sure people will actually spend time rewriting significant code bases to avoid the marked functions until we provide the "official" target solution (and I'm not sure we want people to create their own solutions, leading to more style fragmentation across the community). In #1642, there was the idea of allowing some uses of the existing polymorphic functions, with various variants of that idea. I was not personally a big fan, but there seemed to be a rather wide support for the idea, meaning that a lot of code would not need to be rewritten at all (and people knowing that this might happen will not enable the alert). I can see why people are happy continuing writing |
In ReScript(was BuckleScript), we had a similar warning, but it only warns when the type specialisation does not happen, so that we can still use "=" for most cases, it seems less intrusive than this proposal |
Would one of the upstream dev be interested in doing an audio call to get this over the line before 5.01 ? I’ve tried several times to integrate the ideas of #1642 to this PR but I don’t know enough about the compiler internals to get it to work and I need some guidance. |
It seems I'm not as much against polymorphic comparison as most people tend to be, but there is one thing I would like to be done, which that the unqualified use of Of course that entails that I just lost a few hours tracking down a regression in a non-trivial computational geometry algorithm I'm implementing because of the following. I had that: module M = struct
let compare v0 v1 = … (* M.t specific [compare] *)
let f … = … (* uses [compare] *)
end Now that I'm pretty sure it's not the first time this happens to me. |
I like @dbuenzli 's idea! After all, polymorphic operators are magic things living in the runtime, |
This of course is not on par with what this PR is trying to propose. But given the I think we should remove its use by basic identifier deprecation (this PR wouldn't help since the proposed warning would be disabled by default). Nothing more is needed for this except to introduce |
I’m not convinced that If you think the name is bad I’m not overly attached to it but i strongly think it should be moved in a module containing only those functions and nothing else, if we’re going this way. |
@kit-ty-kate how about Sys.Poly ? Sys is already a bit of a natural place for magical functions (such as |
For the record: I do not support the proposed deprecation of generic comparisons. But @kit-ty-kate is right that |
Note that I'm not proposing to deprecate generic comparisons. The only thing I'm proposing to deprecate is the bare
I'd say there's already more than that there To echo @c-cube, |
|
This is a new take on #1642 which is a PR @dra27 pointed me towards after explaining a similar idea I had over the holiday.
This PR adds a new builtin alert
[@@ocaml.alert polymorphic_comparison]
and tag polymorphic comparison functions and operators in the standard library with it. This attribute, much like[@@deprecated]
or similar attribute will display a warning when the function tagged is used.This warning is disabled by default and it won't be enabled by default in dune either.
Incentives
Generic comparison functions are great but they are:
Future
In an ideal world, this PR should be the first of four parts:
Infix
modules to some of the modules of the standard library. Such modules (e.g.Int
,Float
, …) would have(=)
,(>=)
and so on.[%ocaml.compare t]
to create comparison functions for big and supposedly straightforward types such asUnix.error
or similar. (I'm not totally sold on this idea yet myself)Regardless of the future plans, I personally think this PR can be merged on its own.