New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow to name existentials in pattern-matching #9584
Conversation
951cf8b
to
6cb3500
Compare
@trefis pinged me about this PR just now. I like the proposed syntax, much better than the one in #9579. There is broad consensus in #7074 that this syntax is acceptable (@yallop, @lpw25 and myself supported this syntax, @craigfe had solid arguments against the other proposal, etc.). I remain of the opinion that it should ideally be possible to have something "more local" in the annotating-the-argument parts, for example (I'm only semi-happy with the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. A few small things to improve, but the code looks correct to me. Well done everyone involved for finding something we can all agree to.
6cb3500
to
0374bc0
Compare
0374bc0
to
df5df3e
Compare
2718514
to
9f6cf10
Compare
After many weeks and two rebases, I think this PR is now ready for inclusion. |
typing/typedtree.mli
Outdated
@@ -89,7 +89,7 @@ and 'k pattern_desc = | |||
*) | |||
| Tpat_construct : | |||
Longident.t loc * Types.constructor_description * | |||
value general_pattern list -> | |||
value general_pattern list * (Ident.t loc list * core_type) option -> | |||
value pattern_desc | |||
(** C [] | |||
C P [P] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment here should be updated, just like in parsetree.mli (thanks for that change, by the way).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spotted a couple of small things to change and maybe a small bug. Good to go once they are addressed.
Note: I think that the feature is very useful/important and that the design we ended up with is good. Thanks a lot @garrigue for this work. |
parsing/parsetree.mli
Outdated
(* C None | ||
C P Some (P, None) | ||
C (P1, ..., Pn) Some (Ppat_tuple [P1; ...; Pn], None) | ||
C (type a b) (P : T) Some (P, Some ([a; b], T)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm late in the discussion, but shouldn't we encode a regular Ppat_constraint to keep the core_type, i.e. just use:
Ppat_construct of loc * (pattern * string loc list) option
(with an empty list when there is no (type ...)
) Requiring to have the inner pattern be a Ppat_constraint when the list is not empty is ok, imo.
I'm concerned that the current representation changes the way an existing constraint would be represented when one adds a type binding (C (P : T)
--> C (type a) (P : T')
).
Can the type-checked behave differently on C (type a) (P : T)
and C (type a) ((P : T) : T)
?
Btw, is it an error to bind type names which are not used in the pattern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about suggesting such nesting, but:
- Currently the annotation is interpreted in the context of the type variables in a very specific way, so it makes sense to keep the two pieces together at the constructor site.
- Even with the representation you propose, my understanding is that the type-checker would have to lookup the
Tpat_constraint
node and use its payload specifically (not go through the usualTpat_constraint
path in this case), so you would not get the same-behavior guarantees that you are asking about.
The handling of existential variables is very specific because, morally, their actual binding is computed by unifying the annotation with the constructor return type, and then making them rigid for the rest of the type-checking. This is not the behavior you would get if you checked the whole subpattern before rigidifying (I think that behavior would make sense and we discussed it earlier, but @garrigue says it is harder to implement), nor of course the behavior you would get by rigidifying before checking the annotation or using flexible variables.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note: the previous discussion I was referring to is starting (approximatively) at #7074 (comment) .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @gasche for the pointer.
-
If the type-checking is not an extension of what happens for
C (P : T)
, reusing the same concrete syntax can really be misleading, no? -
Independently of what the type checker does, having different representation for the T in
C (type a) (P : T)
and inC (P : T)
is confusing for people writing syntactic tools. I'm not shocked that, short of having a cleaner solution, the type-checker does something special with Ppat_constraint on top of the pattern when existential type names are bound. We might be able to change the behavior of the type-checker later, while keeping the same syntax and Parsetree representation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the type-checking is not an extension of what happens for C (P : T), reusing the same concrete syntax can really be misleading, no?
I think that the two readings of C (P : T)
(as a constraint, and as a return-type-annotated-constructor pattern with no existential variables bound) do have the same meaning. What I meant is that if we think of the type-checking of a pattern P
as generating an inference constraint (or some other "result", for example in an non-inferred world taking an scrutinee type and returning types for the bound variables), then the result of type-checking of C (type a...) (P : T)
cannot be computed in a modular lay, as an action on the result of the type-checking of (P : T)
.
Independently of what the type checker does, having different representation for the T in C (type a) (P : T) and in C (P : T) is confusing for people writing syntactic tools.
This is a good point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(When I say "merge this one", I'm not actually sure that all pending questions have been resolved, this is for @lpw25 to confirm. So it may not be the unique blocker yet.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm fine with either the current version or the version where the check is completely moved to the type-checker and the parser is happy to parse patterns with existential binders and no type constraints. I dislike any version where what the type can represent and what the parser can produce are not the same.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just realize that it is wrong to say that
C (type a) (pats : ty)
is equivalent to
C (pats : ty)
when a
occurs nowhere: if C
has arity 2 or more, pats
must syntactically be a list of patterns, and the latter would not be typable. So, sorry, this is really a new syntax.
@gasche indeed suggested that it would be nicer to allow type constraints on each argument, but I answered that it would be difficult to implement it cleanly.
I will still give it a try, just to see how it would work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, IIUC, for a pattern with 2 arguments:
C _
is allowed butC (type a) _
is rejected(?).C (type a) ((x, y) : t)
can be allowed (for a correct type) (?) butC ((x, y) : t)
is always rejected.
Is that right? Are there other differences?
The situation with n-ary arguments was already quite confusing, I fear this is getting even worse...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not quite correct.
C _
is allowed, butC (type a) _
is rejected by the type checker (not the parser) independently of the number of arguments. The(type a)
notation inside patterns is restricted to existential variables, and requires to bind all of them, which you cannot do without a type annotation, hence the rejection.- Annotations are always allowed for multiary types (the internal parenthesis are optional).
type ('a, 'b) pair = Pair of 'a * 'b;;
type t = int * bool;;
let f = function Pair ((x,y) : t) -> x;;
val f : (int, bool) pair -> int = <fun>
type closure = Clos : ('a -> int) * 'a -> closure;;
type 'a app = ('a -> int) * 'a;;
let f = function Clos (type a) ((x,y) : a app) -> x y;;
val f : closure -> int = <fun>
The result looks more regular to me.
You replied "done" to a few remarks, but I don't see the changes. A forgotten push maybe? |
No, this should be there. They are marked as outdated, so the code should have changed. |
I have create another PR implementing(?) @alainfrisch 's idea (#10180). |
dac29a4
to
45d7cc1
Compare
I have merged most of the suggestions here and in #10180 and rebased.
|
I'm getting a compilation error in the flambda backend, but I don't see how this can be related to this PR. |
Note also that neither of these approaches work for GADT constructors with an inline record. |
I'm really confused about what is happening. Apparently the bug comes from this PR, but the error message is:
Very confusing. |
I could eventually fix the problem by being more conservative, but there is a mysterious interaction at work here. |
OK, I found the reason: I had forgotten to reinstate processing of the annotation I'm still keeping the conservative approach for already valid annotations, as removing a |
The typed AST mapper is used for iterating over all the environments in the AST and calling [Env.keep_only_summary] on them before outputing the cmt file. |
Thanks for the explanation, this now makes sense. Is everybody happy with this last iteration of the PR? |
This is minor, but am I right to think that there is no support for naming existentially quantified row variable: type t = X: [> `A | `B ] -> t
let f (X (type x) (v:??)) = () ? |
7dfd2b8
to
aa12a07
Compare
Thanks to everybody in the discussion, and good annotating! |
The directory cc @Octachron |
Thanks for the notice! This should be fixed in trunk. @garrigue , to be safe, on the manual side, was there only a new paragraph in the GADT section? |
Yes, that’s all I added. |
Thanks for the confirmation. Indeed rebase across repertory moves tend to be painful, sorry for the noise! |
…t-principality-and-gadts Update a testcase in principality-and-gadts.ml to reflect changes in ocaml#9584
Patterns without named existentials are not correctly constructed. Issue introduced in ocaml#9584
…mar13/fix-typing-gadts-test-principality-and-gadts Update a testcase in principality-and-gadts.ml to reflect changes in ocaml#9584
This attempts to implement a solution suggested by @yallop in #7074 to the problem of naming existentials in pattern-matching.
For the sake of simplicity(?) the syntax is:
I.e. the type is given as a tuple rather than on individual arguments.
Ellipses are allowed, but all existentials must be matched.