-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use Mp_present for module aliases #10673
base: trunk
Are you sure you want to change the base?
Conversation
Isn't this a show-stopper? The initial motivation for module aliases was precisely to avoid forcing linking of compilation units that would otherwise be unreferenced. I'm confused. |
@xavierleroy I understood the same, and have a version of this PR described here that only makes modules I don't have the data to support a decision either way. @lpw25 in your reply on #10612 it sounded like you might have some numbers on the tradeoffs we'd expect in executable size/runtime cost here vs compilation time with the alternative? |
I was too, but if I understand this correctly the "old" module alias behaviour is unchanged when using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Your decision to restrict the absent aliases to global identifiers is interesting, and probably a good idea. However, I think that it is a little too restrictive in practice. A common practice is to have:
(* a.ml *)
module Bar = Foo__Bar
and:
(* b.ml *)
open! A
module Bar = Bar
where using B
is not supposed to cause Foo__Bar
to get linked unless you actually use B.Bar
. With this patch A.Bar
would be absent, since Foo__Bar
is a global identifier, but B.Bar
would be present since A.Bar
isn't global.
Additionally, if you are going to have a restriction based on the destination of the alias, then you can actually allow absent_globals
to be true
inside the body of functors. The restriction from the body of functors is only needed if you allow all aliases to be absent.
I think there are two reasonable approaches:
- Allow absent aliases for all paths when
absent_globals
istrue
. Don't allow absent aliases in functor bodies or module types. - Allow absent aliases for paths which normalize to a global identifier. Allow absent aliases in functor bodies. Don't allow absent aliases in module types.
1 was what I was proposing in my previous comments, but I think 2 is probably the better design. I think the restriction will be less surprising for users, and it will work better if we add an explicit syntax for absent aliases.
typing/includemod.ml
Outdated
Ok Tcoerce_none | ||
else | ||
(* This path contains local references, we may need to build a | ||
coercion. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interestingly, I think we could avoid this problem if we had transparent ascription. The underlying issue is that given:
struct module A : S module B = A end
: sig module A' : S' module B' = A' end
(where '
is just used to distinguish the two sides) we produce a substitution like:
A' :-> A
but the correct substitution is really:
A' :-> (A <: S')
which would automatically give the right coercion in these cases.
Once we have transparent ascription that may be a better approach, although it requires some care around recursive modules (S'
might refer to A'
, but not in aliases, so those needs to be substituted out first using the current substitution) and also some care to ensure efficiency.
They only have that behaviour in |
I agree that 2 was the real goal for which absent aliases were introduced. The normal form is always known statically. |
If you allow them in module type definitions then the difference will be observable by the user because, whilst absent and present aliases are both subtypes of each other, they are not equal types. Everywhere else you are either covariant or contravariant and so the difference is not detectable. As discussed at the meeting, if you add an actual syntax for absent aliases (I would suggest |
1c7bdb1
to
fd62df1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few small fixes needed. Once they're addressed I think this is ready to go.
@@ -58,11 +58,3 @@ File "w53.ml", line 46, characters 23-36: | |||
46 | module I' = Set.Make [@ocaml.inlined] | |||
^^^^^^^^^^^^^ | |||
Warning 53 [misplaced-attribute]: the "ocaml.inlined" attribute cannot appear in this context | |||
File "w53.ml", line 48, characters 23-30: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this change is revealing an existing issue with detecting incorrectly placed [@@inlined]
attributes. The following should warn but doesn't:
# module M = struct type t end[@@inlined];;
module M : sig type t end
Module aliases with -no-alias-deps
were are now handled by the same code as this case, and so they have lost their warning.
The confusion of @xavierleroy and @nojb suggests that it would be worth taking a step back and:
|
I've pushed an implementation that handles this case properly (tested using this tweaked branch, which allows functor aliases in signatures). Using the template: module G
(Inputs : sig
module type S
module type Res
module F : functor (X : S) -> Res
end)
(X : Inputs.S)
(Arg : sig
module A = Inputs.F(X)
module B = Inputs.F(X)
module C = B
module D == B
module Bar (Y : sig module M = A end) : sig type t end
module Baz : sig module M = B end
type t = Bar(Baz).t
end) = Arg;; and modifying
@garrigue I'm fairly confident: if we have an absent alias to a (non-functor) path I'm not sure which example you're referring to, so I'm not clear on whether there are still edge-cases we're missing. |
This ensures that we correctly reject e.g. M = G(A) ((_ : sig module N = F(G(A)) end) : sig module N = F(M))
Pushed another small tweak to make sure that module G
(Inputs : sig
module type S
module type S2
module type Res
module F : functor (X : S) -> Res
module G : functor (X : S2) -> S
end)
(Y : Inputs.S2)
(Arg : sig
module X = Inputs.G(Y)
module A = Inputs.F(X)
module B = Inputs.F(X)
module C = B
module D == B
module Bar (Y : sig module M = Inputs.F(X) end) : sig type t end
module Baz : sig module M = Inputs.F(Inputs.G(Y)) end
type t = Bar(Baz).t
end) = Arg;; also doesn't typecheck. Edit:
With this, the semantics for present aliases is that they indicate that the modules have the same provenance -- ie. they were created by combining the same underlying modules, optionally including functor applications (although not enabled in this PR), and the subtyping rule lets any of the underlying modules be weakened to the functor application that constructed it. |
Your tweak is just trying to go back to the approach of attempting to keep track of functor arguments and treat them specially. It will hit all the same issues as that system, and is the very thing we are trying to move away from. The issue here is a fundamental one, not an artifact of the particular choice of normalization algorithm. Whenever we try to use typing rules that are not stable under substitution we end up making something that is fragile and complex, I'm really keen to not make that mistake again. |
I suspect we have different goals here: I want to support functor aliases, and I want a module type system that still makes sense once they're there. Personally, I'm not trying to move away from anything. Perhaps we're at an impasse. For me as a user of OCaml, it's not acceptable to have a module type system that confuses the locations of mutable data. I strongly don't want anything that will cause this to compile: module F(A :S) = struct let x = ref A.x end
module M = F(A)
module N = F(A)
module Do_thing_F (A : sig module N = N end) = struct
let do_thing () = incr A.N.x
end
module Do_thing = Do_thing_F(struct module N = M end)
let res = Do_thing.do_thing (); !N.x
I have sympathy for this, but I think building a slightly more complex system that supports functor aliases and (ascribed) functor arguments, will end up more maintainable. Concretely: once we have them, we can add tests for the interactions of these in all the various substitution kinds, and then the tests will let us know if something is awry. At the moment, things feel far more fragile because we don't have good ways to test the interactions, and so we have to be cautious that they don't leak in through some edge case. |
Having said that the issue with that approach is fundamental, there is a fundamental change which Jacques has previously suggested that would allow it to work. It would also fix the other surprising behaviours that you can get from impure applicative functors like my The change would be to disallow functor applications from the arguments of other functor applications in paths. So types like module M = Set.Make(Choose_an_ordering(X)) it would be type-checked the same as: module M = Set.Make(struct include Choose_an_ordering(X) end) is now. Similarly, module type expressions of the form: S with module Foo := Bar(Baz) would be banned. Using nested functor applications in paths like this is probably quite rare, but it would definitely require checking how much of opam was broken by the change. I think it might also technically decrease the theoretical expressivity of the module system. For example, it is not clear that my implementation of Girard's paradox would still be possible. In an ideal world, I think we would instead require that applicative functors be pure, and then not have to worry about any of this. Possibly that will be feasible once there is an effect system. |
I think your example above says more about the hazards of un-signalled mutability than anything about the module system. I'm not sure it's something that we need to fundamentally alter the module system to fix. To my mind
|
It's not really about making a slightly more complex system. The issue is that I don't believe it is possible to build a system that distinguishes functor parameters and functor applications that isn't broken. Certainly all previous attempts have been. The idea runs so counter to the underlying ideas behind the module system that there just always ends up being some way around it. To avoid this I think that whatever rules that we use for functor applications will need to be the rules for all present aliases. What happens if you apply the rules you are proposing for aliases containing functor applications to all present aliases? My understanding is that it would basically mean that this would be an error: module M = struct end
module N = M
module Foo : sig
module O = N
end = struct
module O = M
end whilst this would work fine: module M = struct end
module N = M
module Foo : sig
module O = M
end = struct
module O = N
end That's a bit odd, but I think it would probably work fine. It would avoid all the issues with distinguishing functor applications, whilst also ensuring that |
I believe that it ought to be possible, and that we're getting closer to that system with the current version of this PR. I realise that this isn't a particularly useful sentiment, so perhaps I can propose a way forward: If I can
does this leave us in a place where we can consider merging this? Of course, I would rather avoid 'completing the journey' before moving forward with this PR, but maybe it's what needed to figure out if this additional complexity in service of an impossible goal. |
I think I would prefer exploring the suggestion in my previous message first. It's a much simpler approach and I think it addresses all the concerns raised so far. In some sense it is the correct approach if you interpret |
I'm not keen on this suggestion, on the grounds that it breaks existing code if we can only 'go forward' through aliases. If we were starting from scratch, it might be less objectionable. It also leads to some awkwardness with short aliases, e.g. module F (X : sig module M = Foo.Bar.Baz (* ... *) end) = struct (* ... *) end
(* ... *)
module Baz = Foo.Bar.Baz
(* ... *)
module Res = F(struct module M = Baz (* ... *) end) (* :'( *) For now, I'm going to continue to work through the direction I proposed above. |
Design is still in flux, not going to merge the version that was reviewed.
@mrmr1993 On way to understand what you are saying is that you define the subtyping and equivalence relations in terms of expansion.
With this definition, my counter-example does not apply anymore, and I think that you get more or less what you want. Building the correct coercions can still be complicated, and the subtyping relation may be restrictive and hard to check. |
At this point I think it would be useful for me to summarise my OCaml allows impure applicative functors. This means that paths can be As long as you're only projecting types out of these impure module paths
I can see three basic solutions here:
Note that 2 and 3 do not address the issue with functor arguments, which Any solution other than those 3 will amount to a sort of effect system The existing checks for functor arguments when making absent aliases is Personally, I suspect that anything beyond the "global paths" approach |
The summary of @lpw25 is very clear, and its problem analysis corresponds to my non-expert understanding of the situation as I read this surprisingly-intricate thread. I have the same general sentiment that impure applicative functors are known to introduce issues in the rest of the system, and that the fact that they interact badly with a new feature is something we should be willing to accept (instead of making the other features more complex to try to work around impure-applicative problems). @mrmr1993 has an example in #10673 (comment) that he finds "not acceptable", but to me it is not clearly more or less acceptable than other known-issues with impure applicative functors, which do weaken some expected abstraction properties when they are used. In either case, type-safety is preserved, some other form of reasoning is made invalid, and it's not clear that this is hurting users in practice -- somehow people tend to not define functors that generate observable mutable state and apply it several times and do strange things with the resulting modules. Thinking out loud: when I read @lpw25 description, one relatively natural idea is to ask for (2), with some approximate analysis of module expression purity: |
I'm still not keen on any of these, I'm afraid. I've been imagining a slightly more complex solution, with a few basic rules, in order of applicability:
Then,
With these rules and checks, we always have well-behaved module aliases AFAICT. This does, of course, require ascription to be included in this PR.
@lpw25 for your first example, I still maintain that the typing is correct, just that the behaviour is confusing. The types really are the same. That we want the types to depend on the value of To your second, I think the proposal above should handle this correctly. |
Your description seems to be relying on checking properties during functor application and similar. In general, you can't rely on such checks to preserve well-formedness. Abstraction allows the issues with a particular functor application to become apparent only later on. For example, something like the following appears to be allowed by your system: module Foo (X : Set.S) = struct
module type T = sig
module A == X
end
end
module Bar (F : S -> sig module type T end) = struct
module type S = sig
module N : F(Set.Make(Int)).T
end
end
module Baz(Y : Bar(Foo).S) = struct
let x = Y.N.A.empty
end The fundamental issue is that the type of a functor's argument needs to contain all information for deciding if a particular functor application is valid or not. Trying to use additional information from the return type of the functor doesn't really work.
I disagree: the whole point of applicative functors is to allow types to depend on values. That's why we have
Yeah, I think if you are going to treat some module definitions as pure and some as impure then you will need to reflect that purity into their module types or their bindings so that users can abstract them -- maybe something along the lines of this paper. I think that the complexity of such a system is probably not worth it. |
The functor application
I model functors differently: they're a tool for abstracting code over some interface (edit: and types over types in that interface). If we want the same for types -- which, as you say, we don't have in practice -- we should amend the type system to allow modules as the arguments to types. As a point of information: I use impure functors to manage some state which depends on the types in the functor arguments, and to perform some type-dependent initialisation. I would strongly prefer that we don't break that production code by making all functors pure. I have also suffered with mixing of values from different instantiations of these functors, but it's not useful for me to parameterise the type over the arguments, because the backing state may still differ. If, however, we allow the type system to take arguments from the module system, then I can correctly express that the types depend on state from the particular functor application.
I would love to revisit this when we have some form of typed effects, but I agree that it's not worth pursuing for now. |
In order to reject this application you need to re-expand and recheck the functor application It is not clear from your description whether the following is allowed. module M = struct let x = Random.int 10 end
module N = M
module Res : sig
module O = N
end = struct
module O = M
end Since I can't see any mention of functor bodies in your description, I assume that either the above example isn't allowed, or the following example is: module Func (X : sig end) : sig
module M : sig val x : int end
module type S
module N = M
module Foo (Y : S) : sig module O = N end
module Bar : S
end = struct
module M = struct let x = Random.int 10 end
module N = M
module type S = sig module O = M end
module Foo (Y : S) : sig
module O = N
end = struct include Y end
module Bar = struct
module O = M
end
end
module Arg = struct end
module A = Func(Arg)
module B = Func(Arg)
module C : sig module O = A.N end = A.Foo(B.Bar)
assert (C.O.x == A.N.x) (* sometimes fails *) Of course, if the first example isn't allowed then we are basically back to option 2 from my list. |
Apologies for not detailing this in the description. My intention is to use a two-stage normalisation procedure, where first you normalise to a concrete module (if possible), and then you continue expanding to find the constructive path. Then, after normalisation, equal concrete paths are equal, equal constructive paths are equal, and a concrete path is included in its constructive path (if there was no corresponding concrete path). You can find that implemented in this PR here. Under that scheme,
While this is true, we can cheaply mark functor arguments as In fact, you could imagine making these user-visible, and then you could declare |
|
Ah, yes. I'm sure it sounds like I keep proposing language features to fix issues, but perhaps we need a concept of private signatures, so that module Func (X : sig end) : sig
module M : sig val x : int end
module type S = private
module N = M
module Foo (Y : S) : sig module O = N end
module Bar : S
end and its instantiations get types module A : functor (X : sig end) -> sig
module M : sig val x : int end
module type S = private Func(X).S
module N = M
module Foo (Y : S) : sig module O = N end
module Bar : S
end where I could imagine this might be useful more generally, to say that e.g. a functor only accepts modules that were output by another functor, or only from an explicit set of modules, but still revealing the module contents. |
Private module types are potentially useful. Note that the more general version that allows any subtype of the specified type is quite difficult to implement because the private module type needs to contain the required coercion at runtime. I don't really see how it helps with my example. I don't think you can just treat: module type S = E as if they had written: module type S = private E based on the content of After further consideration, I'm no longer sure that my option 2, and by extension your proposal, can really work for preventing non-physically-equal module aliases. It tries to work by constraining normalization in the definitions of module aliases, but aliases are not the only form of equality available to modules. For example, the following example would create non-physically-equal module aliases without doing any normalization in the definitions of module aliases: (* Make an impure functor and two aliases to its output *)
module type Int = struct val x : int end
module Impure (X : sig end) : Int = struct let x = Random.int 100 end
module Unit = struct end
module A = Impure(Unit)
module B = Impure(Unit)
(* Make a Leibniz equality between A and B *)
module type Type = sig module type T end
module type Pred_Int = Int -> Type
module Equal_A_B (P : Pred_Int) (X : P(A).T) : P(B).T = X
module Alias_to (X : Int) = struct
module type T = sig module M = X end
end
module Alias_to_A : Alias_to(A).T = struct
module M = A
end
module M : Alias_to(B).T = Equal_A_B(Alias_to)(Alias_to_A) At this point I think that option 3 is the only viable option (with option 1 only possible in some distant future). |
That wasn't my intention; instead, if they write
Your example would have been prevented by this rule: casting |
Ok, so you are proposing to ban abstraction for module types that meet some criteria. What is the criteria precisely? Preventing module type abstraction isn't sufficient. Here's the same example without using module type abstraction: (* Make a Leibniz equality between A and B *)
module type Type = sig type t end
module type Pred_Int = Int -> Type
module Equal_A_B (P : Pred_Int) = struct
let coerce (x : P(A).t) : P(B).t = x
end
module Alias_to (X : Int) = struct
module type T = sig module M = X end
type t = (module T)
end
module Coerce = Equal(Alias_to)
module M = (val Coerce.coerce (module struct module M = A end) : Alias_to(B).T) |
Here's another example that causes trouble for option 2: module type Int = struct val x : int end
module Impure (X : sig end) : Int = struct let x = Random.int 100 end
module Unit = struct end
module A = Impure(Unit)
module B = Impure(Unit)
module Coerce(F : Int -> sig module N : Int end) : sig
module M = F(A)
end = struct
module M = F(B)
end
module Id (X : Int) = struct
module N = X
end
module O : sig
module M : sig
module N = A
end
end = Coerce(Id) which only requires normalization of "constructive" paths in the parameters of aliases to other constructive paths. |
That they contain a present alias to some module instantiated within the functor that they also are a part of. The rule is slightly more subtle than banning abstraction; I'd like to let them be abstract, but require an explicit coercion to expand them to their constructive path. Specifically, these
I should have thought of this 🤦♂️ This is effectively the same but using a type; we could do the same for types, although this has some weirder interactions that I need to think through properly.
It seems you're still considering these options, so I'll detail some concerns
This is a step backwards in functionality, and breaks existing programs. The 'correct' way to do this is with effects, but we seem not to be close to having a system that can track these in the ways we would need.
This is a disaster for code that uses e.g. an module Foo_ = Foo
open Bar
let x : Baz.F(Foo_).t = Bax.x
This causes the module system to confuse the origin of values, allowing the user to write code that does module T : sig module M = N end = ...
let () = T.M.x := 1 where the value modified isn't |
This PR changes the default behaviour of module aliases to make all modules present (
Mp_present
) by default. This will allow us to support functor aliases, and will simplify the implementation of transparent ascription (previous implementation in PR #10612).The changes here are based on the discussion in the comments of #10435.
With this PR odule aliases (
module Foo = Bar
) are now always 'present', in the sense thatBar
can be accessed via the aliasFoo
at runtime, instead of transparently rewritingFoo
toBar
in the codegen logic.Breaking changes
module rec
s which contain module aliases. For example:Stdlib
,Stdlib__Int
,Parsetree
, etc.) are still treated as absent aliases when--no-alias-deps
is true. This carve-out is deliberately limited to avoid doing anything overly complex, but still allowing dune-style 'library' modules, which declare these aliases before the aliased modules are available and would be broken otherwise.--no-alias-deps
, but it is no longer possible to alias any submodule.Notes for reviewers
typemod.ml
, focused on passing around theabsent_globals
state, and using this state to determine whether aliases should be present or absent.Mtype.make_aliases_absent
has been replaced withmake_aliases_present
, used to convert the module type returned bymodule type of
to one with all present aliases, if it is being used as part of a named signature.Includemod.modtypes
now places both modtypes in the environment, so that twoMty_alias
es can be expanded correctly to build a coercion between them, now that they may have incompatible runtime layouts.typing-modules/aliases.ml
has been added to test the above, extending the previous test to make sure that child modules also behave correctly.