New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow more CSE of loads from immutable blocks #367
Conversation
(I should add that Leo's patch is a prerequisite for multicore.) |
Various remarks in random order:
|
Yes, this one should probably not be a prerequisite for flambda actually. Thanks for the other comments; I will make Cmmgen less conservative and get some numbers on other benchmarks. (I'm not sure that KB is within noise actually, although I take your point.) |
Oh, and as regards header words---I thought we couldn't assume that in the compiler due to Obj.truncate. |
Concerning Obj.truncate and CSE, it is a runtime function, not an inlined primitive, and we don't CSE across function calls anyway (because it hurts register allocation most of the time). Obj.set_field, on the other hand, is a lot more problematic... |
The unsafe cast for destination-passing lists in Batteries is used to coerce a variable from a mutable type to an immutable one. In particular, no value is created at an immutable type then coerced to mutable, so there is no read on the immutable value that could persist across a mutation (as no mutation are ever done after the conversion is made). There is another part of Batteries that coerces a variable at an immutable type to a mutable type, and then does a mutation. We are careful never to create value at the immutable type (they are create at the mutable type, then coerced). In any case, the soundness of this unsafe write is justified by the fact that it does not change the value semantically (it is a rebalancing in a splay tree), so in particular optimizing a read to a stale value (ignoring a rebalancing) is sound and correct, it is just less efficient in one particular case. (To avoid this, we would need to refine the analysis to preserve the immutable loads at a store, but invalidate those that can be proved to be at the same location; or to not propagate "immutable" type information if a value that can be proved equal is typed "mutable" in another part of the program. I don't know precisely which kind of type-directed propagation of mutability the compiler does today (as opposed to mutability of constructors or fields), there may be none at all.) |
This pull request has been asleep for more than a year. What do we do with it? I note that part of this PR (the mutability information over loads in Cmm) is already in 4.05. |
A very conservative version of the mutability-in-Cmm stuff is in 4.05, yes. I need to discuss a couple of things with people about the remainder of this patch and will update this pull request again shortly. |
@let-def and I discussed the interaction between this kind of CSE and TRMC. The conclusion was that it would be safe. I will clean up this patch to remove the parts that have been superceded by the mutability annotations now in trunk. As far as header words go: @stedolan would like to prohibit the changing of the tag on existing blocks. A new function would be provided (if I remember correctly) to copy a block and give the copy a new tag. @alainfrisch I seem to remember you make some use of tag changing: would such a function suffice for your purposes? If we could do this then I think we can CSE loads of header words. |
@alainfrisch Ping, on the tag changing thing above. |
Isn't it the case that since Anyway, it would have to use a function that copies a block and changes its tag, instead of set_tag, this would create some extra allocations in rather critical parts, but I don't want our use of undocumented hacks to stop progress, so please consider we are ok with the change. FWIW, our main use of set_tag is to set |
@alainfrisch Would it be acceptable for there to be a new primitive Obj.with_tag which copies a block and changes its tag, along with an optimisation that turns I care about this for multicore, where Obj.set_tag even less safe than it is currently, since other threads doing GC may access the tag in parallel. |
Don't worry about our use cases. As said, if what we do currently is no longer available, we will just add the ability to choose explicitly the tag when declaring the record type. We already maintain a custom version of the compiler and this extra patch seems tiny enough. |
Could we get an update on this PR? For the record, |
@mshinwell ping for benchmarks |
remove some unused rpc consumers
This PR appears to be stale since at least 2017 and has been independently reimplemented by #9562. I propose to close it in a few days, unless someone complains. |
This PR (#367) includes propagation of mutability information down to Cmm which was reimplemented later in #966 and merged in trunk. So, a large part of this PR is obsolete. #9562 starts with a Cmm contaning mutability information on loads, so it's a lot closer to the current code base. I move to close this PR (#367). |
Wish granted. |
* model the margin bottom parameter * remove webpack fs shim, not needed now
Signed-off-by: Christine Rose <professor.rose@gmail.com>
Morally speaking, this patch is small: it changes CSEgen so that values computed by loading from an immutable place are not killed by stores.
The specific near-term aim is to permit accesses to closure environments to be subject to CSE.
With flambda I measure this patch giving a 2.5% speedup on Knuth-Bendix, which is quite significant for a small change.
@lpw25 has a more complete implementation of the front-end portion of this patch on the multicore branch that, instead of adding Pimmutable_field, alters Pfield to always carry such information (together with a flag saying whether the value is always immediate). This could have fairly wide applicability, e.g. in reading from record fields, which more often than not are immutable. I would like to consider merging that patch in the future---I haven't got time right now---but I think this one is a reasonable step along the way.
(I will adjust the code for the other backends if this is accepted.)