Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add COALTON:DEFINE-INLINE to define inline functions and methods #1029

Open
wants to merge 8 commits into
base: main
Choose a base branch
from

Conversation

digikar99
Copy link

@digikar99 digikar99 commented Oct 20, 2023

This PR is a sketch to add inlinable functions and methods to coalton.

The intended use is as follows:

(coalton-toplevel
  (declare foo (single-float -> single-float))
  (inline) ; <--- THIS ATTRIBUTE
  (define (foo x)
    (lisp single-float (x) (cl:+ x x))))

(coalton-toplevel
  (define (foo-caller x)
    (foo x)))

Without foo being inline, the disassembly of foo-caller would have included a call to foo as follows:

COALTON-USER> (disassemble 'foo-caller)

; disassembly for FOO-CALLER
; Size: 53 bytes. Origin: #x540AAB0D                          ; FOO-CALLER
; 0D:       498B4510         MOV RAX, [R13+16]                ; thread.binding-stack-pointer
; 11:       488945F8         MOV [RBP-8], RAX
; 15:       4883EC10         SUB RSP, 16
; 19:       488BD6           MOV RDX, RSI
; 1C:       B902000000       MOV ECX, 2
; 21:       48892C24         MOV [RSP], RBP
; 25:       488BEC           MOV RBP, RSP
; 28:       E8D54F33FC       CALL #x503DFB02                  ; #<FDEFN FOO>
; 2D:       480F42E3         CMOVB RSP, RBX
; 31:       488B75F0         MOV RSI, [RBP-16]
; 35:       80FA19           CMP DL, 25
; 38:       7503             JNE L0
; 3A:       C9               LEAVE
; 3B:       F8               CLC
; 3C:       C3               RET
; 3D: L0:   CC55             INT3 85                          ; OBJECT-NOT-SINGLE-FLOAT-ERROR
; 3F:       08               BYTE #X08                        ; RDX(d)
; 40:       CC10             INT3 16                          ; Invalid argument count trap

But now that foo can be inlined, the disassembly of foo-caller is:

COALTON-USER> (disassemble 'foo-caller)

; disassembly for FOO-CALLER
; Size: 31 bytes. Origin: #x540AAE7F                          ; FOO-CALLER
; 7F:       498B4510         MOV RAX, [R13+16]                ; thread.binding-stack-pointer
; 83:       488945F8         MOV [RBP-8], RAX
; 87:       0F28D1           MOVAPS XMM2, XMM1
; 8A:       F30F58D1         ADDSS XMM2, XMM1
; 8E:       660F7ED2         MOVD EDX, XMM2
; 92:       48C1E220         SHL RDX, 32
; 96:       80CA19           OR DL, 25
; 99:       C9               LEAVE
; 9A:       F8               CLC
; 9B:       C3               RET
; 9C:       CC10             INT3 16                          ; Invalid argument count trap

This not only extends to regular functions, but also to class method instances. Effectively, this brings us inlinable static dispatch with zero runtime overhead for calling simple functions.


In an earlier PR, it was suggested that a coalton:inline declaration may be added to define inline functions and methods. I felt a slight inclination towards a coalton:define-inline form instead of a coalton:inline declaration. Nonetheless, if a declaration seems better, I will attempt to modify the PR accordingly. I am adding it as an attribute for the moment instead of any separate toplevel form. If someone has a preference for a toplevel form, they can always write a macro. The case for converting it into a CL-like declaration for more granular control remains.

This is still a sketch since I haven't looked over every case carefully. I'd be also be delighted if this facility could be achieved by more minimal changes.

@stylewarning
Copy link
Member

I haven't reviewed but I just wanted to leave a few pre-comments.

  1. Amazing job diving in and making such a broad change. That's a lot of work, and I definitely want to see this PR go through.

  2. I suspect we will want a few declarations. One thing we want to do that CL doesn't do is heuristic inlining. I don't know what design we'd choose but things like small lambdas that don't escape or unexported functions might be heuristically inlined by default. (A fully fledged heuristic inliner isn't necessary for this PR, but probably the groundwork for making it happen is.)

Excited to see this. Inlining will be such a huge performance lever in Coalton that could transform it from being "inadequate" to being "faster than Lisp by default".

@digikar99
Copy link
Author

digikar99 commented Oct 21, 2023

Thank you for the pre-review! I'm still thinking what the right thing to do would be. But yes, I definitely want to see inlining in coalton.

One main concern (EDIT: Fixed below using a node-let) I have with this approach is that it depends on the implementation to perform the inlining - see the following changes in src/codegen/program.lisp :

   (loop :for (name . node) :in bindings
         :if (node-abstraction-p node)
           :append (list
                    (if (node-abstraction-inline-p node)
                        `(declaim (inline ,name))
                        `(declaim (notinline ,name)))
                    (compile-function name node env)
                    `(setf
                      ,name

This means that suppose coalton:inline and coalton:notinline were made similar to cl:inline and cl:notinline like declarations. In that case, granular inlining as in coalton::(let (...) (declare (not/inline ...)) ...) will need to emit (cl:declare (cl:not/inline ...)) declarations. I want to hear others' thoughts on this.

As I understand, avoiding this would require some more work, so that inlining can be performed within coalton compilation without depending on the implementation.

@stylewarning
Copy link
Member

We should definitely attempt to inline ourselves (at the AST level) so that further type and monomorphizer/specializer optimizations. Maybe in specific last-mile optimizations we would use the Lisp inliner. (I'm happy Lisp has inlining, but it's a very sharp and specific tool.)

@eliaslfox
Copy link
Collaborator

Hey, thanks for working on this.

Wether a function is marked inline should be tracked in the function-environment instead storing a flag on each ast node. The function environment is set for functions here. It looks like this isn't being set for method definitions, but it can be set in this loop.

I would prefer to use attributes instead of adding a new toplevel form:

;; toplevel functions
(inline)
(define (f x) ...)

;; methods
(define-instance (C T)
  (inline)
  (define (m x) ...))

The inliner should be written as a separate optimization pass instead of being merged with the inline-method pass. The purpose of the method inlining pass is to directly call methods when the type class instance being used is known at the call site.

Lastly, the inliner should generate fresh variable names instead of reusing the names of the functions's parameter names. This way we avoid issues with overlapping variable name. You can generate variable names with gentemp, and rewrite the variable references in an ast node with ast substutions.

@digikar99
Copy link
Author

Thanks for the extensive feedback! I will rework this over the next week or two and push the commits for further review.

Yes, now, I'm in favour of attributes over a new toplevel form. If a user wants, they can write a macro which expands into the attribute+form:

(defmacro define-inline (name &body body)
  `(progn
     (inline)
     (define ,name ,@body)))

However, in the longer run, if we wish to provide more granular SBCL-like control over inlining, we might need something akin to declarations.

(declaim (inline foo))
(defun foo (a)
  (+ a a))
(declaim (notinline foo))

(defun foo-inline-caller (b)
  (declare (inline foo))
  (foo b))

(defun foo-mixed-caller (b)
  (declare (inline foo))
  (+ (foo b)
     (locally (declare (notinline foo)) ; <------ Are coalton attributes suitable for this?
       (foo b))))

@digikar99
Copy link
Author

Wether a function is marked inline should be tracked in the function-environment instead storing a flag on each ast node. The function environment is set for functions here. It looks like this isn't being set for method definitions, but it can be set in this loop.

It looks like creating a function-inline-environment might be a better idea, both with respect to minimal code changes as well as the granular control over inlining as outlined above.

Without that, it seems we will need to change function-env-entry structure to include an inline-p slot in addition to the existing name and arity. Doing this requires changing all the places where the function-env-entry objects are being created to set the :inline-p slot. One such example is this update-function-env function in codegen/transformations.lisp. However, the value of inline-p itself requires more information. This information could either be obtained by adding a parameter inline-bindings to update-function-env itself, but this seems a bit unclean. Another way is to obtain this information from split-binding-definitions, but this requires information about inlining to be present in the AST itself, which I understand you prefer to be avoided.

Let me know if I should proceed by adding a function-inline-environment or if some other way is preferable.

@eliaslfox
Copy link
Collaborator

In every place set-function is called except in update-function-env whether the function should be inline is known at the call site. update-function-env can instead use a different env updater to set the arity without changing inline-p:

(define-env-updater set-function-arity (env symbol arity)
  (declare (type environment env)
           (type symbol symbol)
           (type fixnum arity))
  (let ((prev (lookup-function env symbol :no-error t)))
    (update-environment
     env
     :function-environment (immutable-map-set
                            (environment-function-environment env)
                            symbol
                            (make-function-env-entry
                             :name symbol
                             :arity arity
                             :inline-p (and prev (function-env-entry-inline-p prev)))
                            #'make-function-environment))))

@digikar99
Copy link
Author

Sorry, I've been busy. It's been great to see lots of activity on coalton recently!

I might be free to dive into this in detail in another month. Is there any suggestion to break the changes required for this into multiple parts? That way, each part will be easier to review and merge.

Copy link
Collaborator

@eliaslfox eliaslfox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey! Don't worry about breaking this up, it's pretty reviewable.

Sorry I dropped the ball on reviewing this.

This needs some tests, but other than that and a couple nits this looks good to merge.

(node-type node))
:value name)
:rands rands)))))))
;; FIXME: What kind of frontend code evokes DIRECT-APPLICATION
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pass turns regular calls into direct calls when it's known that regular function calls can be used.

(when (node-variable-p rator)
(let* ((name (node-variable-value rator))
(code (tc:lookup-code env name :no-error t))
;; FIXME: We need to lookup inline-p in the environment rather than the AST
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this FIXME is done


(assert (cst:consp form))

;; TODO: Parsing errors
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO

@@ -358,7 +360,9 @@
:if (not (zerop method-arity))
:do (setf env (tc:set-function env method-name (tc:make-function-env-entry
:name method-name
:arity method-arity)))
:arity method-arity
;; FIXME: Always declare methods to be inline?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should do this

@@ -334,7 +334,9 @@
:if (not (zerop class-arity))
:do (setf env (tc:set-function env codegen-sym (tc:make-function-env-entry
:name codegen-sym
:arity class-arity)))
:arity class-arity
;; FIXME: Under what circumstances would we want fundep to be inline?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Class constructor functions can't be inlined by the Coalton inliner because they don't appear in the code env.

@@ -532,7 +529,9 @@
:vars nil
:var-names nil
:body (list (util:runtime-quote (type-definition-runtime-type type)))))
:source source))
:source source
;; FIXME: Is NIL correct?
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes

(body (util:required 'body) :type node-body :read-only t))
(params (util:required 'vars) :type pattern-list :read-only t)
(body (util:required 'body) :type node-body :read-only t)
(inline-p (util:required 'inline-p) :type boolean :read-only t))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is unnecessary because it's tracked in the function environment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants