Unnecessarily caml_call_gc is called each time in a loop #12864

Closed
ytomino opened this issue Dec 26, 2023 · 2 comments

Comments

ytomino commented Dec 26, 2023

In OCaml 4.13 and later, including the latest 5.1.1, an unnecessary call to caml_call_gc is inserted inside loops, both for-loops and recursive functions.

For example:

let f s = (
  let result = ref 0 in
  for i = 0 to String.length s do
    result := !result + 1
  done;
  !result
)

Note that ocamlopt can eliminate the allocation of a ref that does not escape the function.
The trigger seems to be code that might have performed such an allocation but, after optimization, does not, as in this example.
Inlined and removed closures are another case.

Edited: The trigger described above was my misunderstanding. The exact trigger is explained at #10039 (comment).

Compiling this with OCaml 4.12.1 on amd64 Linux gives:

camlUnnecessarygc__f_5:
	.cfi_startproc
.L102:
	movq	%rax, %rbx
	movl	$1, %eax
	movq	-8(%rbx), %rdi
	shrq	$10, %rdi
	leaq	-1(,%rdi,8), %rdi
	movzbq	(%rbx,%rdi), %rbx
	subq	%rbx, %rdi
	leaq	1(%rdi,%rdi), %rbx
	movl	$1, %edi
	cmpq	%rbx, %rdi
	jg	.L100
.L101:
	addq	$2, %rax
	movq	%rdi, %rsi
	addq	$2, %rdi
	cmpq	%rbx, %rsi
	jne	.L101
.L100:
	ret
	.cfi_endproc

Up to and including this version, the compiler emitted efficient code for loops like this.

OCaml 4.13.1:

camlUnnecessarygc__f_5:
	.cfi_startproc
	subq	$8, %rsp
	.cfi_adjust_cfa_offset 8
.L102:
	movq	%rax, %rbx
	movl	$1, %eax
	movq	-8(%rbx), %rdi
	shrq	$10, %rdi
	leaq	-1(,%rdi,8), %rdi
	movzbq	(%rbx,%rdi), %rbx
	subq	%rbx, %rdi
	leaq	1(%rdi,%rdi), %rbx
	movl	$1, %edi
	cmpq	%rbx, %rdi
	jg	.L100
.L101:
	addq	$2, %rax
	movq	%rdi, %rsi
	addq	$2, %rdi
	cmpq	%rbx, %rsi
	je	.L100
	cmpq	(%r14), %r15
	ja	.L101
	jmp	.L103
.L100:
	addq	$8, %rsp
	.cfi_adjust_cfa_offset -8
	ret
	.cfi_adjust_cfa_offset 8
.L103:
	call	caml_call_gc@PLT
.L104:
	jmp	.L101
	.cfi_adjust_cfa_offset -8
	.cfi_endproc

As you can see, %r15 is compared against (%r14) on every pass through the loop, and caml_call_gc may be called depending on the result, even though the loop itself never uses them.
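To make the control flow of the 4.13.1 listing easier to follow, here is a toy model of it in OCaml. It is only a sketch: `young_ptr`, `young_limit`, `gc_calls`, `poll`, and `f_model` are hypothetical names, and reading %r15 as the allocation pointer and (%r14) as the limit it is compared against is my assumption about the amd64 backend, not something stated in this issue.

```ocaml
(* Toy model of the per-iteration check; this is NOT the real runtime. *)
let young_ptr = ref 100    (* assumed stand-in for %r15 *)
let young_limit = ref 0    (* assumed stand-in for the word at (%r14) *)
let gc_calls = ref 0       (* counts simulated calls to caml_call_gc *)

(* Models "cmpq (%r14), %r15; ja .L101; jmp .L103":
   keep looping while young_ptr > young_limit, otherwise call the GC. *)
let poll () =
  if not (!young_ptr > !young_limit) then incr gc_calls

(* Same shape as the loop at .L101: increment, test for exit, then poll. *)
let f_model s =
  let n = String.length s in
  let result = ref 0 in
  let i = ref 0 in
  if !i <= n then begin
    let running = ref true in
    while !running do
      incr result;
      let prev = !i in
      incr i;
      if prev = n then running := false   (* je .L100 *)
      else poll ()                        (* cmpq / ja / jmp .L103 *)
    done
  end;
  !result
```

In this model the check costs one comparison on each iteration except the last, and the simulated GC call happens only when the limit has been raised above the pointer.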

OCaml 5.1.1:

camlUnnecessarygc.f_5:
	.cfi_startproc
	subq	$8, %rsp
	.cfi_adjust_cfa_offset 8
.L102:
	movq	%rax, %rbx
	movl	$1, %eax
	movl	$1, %edi
	movq	-8(%rbx), %rsi
	shrq	$10, %rsi
	leaq	-1(,%rsi,8), %rsi
	movzbq	(%rbx,%rsi), %rbx
	subq	%rbx, %rsi
	leaq	1(%rsi,%rsi), %rbx
	cmpq	%rbx, %rdi
	jg	.L100
.L101:
	addq	$2, %rax
	movq	%rdi, %rsi
	addq	$2, %rdi
	cmpq	%rbx, %rsi
	je	.L100
	cmpq	(%r14), %r15
	ja	.L101
	jmp	.L103
.L100:
	addq	$8, %rsp
	.cfi_adjust_cfa_offset -8
	ret
	.cfi_adjust_cfa_offset 8
.L103:
	call	caml_call_gc@PLT
.L104:
	jmp	.L101
	.cfi_adjust_cfa_offset -8
	.cfi_endproc

Same as 4.13.1.

It can also be reproduced with recursive functions.
In particular, it applies to most nested functions defined for recursion, even if they are inlined.
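As an illustration of the recursive case, here is a sketch of the same computation written with a nested tail-recursive helper. The names `f_rec` and `loop` are hypothetical, and I have not included the generated assembly; whether the check appears can be inspected by compiling with `ocamlopt -S`.

```ocaml
(* Recursive counterpart of f: counts from 0 to String.length s inclusive.
   The nested function allocates nothing, like the for-loop version. *)
let f_rec s =
  let n = String.length s in
  let rec loop i acc =
    if i > n then acc else loop (i + 1) (acc + 1)
  in
  loop 0 0
```

Like `f`, this returns `String.length s + 1`, so the two versions can be compared directly.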

Is this the intended behavior?

Contributor

lthls commented Dec 26, 2023

Yes, this is intended. Check #10039 for the motivation behind this.

Author

ytomino commented Dec 26, 2023

Thank you for your quick response!

I'm reading it now. My impression so far is that, if the purpose is to check for interruptions, it is somewhat overdone.

I understand that it is as intended.

ytomino closed this as completed Dec 26, 2023