Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ext/bcmath: Avoid unnecessary memset #14180

Merged
merged 2 commits into from
May 16, 2024

Conversation

SakiTakamachi
Copy link
Member

@SakiTakamachi SakiTakamachi commented May 8, 2024

0 padding on the right is not required for calc. This is because when converting back to string, zend_string is padded with 0.

*sptr++ = BCD_CHAR(0);

(I will follow up for add later.)

benchmark: #14132

before

// 1
Time (mean ± σ):     551.0 ms ±   8.4 ms    [User: 546.6 ms, System: 3.7 ms]
Range (min … max):   536.9 ms … 563.4 ms    10 runs

// 2
Time (mean ± σ):     610.8 ms ±  16.4 ms    [User: 607.4 ms, System: 2.6 ms]
Range (min … max):   599.4 ms … 653.7 ms    10 runs

// 3
Time (mean ± σ):     707.8 ms ±  21.3 ms    [User: 704.4 ms, System: 2.6 ms]
Range (min … max):   681.0 ms … 754.1 ms    10 runs

after

// 1
Time (mean ± σ):     523.3 ms ±   3.9 ms    [User: 519.3 ms, System: 3.2 ms]
Range (min … max):   518.5 ms … 529.4 ms    10 runs

// 2
Time (mean ± σ):     591.5 ms ±  13.4 ms    [User: 587.8 ms, System: 2.9 ms]
Range (min … max):   575.3 ms … 620.9 ms    10 runs

// 3
Time (mean ± σ):     685.9 ms ±  10.2 ms    [User: 681.8 ms, System: 3.3 ms]
Range (min … max):   668.6 ms … 696.7 ms    10 runs

@nielsdos
Copy link
Member

nielsdos commented May 8, 2024

Intuitively, this speed gain feels strange because memset only takes up 0.35% of runtime on my machine for 1.php for example. Furthermore, I actually see a slowdown when applying this patch on current master:

Benchmark 1: ./sapi/cli/php 1.php
  Time (mean ± σ):     436.2 ms ±   6.2 ms    [User: 432.7 ms, System: 2.8 ms]
  Range (min … max):   430.3 ms … 450.7 ms    10 runs
 
Benchmark 2: ./sapi/cli/php_old 1.php
  Time (mean ± σ):     418.8 ms ±   8.3 ms    [User: 415.5 ms, System: 2.5 ms]
  Range (min … max):   412.3 ms … 440.9 ms    10 runs
 
Summary
  ./sapi/cli/php_old 1.php ran
    1.04 ± 0.03 times faster than ./sapi/cli/php 1.php


Benchmark 1: ./sapi/cli/php 2.php
  Time (mean ± σ):     499.6 ms ±   3.1 ms    [User: 496.0 ms, System: 2.7 ms]
  Range (min … max):   494.7 ms … 503.3 ms    10 runs
 
Benchmark 2: ./sapi/cli/php_old 2.php
  Time (mean ± σ):     475.8 ms ±   3.4 ms    [User: 473.5 ms, System: 1.4 ms]
  Range (min … max):   472.0 ms … 480.1 ms    10 runs
 
Summary
  ./sapi/cli/php_old 2.php ran
    1.05 ± 0.01 times faster than ./sapi/cli/php 2.php

I see a similar slowdown for your old benchmark script with microtime.
It doesn't make sense that the execution becomes slower, I think something deeper is going on especially as you see the opposite effect...

@SakiTakamachi
Copy link
Member Author

Thanks. It seems the measurements on my machine are unreliable....

I will try to prepare a stable measurement environment such as EC2.

@SakiTakamachi
Copy link
Member Author

@nielsdos
I merged the latest master and then compared master and this branch on EC2.

master:

hyperfine "php 1.php" --warmup 10
Time (mean ± σ):     654.7 ms ±   3.0 ms    [User: 650.5 ms, System: 2.6 ms]
Range (min … max):   650.1 ms … 659.2 ms    10 runs

hyperfine "php 2.php" --warmup 10
Time (mean ± σ):     769.4 ms ±   5.4 ms    [User: 765.8 ms, System: 2.0 ms]
Range (min … max):   762.6 ms … 781.8 ms    10 runs

hyperfine "php 3.php" --warmup 10
Time (mean ± σ):     910.3 ms ±  13.3 ms    [User: 905.6 ms, System: 2.8 ms]
Range (min … max):   896.8 ms … 934.6 ms    10 runs

php old.php // my old bench
1.6298861503601
1.9048039913177
2.2188358306885

this branch:

hyperfine "php 1.php" --warmup 10
Time (mean ± σ):     605.0 ms ±   2.5 ms    [User: 601.5 ms, System: 2.0 ms]
Range (min … max):   601.3 ms … 609.0 ms    10 runs

hyperfine "php 2.php" --warmup 10
Time (mean ± σ):     717.5 ms ±   5.0 ms    [User: 713.1 ms, System: 2.7 ms]
Range (min … max):   712.5 ms … 730.7 ms    10 runs

hyperfine "php 3.php" --warmup 10
Time (mean ± σ):     863.4 ms ±   9.3 ms    [User: 858.7 ms, System: 2.6 ms]
Range (min … max):   851.4 ms … 878.6 ms    10 runs

php old.php // my old bench
1.5056660175323
1.7816500663757
2.116672039032

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, fine by me then.
Something influences my results and I'm not sure what. Might be caches or micro-architectural stuff.

@SakiTakamachi
Copy link
Member Author

Is the compiler gcc?

I am compiling both with gcc on ubuntu. It might be worth checking if a different assembly is being generated.

@nielsdos
Copy link
Member

nielsdos commented May 9, 2024

I'm using gcc 13

@SakiTakamachi
Copy link
Member Author

My machine is 9, but the EC2 is 13.

Could you please send me the assembler code later when you have time? I'd like to compare it with mine.

@nielsdos
Copy link
Member

nielsdos commented May 9, 2024

Could you please send me the assembler code later when you have time? I'd like to compare it with mine.

This is the assembly for this PR's doaddsub.s: https://gist.github.com/nielsdos/551b59b07a25df46b8416edc31f89a73
This is the assembly for master's doaddsub.s: https://gist.github.com/nielsdos/5a6200fdb5173c391c23a03c0f7a7dbc

However, I'm not so sure the problem is the assembly, I think it's some CPU magic (e.g. microarchitecture or cache state or something like that) that is coincidentally influenced with this change.

@SakiTakamachi SakiTakamachi merged commit 959ea5f into php:master May 16, 2024
10 checks passed
@SakiTakamachi SakiTakamachi deleted the refactor_bcmath_do_sub_2 branch May 16, 2024 23:03
SakiTakamachi added a commit that referenced this pull request May 17, 2024
Apply the same changes as #14180 to _bc_do_add.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants