Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Validate memory_limit option #56

Open
wants to merge 9 commits into
base: 2.4.x
Choose a base branch
from

Conversation

leuchtdiode
Copy link

@leuchtdiode leuchtdiode commented Mar 17, 2024

Q A
Bugfix yes
BC Break no

Description

Passing invalid options via MemoryOptions#setMemoryLimit or via MemoryOptions#__construct, the actual memory_limit option is not properly validated.

…ormalizing memoy value; Added tests for setMemoryOptions()
src/MemoryOptions.php Outdated Show resolved Hide resolved
src/MemoryOptions.php Outdated Show resolved Hide resolved
src/MemoryOptions.php Outdated Show resolved Hide resolved
@boesing
Copy link
Member

boesing commented Mar 21, 2024

I am not really concerned about this change. This introduces handling for invalid memory_limit values.
I tested this on PHP 8.3 and even PHP itself is yelling at me:

→ php -dmemory_limit=M256 -i | grep memory_limit
Using php v8.3.2
PHP Warning:  Invalid "memory_limit" setting. Invalid quantity "M256": no valid leading digits, interpreting as "0" for backwards compatibility in Unknown on line 0
PHP Warning:  Failed to set memory limit to 0 bytes (Current memory usage is 2097152 bytes) in Unknown on line 0
Warning: Invalid "memory_limit" setting. Invalid quantity "M256": no valid leading digits, interpreting as "0" for backwards compatibility in Unknown on line 0
memory_limit => 128M => 128M

I would instead enhance the documentation to outline that in case that a string value, passed as memory_limit option to the adapter, should properly support PHPs internal memory declaration structure.

Do we have other opinions here? I mean, there was already quite some discussion.

@leuchtdiode
Copy link
Author

I am not really concerned about this change. This introduces handling for invalid memory_limit values. I tested this on PHP 8.3 and even PHP itself is yelling at me:

→ php -dmemory_limit=M256 -i | grep memory_limit
Using php v8.3.2
PHP Warning:  Invalid "memory_limit" setting. Invalid quantity "M256": no valid leading digits, interpreting as "0" for backwards compatibility in Unknown on line 0
PHP Warning:  Failed to set memory limit to 0 bytes (Current memory usage is 2097152 bytes) in Unknown on line 0
Warning: Invalid "memory_limit" setting. Invalid quantity "M256": no valid leading digits, interpreting as "0" for backwards compatibility in Unknown on line 0
memory_limit => 128M => 128M

I would instead enhance the documentation to outline that in case that a string value, passed as memory_limit option to the adapter, should properly support PHPs internal memory declaration structure.

Do we have other opinions here? I mean, there was already quite some discussion.

I'm with you at this. Why would someone expect a format to work which is no standard anywhere?

@boesing
Copy link
Member

boesing commented Mar 21, 2024

I'm with you at this. Why would someone expect a format to work which is no standard anywhere?

But that is what you are trying to support with this PR, no?
At least from what I can see in the test, you want to allow M256 as a valid memory limit. If that is not what you try to achieve, why are these changes needed in the first place? 😅

@leuchtdiode
Copy link
Author

leuchtdiode commented Mar 21, 2024

I'm with you at this. Why would someone expect a format to work which is no standard anywhere?

But that is what you are trying to support with this PR, no? At least from what I can see in the test, you want to allow M256 as a valid memory limit. If that is not what you try to achieve, why are these changes needed in the first place? 😅

No, this is a misunderstanding. The only thing I wanted to fix is the usage of ini_get('memory_limit') instead of the given parameter $value in normalizeMemoryLimit. This PR just got out of control because the tests are failing now because of my change 😂 And it raised the question on my side if we should prevent these wrong values from coming in.

I just added the test to "show" that it is currently wrong because it does return 256 (bytes). M256 and G1024 would go out of the test when we found a solution. Sorry, this was confusing from my side to put the two values into the data set.

@boesing
Copy link
Member

boesing commented Mar 21, 2024

Okay, let me rephrase what we want to achieve with this PR:

  • we do want to fix normalizeMemoryLimit method so that the value passed to setMemoryLimit is validated instead of the value persisted in memory_limit ini setting
  • we do want to have a failing test for those memory sizes you accidentally added to the valid ones, i.e. M256 and G1024
  • we do want to keep some of the other unit tests which are actually quite nice to have, I'd love to keep these

Did I forgot something? Let me know!

Could you probably refactor this PR to have only those changes in place? Happy to merge that once thats done (maybe not today but definitely in the next days). That would be perfect. Thanks in advance for contributing this!


I just added the test to "show" that it is currently wrong because it does return 256 (bytes). M256 and G1024 would go out of the test when we found a solution. Sorry, this was confusing from my side to put the two values into the data set.

Ah I see, well that escalated quickly 😂 However, lets try to focus on the initial problem. I'd just add a proper failing test for the invalid values. That should outline that the memory limit is not validated as it should be. Then we can have the fix in place and are good to go.

@leuchtdiode
Copy link
Author

leuchtdiode commented Mar 21, 2024

Okay, let me rephrase what we want to achieve with this PR:

  • we do want to fix normalizeMemoryLimit method so that the value passed to setMemoryLimit is validated instead of the value persisted in memory_limit ini setting
  • we do want to have a failing test for those memory sizes you accidentally added to the valid ones, i.e. M256 and G1024
  • we do want to keep some of the other unit tests which are actually quite nice to have, I'd love to keep these

Did I forgot something? Let me know!

Could you probably refactor this PR to have only those changes in place? Happy to merge that once thats done (maybe not today but definitely in the next days). That would be perfect. Thanks in advance for contributing this!

Yes, that's correctly summarized.

Point 1: Already done.
Point 2: I will change
Point 3: Does anyone have a problem with the null check in normalizeMemoryLimit This would let the existing tests pass.

…i's memory limit as default value

Signed-off-by: Alex Schloegl <alex@try2catch.com>
Signed-off-by: Alex Schloegl <alex@try2catch.com>
@leuchtdiode
Copy link
Author

leuchtdiode commented Mar 21, 2024

I adapted the things we discussed. Please have a look if that suits you. Maybe you can double check the regex which I had to adapt to disallow M128, G1024, etc.
That i hopefully did not oversee something :)

The code formatting makes me crazy. I have to check how to apply phpcs.xml in PHPStorm 😂

@boesing
Copy link
Member

boesing commented Mar 21, 2024

Does anyone have a problem with the null check in normalizeMemoryLimit This would let the existing tests pass.

🤚🏼
Can you point me to the test which is failing so that I can have a look into that one before merging?

*/
protected function normalizeMemoryLimit($value)
{
if (is_numeric($value)) {
return (int) $value;
}

if (! preg_match('/(\-?\d+)\s*(\w*)/', ini_get('memory_limit'), $matches)) {
throw new Exception\InvalidArgumentException("Invalid memory limit '{$value}'");
if (! preg_match('/^(\-?\d+)\s*(\w*)$/', $value, $matches)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm fine with this one. I would say lets remove the $ for now.
I still wonder why a negative numeric value would be okay here but who am I to judge this ancient code.
Most probably only @weierophinney or @marc-mabe are aware of that and thus lets keep it for now. Could be removed in next major.
I most probably keep that in mind until then.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

negative numeric is for -1 = no limit

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, but thats how it worked before as well.
If s1 wants to set unlimited memory, thats fine in the current major as thats how it works since like forever.
I am open to discuss this in an RFC for the next major version, would you agree on that?

Copy link
Member

@boesing boesing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think after dropping the $ in the regex, this PR is ready to get merged.

Signed-off-by: Alex Schloegl <alex@try2catch.com>
@leuchtdiode
Copy link
Author

I think after dropping the $ in the regex, this PR is ready to get merged.

Done :)

@Xerkus
Copy link
Member

Xerkus commented Mar 21, 2024

Why $ is dropped from regex? it should match full string
oof. Regex had no full match constraints? it could match weirdest stuff... so much for input validation

@boesing boesing changed the title Bugfix: Use given parameter instead of ini_get('memory_limit') when n… Validate memory_limit option Mar 22, 2024
@boesing boesing self-assigned this Mar 22, 2024
@boesing
Copy link
Member

boesing commented Mar 22, 2024

Why $ is dropped from regex? it should match full string oof. Regex had no full match constraints? it could match weirdest stuff... so much for input validation

Because it was introduced in this patch. Passing something like 1000 K YALLA is parsed as 1000 and K. I am not even super happy with the ^ because you could have passed YALLA 1000 K YALLA and it worked. But I do not really think that this might introduce too much noise but something like 1000 K (see that whitespace at the end?) should still be parsable (for now).

I think in next major, we should do something like:

if (is_numeric($value)) {
    return (int) $value;
}

if (! preg_match('/^(?<value>\-?\d+)\s+?(?<unit>[GMK])?$/i', trim($value), $matches)) {
    throw new Exception\InvalidArgumentException(sprintf(
        'Invalid memory limit "%s" passed. Must be compatible with PHP shorthand bytes notation',
        $value,
    ));
}

$value (int) $matches['value'];
$unit = strtoupper($matches['unit']);

if ($value < -1) {
    throw new Exception\InvalidArgumentException('Provided memory limit must represent a value greater than or equal to `-1`');
}

return match ($unit) {
   'G' => $value * 1024 * 1024 * 1024,
   'M' => $value * 1024 * 1024,
   'K' => $value * 1024,
   default => $value,
};

Fine for every1?

Signed-off-by: Alex Schloegl <alex@try2catch.com>
@leuchtdiode
Copy link
Author

Why $ is dropped from regex? it should match full string oof. Regex had no full match constraints? it could match weirdest stuff... so much for input validation

Because it was introduced in this patch. Passing something like 1000 K YALLA is parsed as 1000 and K. I am not even super happy with the ^ you could have passed YALLA 1000 K YALLA and it worked. But I do not really think that this might introduce too much noise but something like 1000 K (see that whitespace at the end?) should still be parsable.

I think in next major, we should do something like:

if (is_numeric($value)) {
    return (int) $value;
}

if (! preg_match('/^(?<value>\-?\d+)\s+?(?<unit>[GMK])?$/i', trim($value), $matches)) {
    throw new Exception\InvalidArgumentException(sprintf(
        'Invalid memory limit "%s" passed. Must be compatible with PHP shorthand bytes notation',
        $value,
    ));
}

$value (int) $matches['value'];
$unit = strtoupper($matches['unit']);

if ($value < -1) {
    throw new Exception\InvalidArgumentException('Provided memory limit must represent a value greater than or equal to `-1`');
}

return match ($unit) {
   'G' => $value * 1024 * 1024 * 1024,
   'M' => $value * 1024 * 1024,
   'K' => $value * 1024,
   default => $value,
};

Fine for every1?

Sounds good to me. I will add "-1" and -1 to the valid data set.

Signed-off-by: Alex Schloegl <alex@try2catch.com>
@boesing boesing added this to the 2.3.1 milestone Mar 22, 2024
@boesing boesing added Bug Something isn't working and removed Enhancement labels Mar 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Something isn't working
Projects
None yet
4 participants