-
Notifications
You must be signed in to change notification settings - Fork 672
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use of include()
et al for data files is not compatible with PHP opcache
#1708
Comments
I agree that we should minimize the use of Your best bet is to exclude the |
Indeed as @tvdijen mentioned it's not an option to replace these files with plain configuration files because precisely one of the flexibilities of SimpleSAML is that the configuration can contain code where needed. It seems you've disabled We could consider to call |
Oh, I didn't even know about The specific issue I used as an example should be fixed by calling Are there other known files that would be often automatically modifed and re-loading the correct version is important? (A poor man's fix would be to call |
As far as I'm aware metarefresh does this and maybe another module or so, but it's not very common. |
I noticed this issue with version 1.19.4 which already includes that call here https://github.com/simplesamlphp/simplesamlphp/blob/v1.19.4/lib/SimpleSAML/Utils/System.php#L236 so this is not enough to fix the problem for metadata refresh. I was running PHP 7.4 in case it makes a difference. |
Also note that according to comments in https://www.php.net/manual/en/function.opcache-invalidate.php the As such, I think it's okay to use However, when SimpleSAML creates files in automated way, there should be zero need to make those as executable files and you should save and read real data instead. JSON is a good option because it doesn't have similar possibility to accidentally execute unintended PHP code like |
I don't think this is true. If the entry is invalidated that file will be re-read and added to the cache under that name. So it will not fill up since the filenames will be the same. This problem would only exist if you want to use |
Indeed and this was added precisely to enable the metarefresh use case (cd8179d). So probably we should find out why this didn't solve the problem for you. |
I agree that this would be possible for this specific use case. I just want to clarify why in general the config files are now in this way so that if we want to write out config automatically we now use this format. I believe a solution here is possible but for reasons stated before I would not want to replace all config with static files because this loses essential flexibility. We could make use of static files as an addition to the interpreted ones. For this case you could use the already present MetadataStoragehandlerSerialize which metarefresh supports and SimpleSAML supports aswell. If this somehow not suffices one option is to add e.g. also a JSON flatfile metadata storage handler which should not be hard. The only thing we would then need to change is the format of the statefile which metarefresh can decide to change on its own. |
Maybe this issue is not the place to discuss this, but my PoC moves that flexibility to a configuration class. If you want to do something custom/dynamically you can extend that class, overload any method to do your magic, and add a class alias so SSP knowns to use your custom class instead of the default one. I'd be happy to hear what you think of the PoC code. |
I think PHP opcode cache implementation has very simple memory management routine which basically allocates new memory range for each recompilation of a file and it simply marks old version as invalidated. It may be able to re-use the old memory range if the new version is smaller in opcode form. There's no logic to defrag the opcode cache so with many enough reloads, the opcode cache will be so fragmented that new version of the file can no longer be stored anywhere. Basically opcode cache as one big block of RAM and it puts compiled versions of file A, B and C in sequence in the RAM. When B is later updated and the updated version (one byte longer) needs to be written in the cache, the end result will be A, B, C, B' where B is the the old version invalidated and B' is the new version which didn't fit in the same space as B. If later a small enough file D is added to cache, it will be placed as A, D, C, B' and the RAM between end of D and start of C is basically lost to memory fragmentation. |
SimpleSAML PHP code contains lots of code that writes new PHP code in various files and then tries to load those files using
include()
(or some variant of it likerequire_once()
). This is not a safe way to load data files because fully enabledopcache
will not reload the files from disk after reading the files once.Steps to reproduce the behavior:
Run SimpleSAML on Apache
mod_php
module to avoid restarting PHP process all the time.Configure Apache
mod_php
module to see followingphp.ini
settings:Configure e.g. federation metadata refresh to happen via Cron calling
.../cron/cron.php
through Apachemod_php
.Expected behavior
The metarefresh cron task should be able to successfully update federation metadata.
Actual behavior
The metarefresh cron task fails to update the federation metadata correctly because one of the following reasons:
.../data/metarefresh-state.php
contains stale data because PHP opcache will remember the first time contents for that file..../metadata/$IDPKEY/saml20-idp-remote.php
may be updated but never re-read by Apache childs so it will appear to contain the data it was read the first time since starting the Apache process.Obviously, the first fetch of metadata will work because in that case no previous files exist so those cannot be cached by
opcache
, so this bug is about refreshing the data.Additional information
The same logic is broken everywhere where files are written to filesystem and
include()
is used to attempt to read those again. When PHPopcache
is fully enabled like above PHP code cannot assume thatinclude()
will actually re-read any files from the filesystem. The code must use e.g.file_get_contents()
or some other functions that actually read files instead of blindly assuming thatinclude()
will do the trick.In addition, use of
include()
causes huge potential security risk because any encoding bug in the process that writes new PHP files will be turned in server side code execution. This risk is not worth taking for data files.Expecting
include()
to always reload the file from the filesystem is not a safe assumption in PHP!The
opcache.enable_cli
is not needed to reproduce this with Apache, but using that may be good enough to make this reproducible via automated tests.Workaround for the current implementation
To get SimpleSAML metadata refresh to work as designed while using fully tuned
opcache
, one must executegraceful restart
of Apache, execute the trigger to runcron
hooks and then re-executegraceful restart
to make changes made by cron hooks visible to all SimpleSAML logic. The graceful restart before the cron hook is required to make sure that the PHP code sees the latest version of file.../data/metarefresh-state.php
so that the conditional GET is correctly executed.Another alternative is to change the PHP config to always revalidate files from the filesystem before using the PHP
opcache
but since this is Apache wide setting, any other services running on the same PHP server will take the performance hit from this change. As such, this is acceptable only if the served traffic is low enough to make this better solution than executing Apache graceful restart before and after the cron hooks.The text was updated successfully, but these errors were encountered: