Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ability to extend ProcessSlave and/or bind to the event loop #463

Open
mariusbalcytis opened this issue Aug 25, 2019 · 3 comments
Open

Ability to extend ProcessSlave and/or bind to the event loop #463

mariusbalcytis opened this issue Aug 25, 2019 · 3 comments

Comments

@mariusbalcytis
Copy link

One of the things php-pm allows to do, which simple php-fpm doesn't, is to share the state between the requests. Of course, this shouldn't be overused, but is caching things in-memory is quite possible use-case, and is fastest approach possible (faster than using external storage, like Redis or Memcached).
One thing which could be improved over straight-forward implementation is cache invalidation. This could be done using one of React libraries, like predis-async or RabbitMQ integration – it does not add any latency to the requests and cache is invalidated in all the workers as soon as this is needed.

What's missing to implement this properly with current core is ability to access the event-loop that's used by php-pm. It's inside ProcessSlave class, which is not configurable, and is not passed anyhow to the bridge.

Implementing this is possible, but quite hackish. Currently the ProcessSlave instance creation is hard-coded inside local variable in quite long method body (\PHPPM\ProcessManager::newSlaveInstance). So, it's possible to extend ProcessManager, overwrite the method, copy the contents and change only that single place (classname of ProcessSlave), but this makes it quite hard to maintain with the base code if anything would change in the core. Also it needs quite many copy-paste from ProcessSlave, as run method needs to be changed, which calls 2 other private methods, which in turn have to also be copy-pasted. Well, you get the drill.

A bit simpler (but probably even more hackish) approach is to get ProcessSlave::$slave and get $loop by using reflection and making it public.

I assume there could be several approaches how to achieve this:

  1. Make ProcessSlave class configurable (noted in A way to override ProcessManager and ProcessSlave #216, but not followed upon). This would also need some protected method for a hook to bind the $this->loop inside run method.
  2. Change \PHPPM\ProcessSlave::run method to call some outside class to store the $loop inside, let's say, static publicly accessible variable.
  3. Expose the $loop variable with public function. It could be accessed by calling ProcessSlave::$slave->getLoop().

Current PoC I have set up locally:

class ProcessSlave
{
    // ...
    public function run()
    {
        $this->loop = Factory::create();

        $this->errorLogger = BufferingLogger::create();
        ErrorHandler::register(new ErrorHandler($this->errorLogger));

        \EventLoopProvider::registerLoop($this->loop); // this line was added

        $this->tryConnect();
        $this->loop->run();
    }
    // ...
}

This is just a wrapper to avoid using static context in various places. Expected to be service in Symfony DIC. Could also provide methods for getting if we're even running in the php-pm context.

class EventLoopProvider
{
    private static $staticEventLoop;
    private $eventLoop;

    public function __construct()
    {
        $this->eventLoop = self::$staticEventLoop;
    }

    public static function registerLoop(LoopInterface $eventLoop)
    {
        self::$staticEventLoop = $eventLoop;
    }

    public function getEventLoop()
    {
        return $this->eventLoop;
    }
}

Example of storage with automatic cache invalidation. Expects for message to be published, but could also use Keyspace notifications if enabled.

class RedisStorage
{
    private $client;
    private $asyncClient;

    private $values = [];

    public function __construct(EventLoopProvider $eventLoopProvider)
    {
        $this->client = new \Predis\Client('tcp://redis:6379');
        $this->asyncClient = new \Predis\Async\Client('tcp://redis:6379', $eventLoopProvider->getEventLoop());
    }

    public function get(string $key)
    {
        if (!isset($this->values[$key])) {
            $this->loadValue($key);
        }

        return $this->values[$key];
    }

    private function loadValue(string $key)
    {
        $value = $this->client->get($key);
        $this->values[$key] = $value;

        if (!$this->asyncClient->isConnected()) {
            $this->asyncClient->connect(function () use ($key) {
                $this->subscribe($key);
            });
        } else {
            $this->subscribe($key);
        }
    }

    private function subscribe(string $key)
    {
        $this->asyncClient->pubSubLoop('changed:' . $key, function () {
            unset($this->values[$key]);
        });
    }
}

Is this something you would consider to make available? I could make a pull-request, but I wanted to check the preferred approach, if any of them makes sense.

@andig
Copy link
Contributor

andig commented Aug 25, 2019

I think we used to have an interface for that on the bootstrap but removed it as it was lacking an actual use case. If you have a concrete suggestion how to implement this in a clean way I'd be happy to take a PR.

@dnna
Copy link
Contributor

dnna commented Aug 25, 2019

Personally, I'm not a fan of the static variable approach.
I think injecting it to the bootstrap or bridge would be a good approach (maybe via a setLoop function and a new interface so we don't break BC), as what the actual cache behavior would be is application-specific from what I understand.

@acasademont
Copy link
Contributor

Hi @mariusbalcytis I'm wondering what's the real usefulness of having access to the loop if the slave will not be able to serve other requests in parallel. The idea behind the loop is that you set up a callback so that the loop can do other stuff. In our ppm world, as we assume the php code might have blocking calls, we block the slave for the whole request, deferring tasks to the loop would be pretty useless, every slave has its own loop.

Also, yes, the cache is shared but per-slave, if request 1 goes to slave 1 and request 2 goes to slave 2 they won't have the same shared memory, they are 2 different processes. I wouldn't count on using that a lot, it could create some hard to debug problems when trying to invalidate it, as you said. For example, if using Doctrine, the recommended approach is to clear the entity manager at the end of the request.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants