Cloud tasks: Deadline exceeded -- retry settings not working #6814

Open
mat105 opened this issue Nov 24, 2023 · 6 comments
mat105 commented Nov 24, 2023

I have the code below, but the retrySettings don't seem to have any effect (although they do get validated). I keep getting DEADLINE_EXCEEDED errors; apparently there is a 10-second timeout somewhere. How can I fix this?

Environment details

  • OS: Linux
  • PHP version: 8.1
  • Package name and version: google/cloud-tasks 1.9.1

Steps to reproduce

  1. Create a task with retrySettings. The call is not retried; it times out after 10 seconds.

Code example

<?php

namespace App\Extensions;

use Google\Cloud\Tasks\V2\CloudTasksClient;
use Google\Cloud\Tasks\V2\HttpMethod;
use Google\Cloud\Tasks\V2\HttpRequest;
use Google\Cloud\Tasks\V2\Task;
use Illuminate\Support\Facades\Log;

class Tasks {

    public $projectId;
    public $locationId;
    public $client;

    public $queueId;
    public $url;

    public $queueName;
    public $httpRequest;
    public $response;

    public function __construct(string $projectId, string $locationId) {
      $this->projectId = $projectId;
      $this->locationId = $locationId;
      // Instantiate the client and queue name.
      $this->client = new CloudTasksClient();
    }

    public function create_task(string $queueId, string $url, ?string $audience = null) {
      $this->queueId = $queueId;
      $this->url = $url;
      $this->queueName = $this->client->queueName(
        $this->projectId,
        $this->locationId,
        $this->queueId
      );

      $headers = [
        'Content-type'=>'application/json',
        'Accept'=>'application/json'
      ];

      if (isset($audience)) {
        $headers['Authorization'] = getIdentityToken($audience);
      }

      $this->httpRequest = new HttpRequest();
      $this->httpRequest->setUrl($this->url);
      $this->httpRequest->setHttpMethod(HttpMethod::POST);
      $this->httpRequest->setHeaders($headers);
    }

    public function set_options()
    {
      $options = [
        'initialRetryDelayMillis' => 100,
        'retryDelayMultiplier' => 1.3,
        'maxRetryDelayMillis' => 60000,
        'initialRpcTimeoutMillis' => 20000,
        'rpcTimeoutMultiplier' => 1.0,
        'maxRpcTimeoutMillis' => 20000,
        'totalTimeoutMillis' => 600000,
        'retryableCodes' => ['DEADLINE_EXCEEDED', 'UNAVAILABLE', 4, 14],
      ];
      return $options;
    }

    public function dispatch_task($payload) {
      $start = microtime(TRUE);
      if (isset($payload)) {
          $this->httpRequest->setBody($payload);
      }
      // Create a Cloud Task object.
      $task = new Task();
      $task->setHttpRequest($this->httpRequest);

      // Send request and print the task name.
      $this->response = $this->client->createTask($this->queueName, $task, ["retrySettings" => $this->set_options()]);
      $end = microtime(TRUE);

      Log::info("Created task {$this->url} {$this->response->getName()}", ["time" => ($end - $start)." seg"]);
    }
}

Error
(image)

Stack trace
(image)

@product-auto-label product-auto-label bot added the api: cloudtasks Issues related to the Cloud Tasks API. label Nov 24, 2023
@saranshdhingra saranshdhingra self-assigned this Dec 27, 2023
@saranshdhingra
Contributor

Hi @mat105
Thanks for filing the issue.

A couple of things to note here.

As of today, we can't retry exceptions of the type ApiException. This has been the standard behaviour for clients that use gax-php for a while.
This can be seen here.

Another thing I would recommend is to pass the retryable codes as constants, like so:
'retryableCodes' => [\Google\ApiCore\ApiStatus::DEADLINE_EXCEEDED]

There is an experimental option, released recently in GAX version 1.25.0, that lets you pass your own custom retry function. Using that, you should be able to take care of this use case.

However, please note that this behaviour might change in a future release, as it is still experimental.

'retrySettings' => [
  // other options
  'retryFunction' => function (\Exception $e, $options) {
    // return true to retry, return false when you don't want a retry to take place.
  }
]

One more note of caution: please make sure you limit the retries with your own checks, as you could otherwise end up in a retry loop that incurs cost.
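As a rough illustration of the "limit the retries yourself" advice, here is a minimal sketch of a bounded retry callback. The helper name makeBoundedRetryFunction and its parameters are hypothetical, not part of GAX; the sketch only assumes the callback shape shown above (an exception plus options, returning a bool) and that the exception exposes getStatus(), as ApiException does.

```php
<?php

/**
 * Hypothetical helper (not part of gax-php): builds a retry callback that
 * retries only the listed statuses and gives up after $maxRetries retries.
 */
function makeBoundedRetryFunction(int $maxRetries, array $retryableStatuses): \Closure
{
    // Counter is captured by reference so it persists across invocations.
    $retries = 0;

    return function (\Exception $e, $options = []) use (&$retries, $maxRetries, $retryableStatuses): bool {
        // Exceptions without getStatus() are never retried.
        $status = method_exists($e, 'getStatus') ? $e->getStatus() : null;
        if (!in_array($status, $retryableStatuses, true)) {
            return false;
        }
        // Self-imposed cap, so a persistent failure cannot loop forever.
        if ($retries >= $maxRetries) {
            return false;
        }
        $retries++;
        return true;
    };
}
```

The returned closure could then be passed as 'retryFunction' => makeBoundedRetryFunction(3, ['DEADLINE_EXCEEDED', 'UNAVAILABLE']). Because the counter lives inside the closure, a fresh closure should be built for each createTask call.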

@LukeAbell

We're also hitting this issue. Is it possible to increase the 10 second timeout?

@saranshdhingra
Contributor

It is possible that the DEADLINE_EXCEEDED exception is being received from the service itself.

From what I can see in the RetrySettings and RetryMiddleware in GAX, the only time we throw the ApiException with the DEADLINE_EXCEEDED code is when all the retries combined cross the timeout.

@LukeAbell Can you verify that the retry is in fact taking place, though?
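To make the "all the retries combined cross the timeout" condition more concrete, the following sketch estimates how many attempts fit into a total timeout. It is only an approximation under stated assumptions: every attempt is assumed to run for its full RPC timeout, delays grow exponentially without jitter, and the function name countAttemptsWithinBudget is hypothetical. It does not reproduce GAX's actual RetryMiddleware logic; it only reuses the retrySettings key names from the report.

```php
<?php

/**
 * Hypothetical estimate (not GAX code): counts the attempts that start
 * before totalTimeoutMillis is used up, assuming each attempt consumes its
 * whole RPC timeout and is followed by a backoff delay.
 */
function countAttemptsWithinBudget(array $s): int
{
    $elapsed = 0.0;
    $delay = (float) $s['initialRetryDelayMillis'];
    $rpcTimeout = (float) $s['initialRpcTimeoutMillis'];
    $attempts = 0;

    while ($elapsed < $s['totalTimeoutMillis']) {
        $attempts++;
        // Worst case: the attempt runs until its RPC timeout expires.
        $elapsed += $rpcTimeout;
        // Wait before the next attempt, then grow both values up to their caps.
        $elapsed += $delay;
        $delay = min($delay * $s['retryDelayMultiplier'], $s['maxRetryDelayMillis']);
        $rpcTimeout = min($rpcTimeout * $s['rpcTimeoutMultiplier'], $s['maxRpcTimeoutMillis']);
    }

    return $attempts;
}
```

Feeding in the settings from the original report gives a rough upper bound on how many attempts should be visible in logs if retries are in fact taking place.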

@Rodrigo-JM
Copy link

I am facing a similar issue, but it seems to be related to adding tasks to the queue, not the actual processing. When adding a lot of tasks, the call can take more than 10s to finish, so it returns DEADLINE_EXCEEDED.

@saranshdhingra saranshdhingra added the status: investigating The issue is under investigation, which is determined to be non-trivial. label Feb 6, 2024
@saranshdhingra
Contributor

Hi @Rodrigo-JM
Could you please share the code snippet that you're using?

In my tests I created on the order of 1,000 tasks serially and didn't get any errors.

Then I tried simulating a POST request with a payload of over 100MB, created hundreds of such tasks serially, and wasn't able to reproduce the issue.

Since the createTask method only accepts a single Task instance, the number of tasks created serially shouldn't be a problem (unless this is specifically what you have observed?).

So, if there is a specific kind of task that is taking over 10 seconds to be added to the queue, then I would love some help to replicate this behaviour.

Meanwhile, @LukeAbell could you let me know if the retries are being triggered at all after passing the retrySettings.retryFunction option?

Thanks.

@saranshdhingra saranshdhingra removed the status: investigating The issue is under investigation, which is determined to be non-trivial. label Mar 7, 2024
@Rodrigo-JM

I am using GCP's JS SDK for Cloud Tasks. The problem is related to a change they made in the service responsible for enqueuing tasks. To solve this issue, we now have to select the older version by setting:

this.client = new CloudTasksClient({ fallback: true });

I also implemented enqueuing in batches, but I'm not sure it was needed.
