Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle different error_code coming from RUCSS insted of just 408 #5963

Open
mostafa-hisham opened this issue Jun 5, 2023 · 8 comments · May be fixed by #5964
Open

Handle different error_code coming from RUCSS insted of just 408 #5963

mostafa-hisham opened this issue Jun 5, 2023 · 8 comments · May be fixed by #5964
Assignees
Labels
effort: [S] 1-2 days of estimated development time module: remove unused css priority: low Issues that can wait type: enhancement Improvements that slightly enhance existing functionality and are fast to implement

Comments

@mostafa-hisham
Copy link
Contributor

RUCSS is going to send different codes instead of only 408 if the page requires a new RUCSS regeneration
issue in RUCSS

407->Protocol error (Target.setAutoAttach): Session closed
408->Navigation timeout
409->Browser couldn't open a new page

We need to change the code to handle the new error codes

@mostafa-hisham mostafa-hisham self-assigned this Jun 5, 2023
@piotrbak piotrbak added type: enhancement Improvements that slightly enhance existing functionality and are fast to implement module: remove unused css labels Jun 5, 2023
@MathieuLamiot
Copy link
Contributor

Hi @mostafa-hisham, I would suggest to take into account Gustav's proposal not to use custom HTTP error codes: Even if it does the job for the feature, it can have bad impact on Ops monitoring (they usually monitor service performances based on standardized error codes) and it would not be a good practice.

The goal of the task is to get more details on the errors in the metric system, right? Can the metric system access something else than the "code" field of the response? For instance, we could monitor the "message" field? That could be a good quick win without having to modify the error code mechanism, which will be a bit more tricky (on that topic, which can be dealt with later on, I'll open a dedicated conversation on slack to sort it out).

@MathieuLamiot
Copy link
Contributor

Moving issue to "Blocked" while we figure out a solution.

@MathieuLamiot
Copy link
Contributor

discussion on-going here

@MathieuLamiot
Copy link
Contributor

Following Slack discussion, moving back to "To Grooming" as needs and spec. evolved. Here are the

  • Must provide an error categorization system to ease operations on failed tasks for new versions of WP Rocket.
  • Must remain backward compatible with older WP Rocket versions, especially the "reschedule on error" feature.
  • Must provide an error categorization system that can be easily refined without breaking backward compatibility.
  • Should refactor the error categorization system to avoid relying on error codes too similar to HTTP response codes.

Several approaches were mentioned and should be investigated during grooming:

  • Send the WP Rocket version when scheduling a job, so that RUCSS can handle backward compatibility.
  • Migrate from error codes being numbers (like 500, 408) to an enum of standardized error reasons.
  • Return a "could_retry" field in case of errors on RUCSS to notify the plugin that the job should be re-scheduled, when it is the case (previously done with 408)

@mostafa-hisham mostafa-hisham added effort: [S] 1-2 days of estimated development time and removed needs: grooming labels Jul 31, 2023
@mostafa-hisham
Copy link
Contributor Author

@MathieuLamiot
@piotrbak
@DahmaniAdame
I think that sending the WPR version to RUCSS and changing the response based on that is going to be a lot of work and will make the code more complicated

Scope a solution

I thinkcould_retry flag is a good idea we could send it back with 408 status for the next upcoming versions as my discussion with @MathieuLamiot in Slack and modify the plugin code to check both 408 or the flag and store whatever message or error code we have
then after that we can decide how the error_code we send back should look and the plugin will not depend on them to retry sending a new job but will depend on the could_retry flag

we need to send back the could_retry flag with each 408 response in the following
https://github.com/wp-media/nodejs-treeshaker/blob/c9fd859c46c5ec5d981abcbb16b47594e92a2223/rucss/libs/rucss-manager.js#L73

https://github.com/wp-media/nodejs-treeshaker/blob/c6a67058b3889ee794b9e21866c8bbda16ab23b7/rucss/libs/puppeteer-manager.js#L267

https://github.com/wp-media/nodejs-treeshaker/blob/dbd2f0670d72eaab162d3fc8f967c43a42948853/rucss/processor.js#L77

and on the plugin side, we should change the code to check for the could_retry or status 408 to retry new jobs here

switch ( $job_details['code'] ) {

Estimate the effort

Effort S

@MathieuLamiot MathieuLamiot changed the title Handle different error_code coming from RUCSS insted of just 408 Part 1: Handle different error_code coming from RUCSS insted of just 408 Jul 31, 2023
@MathieuLamiot MathieuLamiot changed the title Part 1: Handle different error_code coming from RUCSS insted of just 408 Handle different error_code coming from RUCSS insted of just 408 Jul 31, 2023
@MathieuLamiot
Copy link
Contributor

MathieuLamiot commented Jul 31, 2023

The solution proposed by @mostafa-hisham is a first step so that we can later on implement error categorization. I created this task as a follow-up/placeholder to actually implement the error categorization if needed.

@CrochetFeve0251
Copy link
Contributor

I added the proposition into an issue from the RUCSS as I detail technical parts inside it and this issue is one from wpr aka a public one.
https://github.com/wp-media/nodejs-treeshaker/issues/421

@piotrbak piotrbak added the priority: medium Issues which are important, but no one will go out of business. label Aug 1, 2023
@mostafa-hisham
Copy link
Contributor Author

We are going to do what @CrochetFeve0251 said here

@MathieuLamiot At the end we agreed to go on a mix of both solution.

@mostafa-hisham make me noticed the error code weren't HTTP standards code so using them is a bad pattern. Due to that we decided to use the flag as output could_retry value in the final response.

On another hand to keep the change simple for the moment we decided to not touch the business logic inside the worker and continue to return 408 for the moment. That also make sense as the retry flag doesn't seems to part of the business logic but more an use case for the plugin needs and modify only the endpoint to adapt the current 408 error response to add the flag seems more appropriate.

For that inside the handler we will add the following logic:

.then((data) => {
                            if ( data.returnvalue.code === 408) {
                             data.returnvalue.could_retry = data.error;
                           }     
                           return data;
                        })

and on the plugin side, we should change the code to check for the could_retry or status 408 to retry new jobs here

switch ( $job_details['code'] ) {

Estimate the effort

Effort S

@piotrbak piotrbak added priority: low Issues that can wait and removed priority: medium Issues which are important, but no one will go out of business. labels Sep 21, 2023
@piotrbak piotrbak added this to the 3.16 milestone Sep 21, 2023
@piotrbak piotrbak removed this from the 3.16 milestone May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
effort: [S] 1-2 days of estimated development time module: remove unused css priority: low Issues that can wait type: enhancement Improvements that slightly enhance existing functionality and are fast to implement
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants