Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Background process runner #512

Open
wants to merge 49 commits into
base: feature/regenerate-existing-images
Choose a base branch
from

Conversation

ankitrox
Copy link

@ankitrox ankitrox commented Sep 5, 2022

Summary

Create a background process runner class. This is mainly responsible for:

  1. Handling the background job call via ajax action.
  2. Validating the parameters.
  3. Running the process.
  4. Watching the process for time and memory exhaustion.
  5. Terminating the process.
  6. Error handling.

Fixes #482 #493 #494

Relevant technical choices

Checklist

  • PR has either [Focus] or Infrastructure label.
  • PR has a [Type] label.
  • PR has a milestone or the no milestone label.

@ankitrox ankitrox added [Focus] Images Issues related to the Images focus area no milestone PRs that do not have a defined milestone for release [Type] Epic A high-level project / epic that will encompass several sub-issues [Module] Regenerate Existing Images Issues for the Regenerate Existing Images module labels Sep 5, 2022
@ankitrox ankitrox self-assigned this Sep 5, 2022
Copy link
Contributor

@jjgrainger jjgrainger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitrox Looking good so far. I've left a couple of comments to address.

Additionally, there are a couple of other things. One is the failing tests on this PR the second is the inclusion of the Job class which is also in #507 and should not be present in this PR.

*
* @var string Background process action name.
*/
const BG_PROCESS_ACTION = 'background_process_handle_request';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitrox we should prefix this action name with perflab_

Suggested change
const BG_PROCESS_ACTION = 'background_process_handle_request';
const BG_PROCESS_ACTION = 'perflab_background_process_handle_request';

$max_execution_time = ( ! empty( $time ) && ( $time > $min_execution_time ) ) ? $time - 10 : $min_execution_time;
}

return ( $current_time >= ( $run_start_time + $max_execution_time ) );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitrox we should also use apply_filters on the value returned here as we have with the memory limit. The filter should be perflab_background_process_time_exceeded

Suggested change
return ( $current_time >= ( $run_start_time + $max_execution_time ) );
$time_exceeded = ( $current_time >= ( $run_start_time + $max_execution_time ) );
return apply_filters( 'perflab_background_process_time_exceeded', $time_exceeded );

* @return void
*/
public function handle_request() {
try {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitrox I feel like this method and try block does too much? I think we can split some of the logic between the handle_request() and run() methods. I think this because we have exceptions thrown that don't get caught as the job is not available.

The way I see it, handle_request() should verify and determine the request is valid. If not, the method should return early, or maybe even return a wp_send_json_error error message? If the request is valid it then calls the run() method,

Then the run method is responsible for running the job and I think this could contain everything from the $this->job->lock() line down.

* @param array $data Job data.
*/
do_action( 'perflab_job_' . $this->job->get_name(), $this->job->get_data() );
} while ( ! $this->memory_exceeded() && ! $this->time_exceeded() );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitrox we should also check if the job is marked as complete here too.

Suggested change
} while ( ! $this->memory_exceeded() && ! $this->time_exceeded() );
} while ( ! $this->memory_exceeded() && ! $this->time_exceeded() && ! $this->job->is_completed() );

* Do not call the background process from within the script if the
* real cron has been setup to do so.
*/
if ( defined( 'ENABLE_BG_PROCESS_CRON' ) ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitrox I think the configuration we have defined so far is to disable this background process entirely with a DISABLE_BACKGROUND_PROCESS constant. I don't believe we have the infrastructure defined to have the WP Cron handle the additional batches.

I think we should remove this here and consider where the DISABLE_BACKGROUND_PROCESS check would need to be made.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We will also have this constant to let the real cron job call background process. In such cases, script would not call next batch of the job from within itself.
This is discussed with @felixarntz in alternate approaches to improve background process workflow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ankitrox I'm not sure how this works in the grand scheme of things as we haven't defined or built the part that handles subsequent batches. I think we can approach this change in 1 of 2 ways.

  1. Keep the constant as part of this PR, but make a change as this is used twice in the code, once here and again inside the perflab_dispatch_background_process_request. I think we should aim to only have to check this constant once.

  2. Remove the constant from the PR and we can add it as part of a new PR and stream of work to introduce this functionality.

I think in both cases we should also create a new GitHub issue with acceptance criteria to outline the work.

cc @felixarntz

@ankitrox
Copy link
Author

@ankitrox Looking good so far. I've left a couple of comments to address.

Additionally, there are a couple of other things. One is the failing tests on this PR the second is the inclusion of the Job class which is also in #507 and should not be present in this PR.

@jjgrainger Let's aim to get background class PR merge first. As process runner is using job instance, I have merged it in this PR. Once background job class is merged, we can get this PR ready for review to others.

@jjgrainger
Copy link
Contributor

@ankitrox I think this PR is becoming too large and tackling too many parts of the background process.

For starters, I would argue the work for status cron #484 can be moved out of the background process runner and into its own PR/file/class. I think this should definitely be moved out of the PR and into a new one focused on fixing that specific issue.

Additionally, this PR seems to also fix #480 #481 and even #490 #483, which were already being addressed in other PR's. We want to keep these PR's smaller to make the review process more focused and less complex.

Can you update this PR to focus on the acceptance criteria in #482 #493 #494.

Let me know if you have any questions.

@ankitrox ankitrox marked this pull request as ready for review October 5, 2022 10:40
@ankitrox
Copy link
Author

ankitrox commented Oct 5, 2022

@ankitrox I think this PR is becoming too large and tackling too many parts of the background process.

For starters, I would argue the work for status cron #484 can be moved out of the background process runner and into its own PR/file/class. I think this should definitely be moved out of the PR and into a new one focused on fixing that specific issue.

Additionally, this PR seems to also fix #480 #481 and even #490 #483, which were already being addressed in other PR's. We want to keep these PR's smaller to make the review process more focused and less complex.

Can you update this PR to focus on the acceptance criteria in #482 #493 #494.

Let me know if you have any questions.

Thanks @jjgrainger
I have removed the cron related code from PR. I will create a new PR for that one.
Also, updated the dependent issue in PR description.

Copy link
Contributor

@jjgrainger jjgrainger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ankitrox left some comments but looking good so far

public function __construct() {
// Handle job execution request.
add_action( 'wp_ajax_' . Perflab_Background_Process::BG_PROCESS_ACTION, array( $this, 'handle_request' ) );
add_action( 'wp_ajax_nopriv_' . Perflab_Background_Process::BG_PROCESS_ACTION, array( $this, 'handle_request' ) );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitrox we should remove this action as we do not want unauthenticated requests triggering the background process. We should stick with the wp_ajax_perflab_background_process_handle_request action only.

*
* @param int $job_id Job ID.
*/
function perflab_dispatch_background_process_request( $job_id ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitrox based on the acceptance criteria in #480 this function should be named perflab_start_background_job

return;
}

perflab_dispatch_background_process_request( $job_id );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitrox we are handling next batch requests differently to the start request. See #490 for the AC's here. The function name that should be called is perflab_background_process_next_batch.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's only a difference of $key in start_job and next_batch functions. I think we can somehow combine/compose these functions to avoid rewriting entire thing twice.

* Do not call the background process from within the script if the
* real cron has been setup to do so.
*/
if ( defined( 'ENABLE_BG_PROCESS_CRON' ) ) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @ankitrox I'm not sure how this works in the grand scheme of things as we haven't defined or built the part that handles subsequent batches. I think we can approach this change in 1 of 2 ways.

  1. Keep the constant as part of this PR, but make a change as this is used twice in the code, once here and again inside the perflab_dispatch_background_process_request. I think we should aim to only have to check this constant once.

  2. Remove the constant from the PR and we can add it as part of a new PR and stream of work to introduce this functionality.

I think in both cases we should also create a new GitHub issue with acceptance criteria to outline the work.

cc @felixarntz

*
* @throws Exception Invalid request for background process.
*/
public function handle_request() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ankitrox I think this method does too much and should only be responsible for handling the HTTP request and calling the run() method. I would make the following changes:

  • Remove the try/catch block and exceptions - as we're handling the HTTP request we can return early with a error response. This could be a combination of using WP_Error instead of Exception and returning it with wp_send_json_error
  • The final part of the handle_request() method should be the $this->run() call.
  • The unlock() and next_batch() lines should be moved to the end of the run() method. Reasoning here is that this logic is part of the job running processes. If we were to trigger running a job via a different method (WP_Cron) calling $this->run() would still work as expected.
  • If you wanted you can use a try/catch block inside the run() method to catch any system errors while jobs are running and handle those properly.

Let me know if this makes sense, happy to discuss on a call

Copy link
Author

@ankitrox ankitrox Oct 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made the proposed changes.
I think adding try catch to run method would make more sense when we will implement error handling.

@ankitrox
Copy link
Author

ankitrox commented Oct 5, 2022

Thanks @ankitrox I'm not sure how this works in the grand scheme of things as we haven't defined or built the part that handles subsequent batches. I think we can approach this change in 1 of 2 ways.

Keep the constant as part of this PR, but make a change as this is used twice in the code, once here and again inside the perflab_dispatch_background_process_request. I think we should aim to only have to check this constant once.

Remove the constant from the PR and we can add it as part of a new PR and stream of work to introduce this functionality.

I think in both cases we should also create a new GitHub issue with acceptance criteria to outline the work.

cc @felixarntz

@jjgrainger Regarding this comment, I think option #2 is better; to handle this in separate issue and PR.

@adamsilverstein
Copy link
Member

@ankitrox looks like this PR has gone stale, can we go ahead and close it?

Copy link

github-actions bot commented May 24, 2024

The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the props-bot label.

Unlinked Accounts

The following contributors have not linked their GitHub and WordPress.org accounts: @ankit.gade@rtcamp.com, @ankitrox.

Contributors, please read how to link your accounts to ensure your work is properly credited in WordPress releases.

If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message.

Unlinked contributors: ankit.gade@rtcamp.com, ankitrox.

Co-authored-by: jjgrainger <joegrainger@git.wordpress.org>
Co-authored-by: adamsilverstein <adamsilverstein@git.wordpress.org>
Co-authored-by: westonruter <westonruter@git.wordpress.org>
Co-authored-by: felixarntz <flixos90@git.wordpress.org>
Co-authored-by: eugene-manuilov <manuilov@git.wordpress.org>
Co-authored-by: bethanylang <mxbclang@git.wordpress.org>

To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook.

Copy link
Member

@adamsilverstein adamsilverstein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this still needed?

@westonruter
Copy link
Member

@adamsilverstein looking back at old PRs from this time period, it seems the main push was to add image regeneration as a background process which this PR would serve as infrastructure for. It seems this topic hasn't been on the radar for quite some time, so potentially several such PRs should be closed. Not sure why that got deprioritized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
[Focus] Images Issues related to the Images focus area [Module] Regenerate Existing Images Issues for the Regenerate Existing Images module no milestone PRs that do not have a defined milestone for release [Type] Epic A high-level project / epic that will encompass several sub-issues
Projects
None yet
4 participants