feature: Allow to filter jobs in ZyteJobsComparisonMonitor by close_reason #434

Open
curita opened this issue Mar 7, 2024 · 1 comment

@curita
Member

curita commented Mar 7, 2024

Background

ZyteJobsComparisonMonitor is a monitor that compares item_scraped_count from the current job against past jobs from ScrapyCloud.

By default, it fetches the number of past jobs set in SPIDERMON_JOBS_COMPARISON, and these jobs can be filtered using:

  • SPIDERMON_JOBS_COMPARISON_STATES (only keep jobs that have those states)
  • SPIDERMON_JOBS_COMPARISON_TAGS (only keep jobs with those tags if they are present in the current job too)
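For reference, the existing options might be configured roughly like this in a project's settings.py. The values here are only illustrative assumptions, not recommendations:

```python
# settings.py (illustrative values only)

# How many past jobs to fetch for the comparison.
SPIDERMON_JOBS_COMPARISON = 5

# Only keep past jobs in these states.
SPIDERMON_JOBS_COMPARISON_STATES = ("finished",)

# Only keep past jobs with these tags, when the current job has them too.
# "production" is a made-up tag name for this example.
SPIDERMON_JOBS_COMPARISON_TAGS = ("production",)
```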

This works, but it's common to need more filtering options in practice.

In particular, it would be nice to be able to filter by close_reason rather than state.

state can be "finished", "running", "pending", or "deleted", which is not very helpful, since we mostly want to compare against "finished" jobs. close_reason, on the other hand, would let us keep only past jobs that finished successfully and skip failed or banned ones, whose truncated item counts we wouldn't want to compare the current job against.

Proposal

Add a new SPIDERMON_JOBS_COMPARISON_CLOSE_REASONS setting to allow ZyteJobsComparisonMonitor to filter by the ScrapyCloud jobs' close_reason stat.
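A minimal sketch of the intended behavior, not the actual implementation. It assumes each past job is available as a dict with a close_reason key. The helper name filter_jobs_by_close_reason, the job keys, and the item counts are hypothetical and exist only for this example:

```python
# Proposed setting, as it might appear in settings.py (assumed shape: a tuple of strings).
SPIDERMON_JOBS_COMPARISON_CLOSE_REASONS = ("finished",)


def filter_jobs_by_close_reason(jobs, close_reasons=None):
    """Keep only jobs whose close_reason is in close_reasons.

    Hypothetical helper. An empty or missing close_reasons disables the
    filter, so current behavior would not change unless the setting is set.
    """
    if not close_reasons:
        return list(jobs)
    allowed = set(close_reasons)
    return [job for job in jobs if job.get("close_reason") in allowed]


# Made-up sample data standing in for past Scrapy Cloud jobs.
past_jobs = [
    {"key": "1/2/3", "close_reason": "finished", "items": 1000},
    {"key": "1/2/4", "close_reason": "banned", "items": 12},
    {"key": "1/2/5", "close_reason": "failed", "items": 40},
]

kept = filter_jobs_by_close_reason(
    past_jobs, SPIDERMON_JOBS_COMPARISON_CLOSE_REASONS
)
```

With the sample data above, only the job that closed as "finished" would be kept, so the truncated counts from the banned and failed runs would not skew the comparison.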

@shafiq-muhammad
Contributor

Hello, I would like to work on this feature.

@shafiq-muhammad shafiq-muhammad self-assigned this Mar 28, 2024