Report individual crawl errors in issues #1131

Open
tidoust opened this issue Jan 26, 2024 · 0 comments

tidoust commented Jan 26, 2024

Via #1130.

When the crawl of a spec fails, the crawler records the error in an error property in ed/index.json and reuses the previous extracts. In some cases, the failure is transient, e.g., due to a network hiccup. In other cases, the error is more permanent, e.g., because the extraction logic bumps into unexpected markup.

The more permanent errors may go unnoticed for some time because nothing notifies us about the problem. The code should report each of these errors in an issue, and ideally close the issue once the problem disappears.

Side note: when the crawler crashes completely, the job fails. There is no need to handle that case, since GitHub already sends email notifications.
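
To make the proposal more concrete, here is a minimal sketch of the reconciliation step: given the crawl index and the titles of currently open issues, decide which issues to open and which to close. It rests on several assumptions that are not confirmed above: that ed/index.json holds a `results` array whose entries carry a `shortname` and, on failure, an `error` string; that issues follow a hypothetical `Crawl error: <shortname>` title convention; and that something else fetches the open issue titles and performs the actual GitHub calls. The function name `plan_issue_actions` is made up for illustration.

```python
import json

# Hypothetical title convention used to match issues to specs.
ISSUE_PREFIX = "Crawl error: "


def plan_issue_actions(index, open_issue_titles):
    """Return (to_open, to_close) for the given crawl index.

    to_open maps spec shortnames to error messages that have no open issue yet.
    to_close lists shortnames whose issue is open but whose error is gone.
    """
    # Assumed layout: {"results": [{"shortname": ..., "error": ...}, ...]}
    failing = {
        spec["shortname"]: spec["error"]
        for spec in index.get("results", [])
        if spec.get("error")
    }

    # Shortnames that already have an open issue following the convention.
    tracked = {
        title[len(ISSUE_PREFIX):]
        for title in open_issue_titles
        if title.startswith(ISSUE_PREFIX)
    }

    to_open = {name: failing[name] for name in sorted(failing.keys() - tracked)}
    to_close = sorted(tracked - failing.keys())
    return to_open, to_close


# Made-up data for illustration only.
index = json.loads("""
{
  "results": [
    {"shortname": "spec-a", "error": "Unexpected markup in IDL block"},
    {"shortname": "spec-b", "error": "Network timeout"},
    {"shortname": "spec-d"}
  ]
}
""")
open_titles = ["Crawl error: spec-b", "Crawl error: spec-c", "Unrelated issue"]

to_open, to_close = plan_issue_actions(index, open_titles)
print("open:", to_open)
print("close:", to_close)
```

This sketch does not distinguish transient from permanent errors. One option would be to open an issue only after the same error shows up in several consecutive crawls, but that requires keeping state between runs.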
