Feature Request: concurrency per host #527

Open
ManuelRauber opened this issue Nov 16, 2022 · 3 comments

Comments

@ManuelRauber

Hi there!

I've just switched from broken-link-checker to linkinator and I'm missing one little thing: the ability to set the maximum concurrency per host.

Right now, I'm using Hugo and the Docsy theme to generate GitHub-hosted documentation, so I have a lot of links going to GitHub.

Unfortunately, a lot of them respond with 429. linkinator retries them, but after a certain number of attempts the check eventually fails:

https://github.com/boundfoxstudios/community-project/edit/develop/docs/content/game-design-document/gameplay/player/index.md (from http://localhost:1313/game-design-document/gameplay/player/) -- reason: BROKEN http status: 429

It would be nice to be able to limit how many concurrent requests are made to a single host.

@JustinBeckwith
Owner

This is something I've considered, but I really wonder out loud if the complexity would be worth it (as compared to the existing --concurrency property). How would you expect to use something like this from the command line? It would almost lead to needing to define these things in a string like:

$ linkinator website.com/page --host-concurrency "github.com 100" --host-concurrency "espn.com 100"

Even with that, it's unclear how the per-host concurrency would interact with the top level concurrency 😵 It's unclear to me that the code complexity and the config complexity are really worth it.

@ManuelRauber
Author

@JustinBeckwith

Oh, I wouldn't expect to set the maximum concurrency for each host individually. Just one setting that applies to all hosts would be enough for my use case.

$ linkinator website.com/page --concurrency 50 --host-concurrency 4

In this case, I'd expect no more than 50 concurrent requests overall and no more than 4 to any single host.

Would that be easier for the implementation?

@JustinBeckwith
Owner

That absolutely would make things easier :) There's still some complexity in managing host level concurrency along with top level concurrency, but wth, I'm at least ok giving it a shot and seeing what happens.
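
For anyone curious how the two limits might interact, here is a minimal sketch in TypeScript. It is not linkinator's code: `HostAwareLimiter`, `nextRunnable`, and the `--host-concurrency` flag it mirrors are hypothetical names used only for illustration. The assumption is that a request may start only when both the global count and the count for its host are under their limits, and that the queue skips over URLs whose host is saturated so one busy host does not stall the rest.

```typescript
// Hypothetical sketch: none of these names exist in linkinator.
class HostAwareLimiter {
  private total = 0;
  private readonly perHost = new Map<string, number>();
  private readonly globalLimit: number; // e.g. --concurrency 50
  private readonly hostLimit: number; // e.g. the proposed --host-concurrency 4

  constructor(globalLimit: number, hostLimit: number) {
    this.globalLimit = globalLimit;
    this.hostLimit = hostLimit;
  }

  // Reserve a slot for this URL if both limits allow it.
  tryAcquire(url: string): boolean {
    const host = new URL(url).host;
    const inFlight = this.perHost.get(host) ?? 0;
    if (this.total >= this.globalLimit || inFlight >= this.hostLimit) {
      return false;
    }
    this.total += 1;
    this.perHost.set(host, inFlight + 1);
    return true;
  }

  // Give the slot back once the request has settled.
  release(url: string): void {
    const host = new URL(url).host;
    const inFlight = this.perHost.get(host) ?? 0;
    if (inFlight === 0) {
      return; // nothing was acquired for this host
    }
    this.total -= 1;
    if (inFlight === 1) {
      this.perHost.delete(host);
    } else {
      this.perHost.set(host, inFlight - 1);
    }
  }
}

// Remove and return the first queued URL whose host still has capacity.
// Returns undefined when nothing can start right now.
function nextRunnable(
  queue: string[],
  limiter: HostAwareLimiter,
): string | undefined {
  const index = queue.findIndex((url) => limiter.tryAcquire(url));
  return index === -1 ? undefined : queue.splice(index, 1)[0];
}
```

A crawler loop would call `nextRunnable` whenever a request finishes (after calling `release` for it) and start whatever URL comes back. Skipping saturated hosts instead of blocking on the head of the queue is one possible design choice; a strict FIFO queue would be simpler but could leave global slots idle while many GitHub links wait.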
