
ENH: add method to inject rate limiting logic #111

Draft
wants to merge 1 commit into base: main

Conversation

tacaswell

This is a very quick implementation of what I proposed in #19 .

Still needs tests, typing, and documentation (and better names!), but I think it is a reasonable proof of concept.

Closes #19

@@ -32,12 +34,15 @@ def __init__(
oauth_token: Opt[str] = None,
cache: Opt[CACHE_TYPE] = None,
base_url: str = sansio.DOMAIN,
rate_limiter=None,
Collaborator

Better to pass this in to the constructor or do it via subclassing?

Author

I think it is better to inject it by passing it to the constructor rather than by sub-classing, so you do not end up with the outer product of (rate_limiters x I/O implementations) sub-classes.
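For illustration, a minimal sketch of what constructor injection could look like; the RateLimiter protocol and its acquire() method here are hypothetical stand-ins, not the API from this PR:

```python
from typing import Optional, Protocol


class RateLimiter(Protocol):
    """Hypothetical interface: awaited before each request goes out."""

    async def acquire(self) -> None: ...


class GitHubAPI:
    def __init__(self, requester: str, *,
                 rate_limiter: Optional[RateLimiter] = None) -> None:
        self.requester = requester
        self._rate_limiter = rate_limiter

    async def _make_request(self, method: str, url: str) -> None:
        if self._rate_limiter is not None:
            # Let the injected policy decide whether to sleep, raise, etc.
            await self._rate_limiter.acquire()
        ...  # hand off to the concrete I/O backend's _request()
```

Any concrete backend then picks up rate limiting from the injected object, rather than needing one subclass per (rate limiter, I/O implementation) pair.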

Collaborator

But how many subclasses even exist for this class in the wild outside of this module? I'm not sure if it's such a concern to have to worry about it. I'll have to think about it.

) -> None:
self.requester = requester
self.oauth_token = oauth_token
self._cache = cache
self.rate_limit: Opt[sansio.RateLimit] = None
self.base_url = base_url
self.requests_in_flight: int = 0
Collaborator

Couldn't the rate limiter take care of this? If this were implemented via a subclass, then this tracking could be wrapped around _request() by the rate limiter and the ABC wouldn't have to participate at all.
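As a rough illustration of that subclass-based alternative (hypothetical names, not code from this PR):

```python
class RateLimitedGitHubAPI(ConcreteGitHubAPI):  # ConcreteGitHubAPI stands in for any I/O backend
    """Keep the in-flight bookkeeping entirely in a subclass wrapper around _request()."""

    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self._active_requests = 0

    async def _request(self, *args, **kwargs):
        self._active_requests += 1
        try:
            return await super()._request(*args, **kwargs)
        finally:
            # Decrement even if the request raises, so the count stays accurate.
            self._active_requests -= 1
```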

Author

Yeah, that makes sense. Will make that change.

Author

On a bit more consideration, I think this has to be taken care of by the base class, because we want to decrement the in-flight count after the request completes.

I think a flaw with the current rate-limit count is that the value is reset on every response from GitHub and manually decremented in _request. If you have N requests in flight, your available limit is total - N, but as soon as the first request comes back the rate_limit is reset to total - 1. I think it is also possible for it to look like your rate limit is going up if the responses get back to you in a different order than they were processed by GitHub. Keeping an internal count of the number of unfulfilled requests "out", along with the latest news from GitHub about how many they think we have left, is required to have a robust estimate of how many requests we actually have left.
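A rough sketch of the accounting being described, with hypothetical names (server_remaining stands in for whatever the last rate-limit header from GitHub reported):

```python
from typing import Optional


class RateLimitAccounting:
    """Combine GitHub's last reported 'remaining' value with requests still out."""

    def __init__(self) -> None:
        self.active_requests = 0                       # sent but not yet answered
        self.server_remaining: Optional[int] = None    # last value reported by GitHub

    def request_sent(self) -> None:
        self.active_requests += 1

    def response_received(self, remaining_from_headers: int) -> None:
        self.active_requests -= 1
        self.server_remaining = remaining_from_headers

    @property
    def estimated_remaining(self) -> Optional[int]:
        if self.server_remaining is None:
            return None
        # GitHub's count does not know about requests still in flight,
        # each of which will consume one more unit when it lands.
        return self.server_remaining - self.active_requests
```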

Collaborator

Good point, I just hate complicating the base class and expanding its API more.

Is requests_in_flight the best name? What about active_requests?

And do we need to have a lock in case someone uses this in threads? I'm going to say no since this entire thing is not designed for threading, but the thought did occur to me.

Author

It may also be possible to still expose only the current rate limit as public, but be a bit more careful about managing it. It would also have to be updated when GH comes back and says the cache is still up-to-date.

Due to the possibility of responses getting re-ordered on the way back from GitHub, I think that to do this reliably we need to make sure that the available count does not go up unless the refresh time also updates, and keep a (private?) accounting of the requests that are active.

active_requests is a better name.
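A sketch of that update rule, again with illustrative names rather than the PR's actual types (each value is a (remaining, reset_epoch) pair taken from GitHub's rate-limit headers):

```python
def merge_rate_limit(current, incoming):
    """Only let the public 'remaining' count rise when the window actually reset."""
    if current is None:
        return incoming
    remaining, reset_epoch = current
    new_remaining, new_reset_epoch = incoming
    if new_reset_epoch > reset_epoch:
        # A new rate-limit window started; take the fresh values as-is.
        return incoming
    # Same window: responses may arrive out of order, so never let the
    # count go up, only down.
    return (min(remaining, new_remaining), reset_epoch)
```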

Base automatically changed from master to main January 19, 2021 17:48
@Mariatta
Member

Mariatta commented Oct 4, 2022

@tacaswell are you still working on this? I think rate limiting is still a problem users face.

@tacaswell
Author

I have not had time to come back to this (obviously). I have been working around this by putting things like if n % 100 == 0: await sleep(1) in my code.

I can not promise that I will have time to come back to this, but it can get pushed up the queue. I would be quite happy if someone else took over this PR (or some other implementation).

Development

Successfully merging this pull request may close these issues: Provide a way to sleep until rate limit resets