New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: add method to inject rate limiting logic #111
base: main
Are you sure you want to change the base?
Conversation
@@ -32,12 +34,15 @@ def __init__( | |||
oauth_token: Opt[str] = None, | |||
cache: Opt[CACHE_TYPE] = None, | |||
base_url: str = sansio.DOMAIN, | |||
rate_limiter=None, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Better to pass in to the constructor or do via subclassing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it is better to inject it by passing to the constructor rather than by sub-classing so you do not end up with the outer product of (rate_limiters x I/O implementations) sub-classes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But how many subclasses even exist for this class in the wild outside of this module? I'm not sure if it's such a concern to have to worry about it. I'll have to think about it.
) -> None: | ||
self.requester = requester | ||
self.oauth_token = oauth_token | ||
self._cache = cache | ||
self.rate_limit: Opt[sansio.RateLimit] = None | ||
self.base_url = base_url | ||
self.requests_in_flight: int = 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't the rate limiter take care of this? If this was implementing via a subclass then this tracking could be wrapped around _request()
by the rate limiter and the ABC doesn't have to participate at all.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that makes sense. will make that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On a bit more consideration, I think this has to be taken care of by the base class because we want to decrement the inflight count after the request completes.
I think a flaw with the current rate-limit count is that it is the value is reset on every request from github and manually decremented in _request
. If you have N in flight, your available limit in total-N
, but as soon as the first request comes back the rate_limit is reset to total -1
. I think it is also possible for it to look like your rate limit is going up if the requests get back to you in a different order than they were processed by github. Keeping an internal count of the number of un-fulfilled requests "out" and the latest news from github about how many they think we have left is required to have a robust estimate of how many requests we actually have left.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point, I just hate complicating the base class and expanding its API more.
Is requests_in_flight
the best name? What about active_requests
?
And do we need to have a lock in case someone uses this in threads? I'm going to say no since this entire thing is not designed for threading, but the thought did occur to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It may also be possible to still only the current rate limit as public, but be a bit more careful about managing it. It would also have to be updated when GH comes back and says the cache is still up-to-date.
Due to the possibility of requests getting re-ordered on the way back from github, I think that to do this reliably we need to make sure that the available count does not go up unless the refresh time also updates and keep an (private?) accounting of the requests that are active.
active_requests
is a better name.
@tacaswell are you still working on this? I think rate limit is still a problem users face. |
I have not had time to come back to this (obviously). I have been working around this by putting thing like I can not promise that I will have time come back to this, but it can get pushed up the queue. I would not be quite happy if someone else took over this PR (or some other implementation). |
This is a very quick implementation of what I proposed in #19 .
Still needs tests, typing, and documentation (and better names!), but I think is a reasonable proof of concept.
Closes #19