Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

API rate limit exceeded #422

Open
wookayin opened this issue Dec 30, 2023 · 7 comments
Open

API rate limit exceeded #422

wookayin opened this issue Dec 30, 2023 · 7 comments
Assignees
Labels
bug Something isn't working integration Related to GitHub/GitLab integration

Comments

@wookayin
Copy link

wookayin commented Dec 30, 2023

First of all, congratulations and thanks for the great work being done on Github integration #363.

Describe the bug

On a large repository where there are a lot of commits, API rate limit is always so I can't generate the release note.

My understanding from #363 is that ALL of the commits in the repository is fetched, so in this way every git cliff attempt will be turned down by an excessive usage of APIs. Even if # of the commits to be contained in a release note is small (e.g. between two minor releases), it's not like the PR information about each commit is fetched but it's like git-cliff is trying to fetch everything.

Also, it's not clear how many requests are (unawarely) being made through git cliff. Can we make it a bit more verbose?

To reproduce

neovim/neovim has ~28k commits as of today.

git clone https://github.com/neovim/neovim && cd neovim

Apply the patch:

diff --git scripts/cliff.toml scripts/cliff.toml
index 3fc10e5d1..bc1f22c56 100644
--- scripts/cliff.toml
+++ scripts/cliff.toml
@@ -1,5 +1,9 @@
 # configuration file for git-cliff (0.1.0)
 
+[remote.github]
+owner = "neovim"
+repo = "neovim"
+
 [changelog]
 # changelog header
 header = """
@@ -18,12 +22,12 @@ body = """
     ### {{ group | upper_first }}
     {% for commit in commits%}\
         {% if not commit.scope %}\
-            - {{ commit.message | upper_first }}
+            - {{ commit.message | upper_first }} (#{{ commit.github.pr_number }})
         {% endif %}\
     {% endfor %}\
     {% for group, commits in commits | group_by(attribute="scope") %}\
         {% for commit in commits %}\
-            - **{{commit.scope}}**: {{ commit.message | upper_first }}
+            - **{{commit.scope}}**: {{ commit.message | upper_first }} (#{{ commit.github.pr_number }})
         {% endfor %}\
     {% endfor %}
 {% endfor %}\n

Then run:

git cliff --config scripts/cliff.toml v0.9.4..v0.9.5  --github-token <TOKEN>

Expected behavior

No errors.

Screenshots / Logs

Bunch of API limit errors (possibly one log is per page request):

▹▹▸▹▹ Retrieving data from GitHub... (neovim/neovim)

 ERROR git_cliff_core::github    > Request error: {
  "documentation_url": "https://docs.github.com/en/free-pro-team@latest/rest/overview/rate-limits-for-the-rest-api#about-secondary-rate-limits",
  "message": "You have exceeded a secondary rate limit. Please wait a few minutes before you try again. If you reach out to GitHub Support for help, please include the request ID D7CB:1218:1333433:289A206:65902213."
}

Software information

Additional Information

If I may suggest, fetching PR information for only the relevant commits, e.g., via https://api.github.com/repos/neovim/neovim/commits/COMMIT/pulls, would be an alternative, efficient way.

https://docs.github.com/en/rest/using-the-rest-api/rate-limits-for-the-rest-api

@wookayin wookayin added the bug Something isn't working label Dec 30, 2023
Copy link

welcome bot commented Dec 30, 2023

Thanks for opening your first issue at git-cliff! Be sure to follow the issue template! ⛰️

@orhun
Copy link
Owner

orhun commented Dec 30, 2023

Hello! 🐻

First of all, congratulations and thanks for the great work being done on Github integration

Thank you! I'm glad it is being tested already before the release.

Even if # of the commits to be contained in a release note is small (e.g. between two minor releases), it's not like the PR information about each commit is fetched but it's like git-cliff is trying to fetch everything.

Yes, I mentioned the reasoning behind this briefly in #363 and the problem is generating the list of "new contributors". GitHub doesn't provide an API for easily fetching this so we need all the data for figuring out if someone has contributed to the repository in the past.

I don't know how to solve this yet. One thing we can do for large repos is that use date filters for the GitHub API requests so that we only request the relevant part of the history. But then you won't be able to generate the list for first-time contributors - but you can still list the contributors of the release.

Do you think that would be something feasible?

If I may suggest, fetching PR information for only the relevant commits, e.g., via api.github.com/repos/neovim/neovim/commits/COMMIT/pulls, would be an alternative, efficient way.

This means fetching the pull request for each commit which adds up to a lot of requests as well.

@wookayin
Copy link
Author

wookayin commented Dec 30, 2023

Thanks for the reply.

For my use cases, I don't need "first-time contributors". It would be great if there's a knob for this. If this is disabled, we might be able to save a lot of unnecessary API calls.

@orhun
Copy link
Owner

orhun commented Dec 30, 2023

Yes, I agree. I think I send less API requests based on whether "first-time contributors" variable is used in the template and document this behavior.

@orhun
Copy link
Owner

orhun commented Jan 3, 2024

I took a stab at this at #431, however it doesn't have a significant speed up for the neovim repository. Although git-cliff only requests a time frame of the commits/PRs, it still takes a lot of time due to the number of pull requests. Maybe it makes more sense to use the commit/pulls API instead as you mentioned. But that approach might also take some time in cases where there are not a lot of PRs but commits.

Any ideas?

Also, it's not clear how many requests are (unawarely) being made through git cliff. Can we make it a bit more verbose?

You can see the network requests via giving -v flag while running `git-cliff.

@wookayin
Copy link
Author

wookayin commented Jan 3, 2024

Thanks, -v is helpful. #431 reduces the number of API calls, but if I understood correctly it still tries to fetch all the PRs given a time range right? However, for me it still gives the secondary API limit error on 11-12 pages (which were requested in < 1 seconds) Guess this might be a throttling issue?

Fetching pull information for each of the commits can be also limited, because the number of commits in a big release can be thousands and we probably don't want to make thousands of pulls API calls.

@orhun
Copy link
Owner

orhun commented Jan 5, 2024

it still tries to fetch all the PRs given a time range right? However, for me it still gives the secondary API limit error on 11-12 pages (which were requested in < 1 seconds) Guess this might be a throttling issue?

Yes, it needs to fetch all the PRs in the given time range (and there is a lot 😕) There is probably a throttling issue.

Fetching pull information for each of the commits can be also limited, because the number of commits in a big release can be thousands and we probably don't want to make thousands of pulls API calls.

Exactly. :/

@orhun orhun added the integration Related to GitHub/GitLab integration label Jan 5, 2024
@orhun orhun changed the title Github integration: API rate limit exceeded API rate limit exceeded Jan 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working integration Related to GitHub/GitLab integration
Projects
None yet
Development

No branches or pull requests

2 participants