Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(github): use REST for etag caching of issues #26793

Merged
merged 16 commits into from Feb 28, 2024

Conversation

rarkins
Copy link
Collaborator

@rarkins rarkins commented Jan 22, 2024

Changes

Refactor getIssues() to use github REST API and benefit from etag caching

Context

Documentation (please check one with an [x])

  • I have updated the documentation, or
  • No documentation update is required

How I've tested my work (please select one)

I have verified these changes via:

  • Code inspection only, or
  • Newly added/modified unit tests, or
  • No unit tests but ran on a real repository, or
  • Both unit tests + ran on a real repository

@rarkins
Copy link
Collaborator Author

rarkins commented Jan 22, 2024

Merge #26793 first

@rarkins rarkins changed the title feat(cache): etag caching for github GET feat(github): etag cache issue list Jan 22, 2024
@rarkins
Copy link
Collaborator Author

rarkins commented Jan 22, 2024

I need to think what the effect is if there's multiple pages. Maybe should cache the entire list. Also time to check page size and count again

@rarkins rarkins marked this pull request as draft January 22, 2024 07:48
@rarkins rarkins changed the title feat(github): etag cache issue list feat(github): use REST for etag caching of issues Jan 22, 2024
lib/modules/platform/github/index.ts Outdated Show resolved Hide resolved
lib/modules/platform/github/index.ts Outdated Show resolved Hide resolved
Co-authored-by: Michael Kriese <michael.kriese@visualon.de>
Base automatically changed from feat/26773-etag-repo-caching to main January 22, 2024 11:56
# Conflicts:
#	lib/util/cache/repository/types.ts
#	lib/util/http/index.ts
@rarkins rarkins marked this pull request as ready for review January 22, 2024 14:24
viceice
viceice previously approved these changes Jan 22, 2024
lib/util/http/github.ts Outdated Show resolved Hide resolved
@zharinov
Copy link
Collaborator

zharinov commented Feb 6, 2024

Seems like etags for issues/PR lists are very unstable: https://twitter.com/CAFxX/status/1490560328261791746
Example:

  • Query issue list on some small repo
  • Go to the Pull Requests and add comment
  • Update issue list and observe different etag

This is why we ended up with dropping them for cached PR list.

@rarkins
Copy link
Collaborator Author

rarkins commented Feb 6, 2024

Now I remember that. Can you use the If-Modified-Since approach instead?

@zharinov
Copy link
Collaborator

zharinov commented Feb 6, 2024

I've got 304 only 2 times out of 20 when querying test repo with If-Modified-Since with value from the Date header of the previous one. So seems like it doesn't work too.

@rarkins
Copy link
Collaborator Author

rarkins commented Feb 6, 2024

So let's go with etag for now but add an issue to investigate modified date? I thought I for a reasonable amount of success with 304s

@zharinov
Copy link
Collaborator

zharinov commented Feb 6, 2024

If-Modified-Since gives 18/20 false cache resets for repository with zero activity. Basically, seems like if we rely on how ETag and If-Modified-Since work for GitHub, we could just disable any caching.

@rarkins
Copy link
Collaborator Author

rarkins commented Feb 6, 2024

Maybe it's particularly bad for the lists of issues and PRs?

@zharinov
Copy link
Collaborator

zharinov commented Feb 6, 2024

I think there could be another approach: thanks to repoInit(), we're doing at least one GraphQL request anyway, right? We could pre-fetch some small, but reasonable amount of issues and PRs with this init query, update our caches if needed, and then decide whether to perform further sync or we're good.

@rarkins
Copy link
Collaborator Author

rarkins commented Feb 7, 2024

Is there any downside to switching from graohql to rest here though? Even if the tag frequently doesn't work, maybe 20-30% it does

@rarkins
Copy link
Collaborator Author

rarkins commented Feb 7, 2024

@zharinov It's still TBD whether to switch initRepo to REST too :) can we do the same issue/PR filtering in that unit repo query?

@zharinov
Copy link
Collaborator

zharinov commented Feb 7, 2024

Well, I didn't remember we already switched to REST for PRs, so maybe yes and probably we could just fetch all from /repos/:owner/:repo/issues endpoint.

Speaking of cache hit rate, we could count it, though my bet is 5-10%

Copy link
Member

@viceice viceice left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

otherwise LGTM

@rarkins rarkins added this pull request to the merge queue Feb 28, 2024
Merged via the queue into main with commit 924b9da Feb 28, 2024
35 checks passed
@rarkins rarkins deleted the feat/github-issue-list-rest branch February 28, 2024 05:28
@renovate-release
Copy link
Collaborator

🎉 This PR is included in version 37.219.0 🎉

The release is available on:

Your semantic-release bot 📦🚀

@mtardy
Copy link
Contributor

mtardy commented Feb 28, 2024

Hey, this morning, 37.219.0 ran on my repository and I think this is the reason why Renovate didn't find its dashboard issue and recreated a new one.

Yesterday evening I got:

DEBUG: findIssue(Dependency Dashboard) (repository=cilium/tetragon)
DEBUG: Retrieving issueList (repository=cilium/tetragon)
DEBUG: Retrieved 1 issues (repository=cilium/tetragon)
DEBUG: Found issue 1054 (repository=cilium/tetragon)
DEBUG: http cache: saving https://api.github.com/repos/cilium/tetragon/issues/1054 (etag=W/"4f3474e44487e1c736d399d19e372a5f6d8280f1e8e733bbbc2e4a46666820d3", lastModified=Tue, 27 Feb 2024 17:08:41 GMT) (repository=cilium/tetragon)

This morning:

DEBUG: findIssue(Dependency Dashboard) (repository=cilium/tetragon)
DEBUG: Retrieving issueList (repository=cilium/tetragon)
DEBUG: http cache: saving https://api.github.com/repos/cilium/tetragon/issues?creator=cilium-renovate[bot]&state=all (etag=W/"f6c20b2f8fe7c51530b52293ea1fb943ee97c3d570f3310591903b76fb27c79f", lastModified=undefined) (repository=cilium/tetragon)
DEBUG: Retrieved 300 issues (repository=cilium/tetragon)

So it recreated a new issue and now it seems the original one is not closed but abandoned (because it doesn't see it):

I'm not sure what will happen if I rollback and keep both dashboards.

@mtardy
Copy link
Contributor

mtardy commented Feb 28, 2024

sadly the rest API returns pulls as issues 😔 so I don't think this will really help as @zharinov already said.

maybe hide the change behind a feature flag, so the app can test it on some orgs first?

A colleague of mine mentioned that it might be the issue:

graphql -> listing issues only
rest api -> lists issues and PRs

That + a pagination issue might explain why it could not find its own "old" dashboard.

rarkins added a commit that referenced this pull request Feb 28, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 31, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants