Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Auto-paginated through responses spread over multiple pages #181

Open
dhirschfeld opened this issue Aug 29, 2022 · 11 comments
Open

Auto-paginated through responses spread over multiple pages #181

dhirschfeld opened this issue Aug 29, 2022 · 11 comments
Projects

Comments

@dhirschfeld
Copy link

I'm trying to get tags associated with container images by hitting the below endpoint:

  route: GET /orgs/{org}/packages/container/{container}/versions

When using this action the returned data appears to be truncated to 30 items!?!

Using the gh cli correctly returns all image ids/tags:

data=$(gh api -X GET --paginate "/orgs/${org}/packages/container/${container}/versions")
echo $data | jq 'map({(.id|tostring):  .metadata.container.tags}) | add'

Am I missing something obvious (e.g. a paginate argument) or is this a bug?

action.yml
name: Get Image Tags
description: 'Returns the tags associated with a container image.'
inputs:
  repo:
    required: true
    type: string
  container:
    required: true
    type: string
  ghcr_token:
    required: true
outputs:
  images:
    description: "A mapping of image ids to image tags."
    value: ${{ steps.set_output.outputs.images }}
runs:
  using: "composite"
  steps:
    - id: parse_repo
      shell: bash
      run: |
        set -euox pipefail
        IFS='/' read owner repo <<< ${{ inputs.repo }}
        echo "::set-output name=org::${owner}"
        echo "::set-output name=container::${repo,,}/${{ inputs.container }}"
    - id: get_images
      # https://github.com/octokit/request-action
      uses: octokit/request-action@v2.x
      env:
        GITHUB_TOKEN: ${{ inputs.ghcr_token }}
      with:
        org: ${{ steps.parse_repo.outputs.org }}
        container: ${{ steps.parse_repo.outputs.container }}
        route: GET /orgs/{org}/packages/container/{container}/versions
    - id: set_output
      shell: bash
      run: |
        set -euox pipefail
        images=$(echo -e ${{ toJSON(steps.get_images.outputs.data) }} | jq -c 'map({(.id|tostring): .metadata.container.tags}) | add')
        echo "::set-output name=images::${images}"
@ghost ghost added this to Inbox in JS Aug 29, 2022
@dhirschfeld
Copy link
Author

Yup, looks like a pagination issue. In the debug logs I see this in the headers:

"link":
    <https://api.github.com/organizations/<ORG>/packages/container/<CONTAINER>/versions?page=2&per_page=30>; rel="next",
    <https://api.github.com/organizations/<ORG>/packages/container/<CONTAINER>/versions?page=2&per_page=30>; rel="last"

@dhirschfeld
Copy link
Author

So, the question is how to return all results, similarly to the --paginate argument in the gh cli?

@dhirschfeld
Copy link
Author

Ok, changing the UR to the below works:

  route: GET /orgs/{org}/packages/container/{container}/versions?per_page=100

...but this will break if you have >100 image versions 🤢

That's just a hack - it doesn't fix the issue, it just pushes the goalposts a bit further out.

If there really is no way to specify to return all results (similar to the CLI) I guess this issue can be turned into a feature request for that functionality.

@gr2m
Copy link
Contributor

gr2m commented Aug 30, 2022

Sorry pagination is not supported by this action. It's a supper thin wrapper to send requests to GitHub's REST API. I'd suggest to use an alternative action in this case.

@gr2m gr2m closed this as not planned Won't fix, can't repro, duplicate, stale Aug 30, 2022
JS automation moved this from Inbox to Done Aug 30, 2022
@dhirschfeld
Copy link
Author

Silent truncation of data is a pretty user-hostile default which I think deserves a big warning in the README.

Warning
This action only returns the first page of data, which by default is only the first 30 results.
If you need more than 30 results you can append the ?per_page=100 query to your url.
If you need more than 100 results you will need to use another tool to request all of the data.

@gr2m
Copy link
Contributor

gr2m commented Aug 31, 2022

The action is a tool to utilize GitHub's REST API, the REST API documentation should be clear enough about the pagination limitation.

I do agree however that the description at https://docs.github.com/en/rest/packages#get-all-package-versions-for-a-package-owned-by-an-organization is misleading.

Returns all package versions for a package owned by an organization

Should be reworded to

Lists package versions for a package owned by an organization

That would be coherent with other List ... endpoints. It would be great if you could file an issue with the docs team

@timrogers
Copy link
Contributor

The action is a tool to utilize GitHub's REST API, the REST API documentation should be clear enough about the pagination limitation.

I do agree however that the description at https://docs.github.com/en/rest/packages#get-all-package-versions-for-a-package-owned-by-an-organization is misleading.

Returns all package versions for a package owned by an organization

Should be reworded to

Lists package versions for a package owned by an organization

That would be coherent with other List ... endpoints. It would be great if you could file an issue with the docs team

I've created a PR internally at GitHub to fix this. Good spot!

@timrogers
Copy link
Contributor

timrogers commented Sep 1, 2022

@gr2m Do you think it would be reasonable to add support for auto-pagination in this action? We do have the pagination plugin for Octokit, so it doesn't seem like a huge over-reach. We would have to think about whether we should paginate by default.

@dhirschfeld
Copy link
Author

Do you think it would be reasonable to add a support for auto-pagination in this action?

I very much think it will be worthwhile. Even though my use-case isn't mission-critical, I still plan to move to a solution which lets me get all of the data - hopefully I won't need to!

It's all well and good targeting a *"thin wrapper", however the REST API implements a mechanism to get all of your data - just follow the links recursively. gh api is also a thin wrapper and it has implemented the pagination in a way that makes sense for that tool. Without providing a mechanism to implement pagination in this tool it's only implementing subset of the functionality.

For me, a tool designed to get data which by default will silently truncate your data and provides no mechanism to actually get all of your data doesn't make much sense.

@timrogers
Copy link
Contributor

I think this would be a reasonable thing to add, but I don't think I'm going to have time to add it myself. I'll reopen this issue so the request is tracked.

@timrogers timrogers changed the title Data is truncated to only 30 items?! Auto-paginated through responses spread over multiple pages Sep 27, 2022
@timrogers timrogers reopened this Sep 27, 2022
@gr2m
Copy link
Contributor

gr2m commented Sep 27, 2022

@gr2m Do you think it would be reasonable to add support for auto-pagination in this action

I would prefer we do not add auto-pagination.

There are tons of other features we could implement that is part of Octokit but not of this action. Before going down this path I'd suggest we update the README and link to workarounds if pagination is needed.

  1. You could create a dedicated action that does the pagination and link to that action
  2. We could link to https://github.com/actions/github-script

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
JS
  
Done
Development

No branches or pull requests

3 participants