--paginate doesn't work well with /enterprises/{enterprise}/consumed-licenses and ?page= #8419

Closed
jessehouwing opened this issue Dec 5, 2023 · 9 comments
Labels
gh-api relating to the gh api command

Comments

@jessehouwing

Describe the bug

Version:

gh --version
gh version 2.39.2 (2023-11-27)
https://github.com/cli/cli/releases/tag/v2.39.2

When running gh api "https://api.github.com/enterprises/{enterprise}/consumed-licenses" --paginate, the output is not a single valid JSON document. Each page is emitted as its own object, one after another:

{
   ... # page 1
}
{
   ... # page 2
}
...

So the user must loop through the pages themselves and concatenate the content like so:

$knownEnterpriseUsers = @()   # accumulate users across pages
$page = 1
do {
    # Request one page at a time and parse it on its own
    $result = (& gh api "https://api.github.com/enterprises/{enterprise}/consumed-licenses?per_page=100&page=$page" | ConvertFrom-Json)
    if ($result.users)
    {
        $knownEnterpriseUsers += $result.users
        $page = $page + 1
    }
} while ($result.users)
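
For readers who script outside PowerShell, the same page-by-page loop might be sketched in Python as follows. This is only an illustration: `collect_users` and `fetch_page` are hypothetical names, and the stand-in fetcher returns made-up data so the loop can run without calling the API.

```python
def collect_users(fetch_page):
    """Request pages 1, 2, 3, ... until a page has no users; return all users."""
    users = []
    page = 1
    while True:
        result = fetch_page(page)          # one parsed page (a dict)
        batch = result.get("users") or []  # a missing or empty list ends the loop
        if not batch:
            break
        users.extend(batch)
        page += 1
    return users


# Stand-in for the real request (hypothetical data, two non-empty pages).
FAKE_PAGES = {1: {"users": ["a", "b"]}, 2: {"users": ["c"]}}


def fake_fetch(page):
    return FAKE_PAGES.get(page, {"users": []})


print(collect_users(fake_fetch))  # ['a', 'b', 'c']
```

In real use, `fetch_page` would run the same gh api call as the loop above (with per_page=100 and page=N in the query string) and parse that single page's output.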

Steps to reproduce the behavior

  1. Run gh api "https://api.github.com/enterprises/{enterprise}/consumed-licenses" --paginate
  2. Output:
(& gh api "https://api.github.com/enterprises/{enterprise}/consumed-licenses" --paginate | ConvertFrom-Json)
ConvertFrom-Json: Conversion from JSON failed with error: Additional text encountered after finished reading JSON content: {. Path '', line 1, position 72640.
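
The error above is what a strict parser reports when several JSON documents arrive back to back. As a minimal sketch, assuming the full output has already been captured as one string, Python's standard `json.JSONDecoder.raw_decode` can read such a stream one document at a time. The helper name `iter_json_documents` and the sample data are made up for illustration.

```python
import json


def iter_json_documents(text):
    """Yield each top-level JSON value from a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos = 0
    end = len(text)
    while pos < end:
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < end and text[pos].isspace():
            pos += 1
        if pos == end:
            break
        # raw_decode returns the parsed value and the index where parsing stopped.
        value, pos = decoder.raw_decode(text, pos)
        yield value


# Hypothetical two-page output, shaped like the report above.
sample = '{"users": [{"login": "a"}]}\n{"users": [{"login": "b"}]}'
pages = list(iter_json_documents(sample))
print(len(pages))  # 2
```

Because the decoder tracks where each document ends, this works whether or not the pages are separated by whitespace.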

Expected vs actual behavior

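Expected: the pages are concatenated correctly and a single JSON object is returned.

Actual: each page is emitted as a separate JSON object, so the combined output cannot be parsed as one document.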

Related: #6044

@jessehouwing jessehouwing added the bug Something isn't working label Dec 5, 2023
@cliAutomation cliAutomation added the needs-triage needs to be reviewed label Dec 5, 2023
@andyfeller andyfeller added discuss Feature changes that require discussion primarily among the GitHub CLI team and removed discuss Feature changes that require discussion primarily among the GitHub CLI team labels Dec 5, 2023
@jessehouwing
Author

Funny, if you add --jq '.users[]', it actually stitches the results together and gives me proper JSON.

@andyfeller
Contributor

@jessehouwing: I think you're experiencing a common behavior of gh api --paginate: we stream the output page by page rather than aggregating all of it into a single response. #1268 is an issue we're tracking about how we can improve that experience.

In the meantime, you would need to use jq -s to slurp up all of the fragments, depending on your use case:

# should produce a single JSON array of all paginated results concatenated:
gh api graphql -f query='QUERY' --paginate --jq '.data.repository.branchProtectionRules.nodes[]' | jq -s '.'
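
Where jq is not installed, the aggregation step could be done in a script once each page has been parsed. The following Python sketch is an assumption about how one might do that, not something gh provides: `merge_pages` is a hypothetical helper, the `users` key comes from the consumed-licenses response discussed in this issue, and the `login` values are invented sample data.

```python
def merge_pages(pages, key="users"):
    """Concatenate the list stored under `key` across all parsed page objects."""
    merged = []
    for page in pages:
        # Pages that lack the key contribute nothing.
        merged.extend(page.get(key, []))
    return merged


# Hypothetical parsed pages, one dict per page of output.
pages = [
    {"users": [{"login": "a"}]},
    {"users": [{"login": "b"}, {"login": "c"}]},
]
all_users = merge_pages(pages)
print([u["login"] for u in all_users])  # ['a', 'b', 'c']
```

The result is one flat list, which is roughly what the jq -s pipeline above yields for the selected nodes.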

That said, passing page and per_page to the gh api call works as expected:

GH_DEBUG=api gh api /enterprises/{enterprise}/consumed-licenses --paginate -F page=2 -X GET -F per_page=10

@andyfeller andyfeller added gh-api relating to the gh api command and removed bug Something isn't working needs-triage needs to be reviewed discuss Feature changes that require discussion primarily among the GitHub CLI team labels Dec 7, 2023
@jessehouwing
Author

That's quite unexpected. But reading through it, I understand the complexity. Relying on the presence of jq is indeed not obvious, since it's not a common utility on Windows. The fact that there is a --jq argument would immediately lead me to think I should be able to solve it there...

@andyfeller
Contributor

andyfeller commented Dec 8, 2023

> That's quite unexpected. But reading through it, I understand the complexity. Relying on the presence of jq is indeed not obvious, since it's not a common utility on Windows. The fact that there is a --jq argument would immediately lead me to think I should be able to solve it there...

That is a fair assessment.

Is there anything you would suggest we do outside of the larger work in #1268 for the short term to avoid the confusion?

For example, what might you expect to see in the gh api long description about the nature of the command and this workaround to make it more apparent?

@jessehouwing
Author

If it could add a separator somehow it would become much easier to split the resulting text into parts so that other scripting languages can more easily work with the result sets.

But for now I'd expect this behavior to be documented here:

gh help formatting

And it mentions nothing of the sort. Many parts of gh return JSON objects, so the fact that some commands return malformed JSON by design is an interesting choice. I started with a small dataset, so the issue only appeared after the dataset grew. Quite frustrating ;).

Ideally gh would register a PowerShell alias that automatically splits the pages into pipeline objects, so that this behavior is transparent to PowerShell users.

@jessehouwing
Author

jessehouwing commented Dec 8, 2023

Copilot suggests doing something akin to:

$text = '{"key1": "value1"}{"key2": "value2"}'
$splitJson = $text -split '(?<=\})(?={)'

foreach ($json in $splitJson) {
    $parsedJson = $json | ConvertFrom-Json
    # Now you can operate on the $parsedJson object
}

which could probably be

(gh api --paginate /something) -split '(?<=\})\s*(?=\{)' | ConvertFrom-Json  # now there are multiple objects, one per page
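
One caveat with splitting on a regular expression is that the pattern cannot tell a page boundary from a `}{` sequence inside a string value. The Python sketch below (Python 3.7 or later, since it splits on empty matches) applies the same pattern as the one-liner above to two made-up inputs. The `note` field is invented purely to trigger the problem.

```python
import re

# Same boundary pattern as the PowerShell one-liner, with the brace escaped.
BOUNDARY = re.compile(r"(?<=\})\s*(?=\{)")

# Two documents, nothing unusual: the split finds the one real boundary.
simple = '{"key1": "value1"}{"key2": "value2"}'
simple_parts = BOUNDARY.split(simple)
print(len(simple_parts))  # 2

# Two documents, but the first contains "}{" inside a string value.
tricky = '{"note": "a}{b"}{"key2": "value2"}'
tricky_parts = BOUNDARY.split(tricky)
print(len(tricky_parts))  # 3, and the first two pieces are not valid JSON
```

For data that cannot contain such sequences the regex split is fine. Otherwise a JSON-aware parser that tracks string boundaries is the safer choice.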

@andyfeller
Contributor

> return malformed json by design

I think the way to look at it is that gh api --paginate returns a stream of valid JSON documents, one per page of data as it is processed, rather than a single aggregate response. Calling that malformed isn't technically correct.

@jessehouwing
Author

True. Agreed. Yet there are few tools that will be able to parse the output by default. To them it's not something they understand.

@samcoe
Contributor

samcoe commented Dec 11, 2023

Having read through this I am fairly confident that this issue is the same as #1268 which is potentially being addressed in #5652. I am going to close this out as a duplicate, but please let me know if you think that is in error.

@samcoe samcoe closed this as completed Dec 11, 2023