Merge JSON responses from gh api #8620

Merged
11 commits merged on Apr 17, 2024
105 changes: 89 additions & 16 deletions pkg/cmd/api/api.go
@@ -29,6 +29,10 @@ import (
"github.com/spf13/cobra"
)

const (
ttyIndent = " "
)

type ApiOptions struct {
AppVersion string
BaseRepo func() (ghrepo.Interface, error)
@@ -48,6 +52,7 @@ type ApiOptions struct {
Previews []string
ShowResponseHeaders bool
Paginate bool
Slurp bool
Silent bool
Template string
CacheTTL time.Duration
@@ -114,7 +119,9 @@ func NewCmdApi(f *cmdutil.Factory, runF func(*ApiOptions) error) *cobra.Command
In %[1]s--paginate%[1]s mode, all pages of results will sequentially be requested until
there are no more pages of results. For GraphQL requests, this requires that the
original query accepts an %[1]s$endCursor: String%[1]s variable and that it fetches the
%[1]spageInfo{ hasNextPage, endCursor }%[1]s set of fields from a collection.
%[1]spageInfo{ hasNextPage, endCursor }%[1]s set of fields from a collection. Each page is a separate
JSON array or object. Pass %[1]s--slurp%[1]s to wrap all pages of JSON arrays or objects
into an outer JSON array.
`, "`"),
Example: heredoc.Doc(`
# list releases in the current repository
@@ -167,6 +174,23 @@ func NewCmdApi(f *cmdutil.Factory, runF func(*ApiOptions) error) *cobra.Command
}
}
'

# get the percentage of forks for the current user
# without --slurp you will get a different percentage for each page
$ gh api graphql --paginate --slurp -f query='
query($endCursor: String) {
viewer {
repositories(first: 100, after: $endCursor) {
nodes { isFork }
pageInfo {
hasNextPage
endCursor
}
}
}
}
' | jq 'def count(e): reduce e as $_ (0;.+1);
[.[].data.viewer.repositories.nodes[]] as $r | count(select($r[].isFork))/count($r[])'
`),
Annotations: map[string]string{
"help:environment": heredoc.Doc(`
Expand Down Expand Up @@ -209,6 +233,19 @@ func NewCmdApi(f *cmdutil.Factory, runF func(*ApiOptions) error) *cobra.Command
return err
}

if opts.Slurp && !opts.Paginate {
return cmdutil.FlagErrorf("`--paginate` required when passing `--slurp`")
}

if err := cmdutil.MutuallyExclusive(
"the `--slurp` option is not supported with `--jq` or `--template`",
opts.Slurp,
opts.FilterOutput != "",
opts.Template != "",
); err != nil {
return err
}

if err := cmdutil.MutuallyExclusive(
"only one of `--template`, `--jq`, `--silent`, or `--verbose` may be used",
opts.Verbose,
@@ -233,6 +270,7 @@ func NewCmdApi(f *cmdutil.Factory, runF func(*ApiOptions) error) *cobra.Command
cmd.Flags().StringArrayVarP(&opts.RequestHeaders, "header", "H", nil, "Add a HTTP request header in `key:value` format")
cmd.Flags().StringSliceVarP(&opts.Previews, "preview", "p", nil, "GitHub API preview `names` to request (without the \"-preview\" suffix)")
cmd.Flags().BoolVarP(&opts.ShowResponseHeaders, "include", "i", false, "Include HTTP response status line and headers in the output")
cmd.Flags().BoolVar(&opts.Slurp, "slurp", false, "Use with \"--paginate\" to return an array of all pages of either JSON arrays or objects")

Member:

Sorry @heaths, one of the issues with long-running PRs like this is remembering all the context. When I played around with this just now, I anticipated that for REST endpoints --paginate --slurp would be the same as --paginate, since the content of REST responses already gets slurped into an array. However, what happens is that we end up doubly nesting the array:

[
  [
    {
      "url": "https://api.github.com/repos/cli/cli/releases/149628351",
...
    }
  ]
]

Maybe my assumption is bad and we always want to apply the same semantics to --slurp even if it produces a strange result - we already know that the whole pagination is kind of quirky. What do you think?

Contributor Author:

That was how it originally worked, but then someone noted it didn't "solve" their problem of merging nested arrays - only top-level arrays. I still think that's up to business logic - there's no way for us to know which level of arrays to merge - but in the discussion above I agreed to just wrap everything in an array like jq --slurp so that we at least return valid JSON. That's all this PR has ever been about: returning valid JSON such that you can pipe it to something that requires it - not jq, but something like PowerShell's ConvertFrom-Json.
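
To make that concrete, here is a minimal Go sketch (the two-page payload below is invented) showing that the slurped output is one valid JSON document that a strict decoder can parse in a single call:

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Hypothetical --paginate --slurp output for two GraphQL pages. Without
	// --slurp the pages would be emitted as concatenated objects ({...}{...}),
	// which is not a single valid JSON document.
	slurped := `[{"data":{"page":1}},{"data":{"page":2}}]`

	var pages []json.RawMessage
	if err := json.Unmarshal([]byte(slurped), &pages); err != nil {
		panic(err)
	}
	fmt.Println(len(pages)) // 2
}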

Member:

I remember that discussion as it related to the GraphQL requests, and I agree that we cannot know that; the consumer needs to determine it themselves.

To be clear, here I'm describing the existing behavior where JSON array REST responses already get slurped by --paginate with no further work (as far as I can tell). In the screenshot below you can see the difference between ./bin/gh api --paginate repos/cli/go-gh/release and ./bin/gh api --paginate --slurp repos/cli/go-gh/release. The --slurp version results in a double nesting of arrays, with the top-level array having only one item.

It's just a bit of a weird situation that the REST pagination ends up creating a single merged array that the slurp then wraps in another array. I'm not sure there is a good solution to this, though, except perhaps only allowing --slurp on GraphQL paginated requests? In any case someone is liable to get confused, so maybe we just ship this and move on with our lives.

[screenshot: output of ./bin/gh api --paginate repos/cli/go-gh/release compared with ./bin/gh api --paginate --slurp repos/cli/go-gh/release, showing the slurped output double-nested in an extra array]

Contributor Author:

Yes, that was the compromise we discussed in our meeting with @andyfeller as well: we'd mimic jq --slurp exactly, which also double-nests arrays. This way the result is consistent: it's always an array, and its items are each page of the response regardless of whether a page is an array or an object. There's no guesswork as to how best to handle the output: it's always an array of pages. If you don't need that for a REST response of arrays, don't pass --slurp: you'll already get a merged array because of Nate's previous change based on my old PR.

I'd rather have consistency than conditional output formats myself. It's easier to code against without having to think about whether I'm making a REST call or a GraphQL call.
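
For reference, a small illustrative sketch in Go (page contents are invented) of the shapes this produces: REST pages, which are themselves arrays, come out as an array of arrays, while GraphQL pages, which are objects, come out as an array of objects; either way the top level is one JSON array whose elements are the pages.

package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// Two REST pages (each page is already a JSON array), slurped.
	restSlurped := [][]map[string]string{
		{{"tag_name": "v1.0.0"}},
		{{"tag_name": "v1.1.0"}},
	}
	// Two GraphQL pages (each page is a JSON object), slurped.
	gqlSlurped := []map[string]int{
		{"page": 1},
		{"page": 2},
	}

	for _, v := range []any{restSlurped, gqlSlurped} {
		out, _ := json.Marshal(v)
		fmt.Println(string(out))
	}
	// Output:
	// [[{"tag_name":"v1.0.0"}],[{"tag_name":"v1.1.0"}]]
	// [{"page":1},{"page":2}]
}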

cmd.Flags().BoolVar(&opts.Paginate, "paginate", false, "Make additional HTTP requests to fetch all pages of results")
cmd.Flags().StringVar(&opts.RequestInputFile, "input", "", "The `file` to use as body for the HTTP request (use \"-\" to read from standard input)")
cmd.Flags().BoolVar(&opts.Silent, "silent", false, "Do not print the response body")
@@ -265,10 +303,38 @@ func apiRun(opts *ApiOptions) error {
method = "POST"
}

var bodyWriter io.Writer = opts.IO.Out
var headersWriter io.Writer = opts.IO.Out
if opts.Silent {
bodyWriter = io.Discard
}
if opts.Verbose {
// httpClient handles output when verbose flag is specified.
bodyWriter = io.Discard
headersWriter = io.Discard
}

if opts.Paginate && !isGraphQL {
requestPath = addPerPage(requestPath, 100, params)
}

// Execute defers in FIFO order.
deferQueue := queue{}
defer deferQueue.Close()

// Similar to `jq --slurp`, write each page's JSON array or object into a single JSON array.
if opts.Paginate && opts.Slurp {
w := &jsonArrayWriter{
Writer: bodyWriter,
color: opts.IO.ColorEnabled(),
}
deferQueue.Enqueue(func() {
_ = w.Close()
})

bodyWriter = w
}

if opts.RequestInputFile != "" {
file, size, err := openUserFile(opts.RequestInputFile, opts.IO.In)
if err != nil {
@@ -316,23 +382,12 @@ func apiRun(opts *ApiOptions) error {

if !opts.Silent {
if err := opts.IO.StartPager(); err == nil {
defer opts.IO.StopPager()
deferQueue.Enqueue(opts.IO.StopPager)
} else {
fmt.Fprintf(opts.IO.ErrOut, "failed to start pager: %v\n", err)
}
}

var bodyWriter io.Writer = opts.IO.Out
var headersWriter io.Writer = opts.IO.Out
if opts.Silent {
bodyWriter = io.Discard
}
if opts.Verbose {
// httpClient handles output when verbose flag is specified.
bodyWriter = io.Discard
headersWriter = io.Discard
}

host, _ := cfg.Authentication().DefaultHost()

if opts.Hostname != "" {
@@ -358,6 +413,12 @@ func apiRun(opts *ApiOptions) error {
requestBody = nil // prevent repeating GET parameters
}

// Tell optional jsonArrayWriter to start a new page.
err = startPage(bodyWriter)
if err != nil {
return err
}

endCursor, err := processResponse(resp, opts, bodyWriter, headersWriter, tmpl, isFirstPage, !hasNextPage)
if err != nil {
return err
@@ -417,7 +478,7 @@ func processResponse(resp *http.Response, opts *ApiOptions, bodyWriter, headersW
// TODO: reuse parsed query across pagination invocations
indent := ""
if opts.IO.IsStdoutTTY() {
indent = " "
indent = ttyIndent
}
err = jq.EvaluateFormatted(responseBody, bodyWriter, opts.FilterOutput, indent, opts.IO.ColorEnabled())
if err != nil {
@@ -429,9 +490,9 @@ func processResponse(resp *http.Response, opts *ApiOptions, bodyWriter, headersW
return
}
} else if isJSON && opts.IO.ColorEnabled() {
err = jsoncolor.Write(bodyWriter, responseBody, " ")
err = jsoncolor.Write(bodyWriter, responseBody, ttyIndent)
} else {
if isJSON && opts.Paginate && !isGraphQLPaginate && !opts.ShowResponseHeaders {
if isJSON && opts.Paginate && !opts.Slurp && !isGraphQLPaginate && !opts.ShowResponseHeaders {
responseBody = &paginatedArrayReader{
Reader: responseBody,
isFirstPage: isFirstPage,
@@ -626,3 +687,15 @@ func previewNamesToMIMETypes(names []string) string {
}
return strings.Join(types, ", ")
}

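// queue collects cleanup functions so apiRun can run them in FIFO order
// (unlike stacked defers, which run LIFO) once it returns.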
type queue []func()

func (q *queue) Enqueue(f func()) {
*q = append(*q, f)
}

func (q *queue) Close() {
for _, f := range *q {
f()
}
}
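
Note: the jsonArrayWriter type and the startPage helper used in apiRun above are defined outside this hunk. Below is a minimal sketch, assuming such a writer only needs to emit the opening bracket, a comma between pages, and the closing bracket (color handling omitted); it illustrates the idea and is not the PR's actual implementation.

package main

import (
	"fmt"
	"io"
	"os"
)

// jsonPageWriter is a hypothetical stand-in for the PR's jsonArrayWriter:
// it wraps every page written through it in one outer JSON array.
type jsonPageWriter struct {
	io.Writer
	started bool
}

// StartPage emits "[" before the first page and "," before each later page.
func (w *jsonPageWriter) StartPage() error {
	sep := ","
	if !w.started {
		w.started = true
		sep = "["
	}
	_, err := io.WriteString(w.Writer, sep)
	return err
}

// Close terminates the outer array, emitting "[]" if no page was written.
func (w *jsonPageWriter) Close() error {
	if !w.started {
		_, err := io.WriteString(w.Writer, "[]")
		return err
	}
	_, err := io.WriteString(w.Writer, "]")
	return err
}

// startPage only acts when the body writer is page-aware, so the
// non-slurp path is unaffected.
func startPage(w io.Writer) error {
	if pw, ok := w.(interface{ StartPage() error }); ok {
		return pw.StartPage()
	}
	return nil
}

func main() {
	w := &jsonPageWriter{Writer: os.Stdout}
	for _, page := range []string{`{"page":1}`, `{"page":2}`} {
		_ = startPage(w)
		_, _ = io.WriteString(w, page)
	}
	_ = w.Close()
	fmt.Println() // full output: [{"page":1},{"page":2}]
}

In apiRun above, the writer's Close is enqueued on the defer queue, so the closing bracket is written after the last page.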