Add table github_repository_content #317

ParthaI · 2023-08-18T12:41:49Z

Example query results

Query: steampipe query "select repository_full_name, type, name, path from github_repository_content where repository_full_name = 'turbot/steampipe-plugin-aws'"

Result:

output.json

graza-io · 2023-08-23T10:38:00Z

github/table_github_repository_content.go

+							SubTree struct {
+								Entries []struct {
+									Name        githubv4.String
+									Path        githubv4.String
+									Size        githubv4.Int
+									LineCount   githubv4.Int
+									Mode        githubv4.Int
+									PathRaw     githubv4.String
+									IsGenerated githubv4.Boolean
+									Type        githubv4.String
+									Object      struct {
+										Blob struct {
+											Oid            githubv4.String
+											AbbreviatedOid githubv4.String
+											Text           githubv4.String
+											IsBinary       githubv4.Boolean
+											CommitUrl      githubv4.String
+										} `graphql:"... on Blob"`
+									}
+								}
+							} `graphql:"... on Tree"`


This limits us to one level of directories from the path entered (or the repo root). Do we want only one level of directories, or do we need to figure out how to parse deeper? @cbruno10, thoughts?

ParthaI · 2024-04-03

@cbruno10 / @graza-io, I've looked more closely at parsing GitHub file contents at deeper levels. Please take a moment to review my findings.

I attempted to fetch all file contents from a repository by recursively executing a GraphQL query. However, I consistently encountered a 502 Bad Gateway error (Error: non-200 OK status code: 502 Bad Gateway).

The analysis was carried out on the turbot/steampipe-plugin-aws repository, which contains a significant number of files.

Configuring a rate limiter at the plugin level did not help.

I then refined the GraphQL query to retrieve file content down to the 3rd, 4th, and 5th levels, but in every case I hit the same error:

Error: non-200 OK status code: 502 Bad Gateway body: "{\n \"data\": null,\n \"errors\":[\n {\n \"message\":\"Something went wrong while executing your query. This may be the result of a timeout, or it could be a GitHub bug. Please include `04E3:1AE1DB:1814A98:18ECE96:660C236B` when reporting this issue.\"\n }\n ]\n}\n" (SQLSTATE HV000)

The query may also fail because of insufficient storage when a repository holds very large file contents.
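For anyone wondering why the deeper queries are so heavy, here is a rough Go sketch of how a nested selection grows with depth. It is not the plugin's code: nestedTreeSelection and maxEntries are hypothetical helpers, the selection only borrows the entries / object / Blob / Tree shape from the struct under review, and the fan-out numbers are invented for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// nestedTreeSelection (hypothetical helper) builds the selection set for a
// Git object that is expanded `depth` directory levels deep. At depth 0 only
// the file text is requested.
func nestedTreeSelection(depth int) string {
	if depth <= 0 {
		return "... on Blob { text }"
	}
	inner := nestedTreeSelection(depth - 1)
	return "... on Blob { text } ... on Tree { entries { name path type object { " + inner + " } } }"
}

// maxEntries (hypothetical helper) gives an upper bound on the number of tree
// entries the server may have to resolve, assuming every directory holds
// `fanout` entries and every entry is itself a directory.
func maxEntries(fanout, depth int) int {
	total, level := 0, 1
	for i := 0; i < depth; i++ {
		level *= fanout
		total += level
	}
	return total
}

func main() {
	for depth := 1; depth <= 5; depth++ {
		q := nestedTreeSelection(depth)
		fmt.Printf("depth %d: %d nested entries selections, up to %d entries at fan-out 10\n",
			depth, strings.Count(q, "entries {"), maxEntries(10, depth))
	}
}
```

The query text itself grows only linearly, but the amount of data one request asks the server to resolve can grow multiplicatively with each extra level, and every blob also carries its full text. That seems consistent with the timeouts above, though I have not confirmed it is the actual cause.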

Based on my observations, fetching the contents of every file in a repository down to the nᵗʰ level is error-prone. The table does, however, let users specify the path whose contents they want: by including a repository_content_path value in the where clause, a query can retrieve file contents from a specific directory within a repository.
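As a small illustration of path scoping, the sketch below turns an optional content path into a Git revision expression of the form rev:path, which is the kind of value the GraphQL object(expression: ...) field on a repository accepts. treeExpression is a hypothetical helper, and whether the table builds its expression exactly this way is an assumption on my part.

```go
package main

import (
	"fmt"
	"strings"
)

// treeExpression (hypothetical helper) builds a "<rev>:<path>" expression.
// An empty path addresses the root tree of the revision.
func treeExpression(rev, contentPath string) string {
	// Drop surrounding whitespace and slashes so "/docs/" and "docs" are equivalent.
	p := strings.Trim(strings.TrimSpace(contentPath), "/")
	return rev + ":" + p
}

func main() {
	fmt.Println(treeExpression("HEAD", ""))              // root of the default branch
	fmt.Println(treeExpression("HEAD", "/docs/tables/")) // a specific directory
}
```

Scoping the request to one directory keeps each query small, at the cost of the caller having to know which path to ask for.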

I greatly value your feedback and suggestions.

Thank you!

cbruno10 · 2024-04-04

@ParthaI Is this behaviour the same as when we used the v3 APIs? Are we able to get nested files with that API and the previous table implementation proposed in #207?

ParthaI · 2024-04-04

@cbruno10, the changes in the previous PR #207 were limited to retrieving file contents at the root level only. In the latest PR, we have extended the approach to parse file contents one level below the root.

aminvielledebatAtBedrock · 2024-05-14

Hi @ParthaI! I don't understand why you say #207 handled the root level only.
I was able to retrieve file content from subdirectories with this PR.

What am I missing?

ParthaI · 2024-05-14

Hello @aminvielledebatAtBedrock,

I apologize for not updating you earlier regarding your PR here.

The original PR used the GitHub REST API to populate column values, which required recursive API calls.

Based on our previous experiences, the REST API tends to be more error-prone, primarily due to rate limit errors.

In the current PR, we have shifted from using the REST API to a GraphQL query to enhance efficiency and reliability.

However, please be aware that GraphQL does not support fetching file content to an arbitrary depth within a repository. Extending the nodes in a GraphQL query to retrieve file content beyond one level may result in throttling errors if the repository contains a large amount of file content. Query Reference.

We've addressed this by constructing a GraphQL query that retrieves file content up to one level deep and uses a recursive approach for deeper levels as needed.
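To make the idea concrete, here is a minimal Go sketch of a one-level-at-a-time traversal. It is not the code in this PR: entry, levelFetcher, walk, and fakeFetcher are hypothetical names, and the in-memory fetcher stands in for a single-level GraphQL request so the example runs without network access. The "blob" and "tree" values follow the type field of a Git tree entry.

```go
package main

import "fmt"

// entry mirrors a few fields of one tree entry (hypothetical type).
type entry struct {
	Path string
	Type string // "blob" for files, "tree" for directories
}

// levelFetcher stands in for one single-level request: given a directory
// path, it returns only that directory's direct entries.
type levelFetcher func(path string) ([]entry, error)

// walk lists `path` one level at a time and issues a further request for
// each sub-directory it finds, collecting the paths of all files.
func walk(fetch levelFetcher, path string) ([]string, error) {
	entries, err := fetch(path)
	if err != nil {
		return nil, fmt.Errorf("listing %q: %w", path, err)
	}
	var files []string
	for _, e := range entries {
		switch e.Type {
		case "blob":
			files = append(files, e.Path)
		case "tree":
			sub, err := walk(fetch, e.Path)
			if err != nil {
				return nil, err
			}
			files = append(files, sub...)
		}
	}
	return files, nil
}

// fakeFetcher serves directory listings from an in-memory map.
func fakeFetcher(tree map[string][]entry) levelFetcher {
	return func(path string) ([]entry, error) {
		entries, ok := tree[path]
		if !ok {
			return nil, fmt.Errorf("no such directory")
		}
		return entries, nil
	}
}

func main() {
	tree := map[string][]entry{
		"":            {{Path: "README.md", Type: "blob"}, {Path: "docs", Type: "tree"}},
		"docs":        {{Path: "docs/index.md", Type: "blob"}, {Path: "docs/tables", Type: "tree"}},
		"docs/tables": {{Path: "docs/tables/a.md", Type: "blob"}},
	}
	files, err := walk(fakeFetcher(tree), "")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, f := range files {
		fmt.Println(f)
	}
}
```

Each request stays the size of a single directory listing, so no individual query has to resolve the whole repository; the trade-off is one request per directory.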

Additionally, we've handled potential throttling errors that may occur when a repository contains a substantial amount of content by appropriately structuring our GraphQL queries.

We have updated this current PR to ensure all file content under a repository is accessible as intended.

Thanks!

github-actions · 2023-10-22T23:31:44Z

This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

aminvielledebatAtBedrock · 2023-11-06T16:17:32Z

Hi @ParthaI! Do you have any update on this PR?

ParthaI · 2023-11-07T11:00:56Z

@aminvielledebatAtBedrock, I appreciate your interest in the PR. Currently, this PR is under review. I'll provide you with an update once I have more information. Your patience is greatly appreciated!

gforien · 2023-12-18T11:18:08Z

Hi @ParthaI, we are also interested in this PR.
Do you have any updates on this, please?

github-actions · 2024-02-16T23:31:44Z

This PR is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 30 days.

aminvielledebatAtBedrock · 2024-02-27T09:18:27Z

Hi @graza-io, could you remove the stale label please?

We still need this new table :-)

ParthaI · 2024-04-02T07:59:20Z

@aminvielledebatAtBedrock, just an update: the PR can currently get the repo content up to one level of directories from the path entered (or the repo root). We are figuring out a way to get all the details down to the nth directory level.

Thank you for your patience!

misraved

@ParthaI please take a look at the minor review comments. Thanks!!

aminvielledebatAtBedrock and others added 3 commits August 15, 2023 15:30

Add github_repository_content table (#207)

20de5e4

Updated the steampipe plugin version to v5

37f89ad

Replaced the Rest API with GraphQL query

dcc5d02

ParthaI requested review from cbruno10 and graza-io August 18, 2023 12:41

ParthaI self-assigned this Aug 18, 2023

Added the rate limit configuration to query

d50f333

graza-io reviewed Aug 23, 2023

graza-io reviewed Aug 23, 2023

github-actions bot added the stale label Oct 22, 2023

github-actions bot removed the stale label Nov 6, 2023

ParthaI and others added 3 commits November 7, 2023 16:10

Removed commented code

ce4bb99

Removed the sha collumn because the oid column will have same value

f14cc16

Merge branch 'add-github-repo-content-table' of github.com:turbot/steampipe-plugin-github into add-github-repo-content-table

b20580c

Removed the unused functions

3232303

github-actions bot added the stale label Feb 16, 2024

ParthaI removed the stale label Feb 27, 2024

ParthaI added 5 commits April 3, 2024 11:35

Merge branch 'main' of github.com:turbot/steampipe-plugin-github into add-github-repo-content-table

bc4ea86

Updated the doc to match the latest doc format

f5c6cb7

Merge branch 'main' of github.com:turbot/steampipe-plugin-github into add-github-repo-content-table

d8c6929

Updated the table to get get the file content upto nth level

b5984b9

Updated the doc

69bc4c1

ParthaI and others added 2 commits April 12, 2024 12:58

Update

087c586

Update github_repository_content.md

da1b912

misraved reviewed May 17, 2024

Cleand up the log statement.

b27cebf

misraved approved these changes May 17, 2024

misraved merged commit 20986e7 into main May 17, 2024
1 check passed

misraved deleted the add-github-repo-content-table branch May 17, 2024 06:43
