Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP : Support Tabular Data formatting in Buildifier #985

Open
wants to merge 32 commits into
base: master
Choose a base branch
from

Conversation

SKashyap
Copy link

What?

This change adds to the Buildifier the ability to format tabular data using a set of new tags as follows :

Format tabular data in proper visual alignment
#buildifier: format table
Sort tabular data (by column) alongside the allignment
#buildifier: format table sort <column-num>

Why?

The need for the change can be seen with the following example :
Use an input file :

tableEx = [
    ("col1",      "col2",     "col3"),
    ("col1111", "col222", "col3333")    
]

Run it through buildifier to see the following output :

tableEx = [
    ("col1","col2","col3"),
    ("col1111","col222","col3333")    
]

Detailing the problem statement :

  • Tabular Data can be either "List of Tuples"/ "List of Lists". (See ListExpr , TupleExpr)
  • The buildifier's parser stage freezes how an Expr will be printed using flags file ForceCompaction for Tuples , and ForceMultiline for Lists.
  • These fixed notions of handling the print logic make it difficult to operate with tabular data as shown in the above example.
  • Even if the user formats the table himself, the buildfier cannot be indicated "#buildifer : Don't mess with my table". It ruins the user-provided format too. #leaveAlone does not work either.
  • We came across this problem when we were building large RPMs with 200 endstates, each having a location of install/permissions, etc as table data.

How?

Design Description :

  • Add support for 2 new tags to be parsed with rewrite.go :
    #buildifier: format table
    #buildifier: format table sort <column-num>
  • Existing architecture of buildifier allows rewriting the AST within rewrite.go based on parsed buildifier tags.
  • Utilize the same pipeline to "mark" the nodes to form the TableRoot and the TableRows based on the presence of the new tags.
  • Within print.go, flex the logic to allow printing a tuple /List in a tabular form when they are identified a TableRoot/TableRow node.

Printing of Tabular data :

  • We choose an existing golang package called TabWriter to jump-start us into this feature without writing custom logic to format rows/column sizes out the output.
  • TabWriter implements "Elastic Tabstops algorithm "

Other approaches evaluated :

  • We also evaluated an approach of creating a new Expr class called TableListExpr, TableRowExpr - It did not provide any advantage over the existing design given the current problem.
    We can revisit this later if the use cases demand it.

Sorting of Tabular data :

  • Rewrite.go logic where we detect the new tags also sort the row entries based on the index provided in the tag.

Future implementations:

  • This changeset only supports a "List of Tuples" as a table. A similar extension can be made for "List of List" as well.

Testing?

  • 15 testcases added to support the use-cases
  • bazel test ... : PASSED.

Any other info?

There are a couple of known drawbacks due to the TabWriter behavior :

  • Especially in the case of "Before Comments" in tableRows. We will be curating them into a document for reviewers soon.

elenaodimitrova and others added 19 commits June 17, 2021 15:38
…s are set to mark them as contenders for table formatting.

This will later to set these data whenever a correct buidifier tag is parsed.
…etect a buildifier tag.

INFO: Build completed, 1 test FAILED, 60 total actions
//api_proto:api.gen.pb.go_checkshtest                           (cached) PASSED in 0.5s
//build_proto:build.gen.pb.go_checkshtest                       (cached) PASSED in 0.4s
//deps_proto:deps.gen.pb.go_checkshtest                         (cached) PASSED in 0.9s
//extra_actions_base_proto:extra_actions_base.gen.pb.go_checkshtest (cached) PASSED in 0.8s
//labels:go_default_test                                        (cached) PASSED in 0.6s
//lang:tables.gen.go_checkshtest                                (cached) PASSED in 0.7s
//tables:go_default_test                                        (cached) PASSED in 1.1s
//wspace:go_default_test                                        (cached) PASSED in 0.6s
//buildifier:buildifier_integration_test                                 PASSED in 2.5s
//buildifier/npm:integration_test                                        PASSED in 0.5s
//buildifier/utils:go_default_test                                       PASSED in 0.3s
//buildozer:buildozer_test                                               PASSED in 8.2s
//buildozer/npm:integration_test                                         PASSED in 0.6s
//bzlenv:bzlenv_test                                                     PASSED in 0.4s
//bzlenv:go_default_test                                                 PASSED in 0.6s
//edit:go_default_test                                                   PASSED in 0.9s
//unused_deps:go_default_test                                            PASSED in 0.6s
//unused_deps:jar_manifest_test                                          PASSED in 0.5s
//warn:go_default_test                                                   PASSED in 1.1s
//warn/docs:go_default_test                                              PASSED in 0.4s
//build:go_default_test                                                  FAILED in 1.2s
  /private/var/tmp/_bazel_kshwetha/3198714a19bbf3371758516069d33ca0/execroot/com_github_bazelbuild_buildtools/bazel-out/darwin-fastbuild/testlogs/build/go_default_test
```
panic: runtime error: index out of range [0] with length 0 [recovered]
	panic: runtime error: index out of range [0] with length 0

goroutine 16 [running]:
testing.tRunner.func1.1(0x11cd060, 0xc000524fe0)
	GOROOT/src/testing/testing.go:1072 +0x30d
testing.tRunner.func1(0xc000082900)
	GOROOT/src/testing/testing.go:1075 +0x41a
panic(0x11cd060, 0xc000524fe0)
	GOROOT/src/runtime/panic.go:969 +0x1b9
github.com/bazelbuild/buildtools/build.formatTables.func1(0x121b980, 0xc000198a80, 0xc0001a1e00, 0x3, 0x8)
	build/rewrite.go:415 +0xeb

goroutine 13 [running]:
testing.tRunner.func1.1(0x11cd060, 0xc0000add40)
	GOROOT/src/testing/testing.go:1072 +0x30d
testing.tRunner.func1(0xc000582300)
	GOROOT/src/testing/testing.go:1075 +0x41a
panic(0x11cd060, 0xc0000add40)
	GOROOT/src/runtime/panic.go:969 +0x1b9
github.com/bazelbuild/buildtools/build.sortStringLists.func1(0x121b980, 0xc000222620, 0xc000219b00, 0x3, 0x8)
	build/rewrite.go:488 +0x72f
github.com/bazelbuild/buildtools/build.Walk.func1(0xc000219530, 0xc000219b00, 0x3, 0x8, 0x0, 0x0)
	build/walk.go:28 +0x59
github.com/bazelbuild/buildtools/build.walk1(0xc000219530, 0xc00056fa48, 0xc00056fa38, 0x121bc00, 0xc000219480)
```
1. Proper alignment of comments when tabwriter is on.
2. Revision of in and golden files.
@google-cla
Copy link

google-cla bot commented Jun 23, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@google-cla google-cla bot added the cla: no label Jun 23, 2021
@rtabassum
Copy link

@googlebot I fixed it.

@google-cla
Copy link

google-cla bot commented Jun 23, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@SKashyap
Copy link
Author

@googlebot I fixed it.

@google-cla
Copy link

google-cla bot commented Jun 24, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

HariPreethiS and others added 10 commits June 24, 2021 00:53
76 - long tuple entry when sorting
77 - sort tag added on empty table
78 - sort tag added when table in one line
Handle buildifier failure with Panic error when the column number
specified in table sort tag is larger than maximun number of columns
or out of range. Also added unit tests.
When buildifier table sort for a column that has duplicate values,
then buildifier sorts the entries preserving the order of the duplicate keys
from the original table. Also added unitetests.

Unit testcase 81 - duplicate key for the col to be sorted for table sort tag, use case when comment appends multiple times in output
Unit testcase 82 - duplicate key for the col to be sorted for table sort tag - throws error
Design changes made here :
1) Seperate the out the logic required to print tabular data into a new secluded method.
2) Avoid "mid-line" changes of `io.writer` to avoid removal of spaces within statements.

Also make accompanying test changes.
1) Seperate the "Tag detection" logic from the "sorting logic"
2) `formatTables` onlys Walks the AST and marks nodes as `contender for formatting as table`
3) `sortTableRows` takes care of sorting
- Till now only the Table's root node was detected and marked by our logic.
- Rows were printed based on whether "tabWriter" is ON/OFF.
- This makes the logic brittle.
- Add code to detect the children of Table root which act as the rows of the table.
- Marking them gives us a flexible design - that can be extended to [ [ ] ]/( [] ) etc
@google-cla
Copy link

google-cla bot commented Jun 24, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@SKashyap
Copy link
Author

@googlebot I fixed it.

@google-cla
Copy link

google-cla bot commented Jun 24, 2021

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for all the commit author(s) or Co-authors. If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.
In order to pass this check, please resolve this problem and then comment @googlebot I fixed it.. If the bot doesn't comment, it means it doesn't think anything has changed.

ℹ️ Googlers: Go here for more info.

@SKashyap
Copy link
Author

@googlebot I fixed it.

@google-cla google-cla bot added cla: yes and removed cla: no labels Jun 24, 2021
@Gormo
Copy link

Gormo commented Feb 2, 2022

@SKashyap looks like a really nice improvement! What's the status of this?

@SKashyap
Copy link
Author

SKashyap commented Feb 2, 2022 via email

Added test cases for the same.
Does not rewrite the expression, but functions correctly.
@systemlogic
Copy link

@SKashyap have you given up on PR ?

@SKashyap
Copy link
Author

SKashyap commented Jul 4, 2023

@SKashyap have you given up on PR ?

I had stopped working on this for a while. But have made some recent changes to this in our internal repository. Will collate those and wrap this asap.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants