Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[FORMATTING] Space between type and options removed in nested BigQuery fields #619

Open
Ewan-Keith opened this issue Jul 10, 2023 · 1 comment
Labels

Comments

@Ewan-Keith
Copy link

When formatting BigQuery DDL statement with OPTIONS on a Struct field, any space between the type definition and the Options definition is removed, resulting in an invalid statement.

Input data

CREATE TABLE `project.dataset.format_test`
(
 str STRING OPTIONS(description="A string field."),
 int INT64  OPTIONS(description="An int64 field."),
 record STRUCT<nested INT64 OPTIONS(description="a struct field")> OPTIONS(description="a struct")
)

Expected Output

CREATE TABLE
  `project.dataset.format_test` (
    str STRING OPTIONS(description = "A string field."),
    int INT64 OPTIONS(description = "An int64 field."),
    record STRUCT<nested INT64 OPTIONS(description ="a struct field")> OPTIONS(description = "a struct")
  )

Actual Output

CREATE TABLE
  `project.dataset.format_test` (
    str STRING OPTIONS(description = "A string field."),
    int INT64 OPTIONS(description = "An int64 field."),
    record STRUCT<nested INT64OPTIONS(description ="a struct field")> OPTIONS(description = "a struct")
  )

Usage

To run this I've stored the input in a file called example-sql.sql and am running:

> sql-formatter --v
12.2.3

> sql-formatter -l bigquery example-sql.sql
CREATE TABLE
  `project.dataset.format_test` (
    str STRING OPTIONS(description = "A string field."),
    int INT64 OPTIONS(description = "An int64 field."),
    record STRUCT<nested INT64OPTIONS(description ="a struct field")> OPTIONS(description = "a struct")
  )
@Ewan-Keith Ewan-Keith added the bug label Jul 10, 2023
@nene
Copy link
Collaborator

nene commented Jul 11, 2023

Thanks for reporting. Definitely a bug.

I remember that the part that deals with this type-parameter formatting was pretty ugly and error-prone. Perhaps there's a simple fix I can make, perhaps not. More likely not. Will look into this.

In the mean time, perhaps you'd like to try out a new SQL formatting tool that I've written, which specifically includes full BigQuery support: https://github.com/nene/prettier-plugin-sql-cst

It formats the SQL somewhat differently, but at least it wont output invalid code. For your SQL it produces the following:

CREATE TABLE `project.dataset.format_test` (
  str STRING OPTIONS (description = "A string field."),
  int INT64 OPTIONS (description = "An int64 field."),
  record STRUCT<nested INT64 OPTIONS (description = "a struct field")> OPTIONS (
    description = "a struct"
  )
);

I'm not really sure whether it would be better to format options as OPTIONS(...) or OPTIONS (...). Or which style is more common in BigQuery community. Open to suggestions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants