Could we add SchemaUpdateOption param to BQ WriteParam? #5168

Open
arthurenc opened this issue Jan 15, 2024 · 1 comment
Labels
enhancement New feature or request gcp
Milestone

Comments


arthurenc commented Jan 15, 2024

CONTEXT

We are trying to use saveAsBigQueryTable to append data to an existing table whose schema may gain fields over time. This currently fails because the schema of the data we want to append contains fields that are not present in the table schema:

"errors" : [ {
      "message" : "Provided Schema does not match Table table_name. Cannot add fields (field: field_name)",
      "reason" : "invalid"
    }

SOLUTION

Google offers a SchemaUpdateOptions parameter that allows the schema of a destination table to be updated as part of the write (docs). It accepts two values: ALLOW_FIELD_ADDITION and ALLOW_FIELD_RELAXATION. With ALLOW_FIELD_ADDITION set, nullable fields that are not already present in the table schema could be added when appending with the saveAsBigQueryTable method.

At the moment the Scio wrapper does not offer this as a configuration option. Would it be possible to add it?

@arthurenc arthurenc changed the title Could we add SchemaUpdateOption param to BQ WriteParam? Could we add SchemaUpdateOption param to BQ WriteParam? Jan 15, 2024
@RustedBones RustedBones added enhancement New feature or request gcp labels Jan 15, 2024
Contributor

RustedBones commented Jan 16, 2024

Hi @arthurenc,
Thanks for the API change request.

In the meantime, you can use configOverride to modify the underlying Beam IO:

.saveAsBigQueryTable(
  table,
  schema,
  configOverride = _.withSchemaUpdateOptions(myUpdate)
)
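
For reference, here is a minimal sketch of how myUpdate might be defined and passed through. It assumes Beam's BigQueryIO.Write.SchemaUpdateOption enum and a Scio version whose saveAsBigQueryTable accepts the writeDisposition and configOverride named arguments; the object and method names (AppendWithNewFields, write) are hypothetical, and parameter names may differ between Scio versions.

```scala
import java.util.Collections

import com.google.api.services.bigquery.model.{TableRow, TableSchema}
import com.spotify.scio.bigquery._
import com.spotify.scio.values.SCollection
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.{SchemaUpdateOption, WriteDisposition}

// Hypothetical helper object, named for illustration only.
object AppendWithNewFields {

  // Beam expects a java.util.Set of schema update options.
  // ALLOW_FIELD_ADDITION lets the write add new nullable columns.
  val myUpdate: java.util.Set[SchemaUpdateOption] =
    Collections.singleton(SchemaUpdateOption.ALLOW_FIELD_ADDITION)

  // `schema` should describe the rows being written, including the new fields.
  def write(rows: SCollection[TableRow], table: Table, schema: TableSchema) =
    rows.saveAsBigQueryTable(
      table,
      schema,
      // Append to the existing table instead of replacing it.
      writeDisposition = WriteDisposition.WRITE_APPEND,
      // Forward the option to the underlying Beam BigQueryIO.Write.
      configOverride = _.withSchemaUpdateOptions(myUpdate)
    )
}
```

As far as I know, Beam only applies schema update options when the write uses load jobs (the FILE_LOADS method), so it is worth checking which write method your pipeline ends up using.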

@RustedBones RustedBones added this to the 0.15.0 milestone Feb 9, 2024